web scraping - How can a webpage be made such that it cannot be scraped by bots?
This question developed from an answer here.
My question, therefore, is: what steps can one take to fend off standard scrapers?
In addition to the previous mentions of robots.txt, the robots meta tag, and using more JavaScript, one of the sure methods I know of is to put restricted content behind a user login. That limits access to purpose-built bots. Add a strong CAPTCHA (like reCAPTCHA) to the user login, and purpose-built bots will be blocked too.
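As a rough illustration of the login gate, here is a minimal Python sketch. Everything in it is an assumption made for the example: the `/members/` path prefix, the in-memory `SESSIONS` store, and the `captcha_passed` flag, which stands in for whatever server-side CAPTCHA verification the site actually performs. Password checking is left out to keep it short.

```python
import secrets

# Hypothetical in-memory session store (token -> username).
# A real site would use a database and proper password checks.
SESSIONS = {}

def log_in(username, captcha_passed):
    """Issue a session token only if the CAPTCHA check succeeded."""
    if not captcha_passed:
        return None
    token = secrets.token_hex(16)
    SESSIONS[token] = username
    return token

def serve(path, token=None):
    """Return (status, body); restricted paths need a valid session token."""
    if path.startswith("/members/") and token not in SESSIONS:
        return 401, "Login required"
    return 200, f"Content of {path}"
```

A generic scraper that just fetches URLs never gets past the 401, and a bot that cannot solve the CAPTCHA never obtains a token in the first place.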
If a site is looking to verify the identity of a client (i.e., including whether it's a bot), that's what user logins are for. :)
User logins can be disabled if strange activity is detected.
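One way "strange activity" might be detected is a simple request-rate check per account. The sketch below is only an assumption about what that could look like: the `ActivityMonitor` class, its thresholds, and the sliding-window rule are all hypothetical, and a real site would probably look at more signals than request rate.

```python
from collections import defaultdict, deque

class ActivityMonitor:
    """Disable an account that exceeds max_requests within window_seconds."""

    def __init__(self, max_requests=100, window_seconds=60.0):
        self.max_requests = max_requests      # hypothetical threshold
        self.window_seconds = window_seconds  # hypothetical window size
        self.history = defaultdict(deque)     # user -> recent request times
        self.disabled = set()                 # accounts that were shut off

    def allow(self, user, now):
        """Record a request at time `now` (seconds); return False if blocked."""
        if user in self.disabled:
            return False
        times = self.history[user]
        times.append(now)
        # Drop timestamps that have fallen out of the sliding window.
        while times and now - times[0] > self.window_seconds:
            times.popleft()
        if len(times) > self.max_requests:
            self.disabled.add(user)
            return False
        return True
```

Calling `allow(user, time.time())` on each request would be enough to wire it in. Once an account trips the limit it stays disabled until someone re-enables it manually, which fits the idea of shutting the login off rather than just slowing it down.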