On 03/01/13 11:32, Simon wrote:
I have read some great stuff on perishablepress.com about blocking
misguided/bad robots, using apache and mod rewrite if I remember
correctly.
Thanks for the mention. I think that this is what you meant:
http://perishablepress.com/blackhole-bad-bots/
Basic idea:
(1) Create a "blackhole" directory, and add it to robots.txt as a
disallowed path
(2) Add hidden links in your page header or footer that point to the
blackhole directory as a trap for bad bots. The trick is that a bot will
follow such a link if it doesn't honour the robots.txt entry, whereas a
person won't see the link.
(3) Have a PHP script harvest the bot's details (IP address, user agent
and so on) and put them in a data file
(4) Have mod_rewrite (via .htaccess) or something similar block
connections from the addresses listed in the data file.
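
To make steps (3) and (4) concrete, here is a rough sketch of the logic in
Python rather than PHP/mod_rewrite. It is not the code from the linked
article. The /blackhole/ path, the tab-separated data file layout and the
function names are all my own assumptions for illustration.

```python
import tempfile
from pathlib import Path

# Hypothetical trap path; it would also be listed as Disallow in robots.txt.
BLACKHOLE_PREFIX = "/blackhole/"


def is_blocked(data_file: Path, ip: str) -> bool:
    """Return True if the IP already appears in the data file."""
    if not data_file.exists():
        return False
    with data_file.open(encoding="utf-8") as fh:
        return any(line.split("\t", 1)[0].strip() == ip for line in fh)


def record_bot(data_file: Path, ip: str, user_agent: str) -> None:
    """Step (3): append the bot's details, one tab-separated line per IP."""
    if is_blocked(data_file, ip):
        return
    # Keep the record on one line even if the user agent is hostile.
    clean_agent = " ".join(user_agent.split())
    with data_file.open("a", encoding="utf-8") as fh:
        fh.write(f"{ip}\t{clean_agent}\n")


def handle_request(data_file: Path, ip: str, user_agent: str, path: str) -> int:
    """Step (4): return the HTTP status this request should get."""
    if is_blocked(data_file, ip):
        return 403  # known bad bot, refuse everything
    if path.startswith(BLACKHOLE_PREFIX):
        record_bot(data_file, ip, user_agent)  # fell into the trap
        return 403
    return 200


if __name__ == "__main__":
    data = Path(tempfile.mkdtemp()) / "blackhole.dat"
    for path in ("/index.html", "/blackhole/", "/index.html"):
        print(path, handle_request(data, "203.0.113.7", "BadBot/1.0", path))
```

A well-behaved crawler never requests the trap path, so it never ends up in
the data file. Only clients that ignore robots.txt get listed, and from then
on every request from that address is refused.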
Clever idea.
--
Regards,
Jan Henkins