Bro, ban me at the IP level if you don't like me

105 points by classichasclass


borisk

Yes, I’ve seen this one in our logs. Quite obnoxious, but at least it identifies itself as a bot and, at least in our case (cgit host), does not generate much traffic. The bulk of our traffic comes from bots that pretend to be real browsers and that use a large number of IP addresses (mostly from Brazil and Asia in our case).

I’ve been playing cat and mouse trying to block them for the past week and here are a couple of observations/ideas, in case this is helpful to someone:

My conclusion from this experience is that you really only have two options: either do something ad hoc and very specific to your site (like the notbot in the query string) that whoever runs the bots won’t bother adapting to, or employ someone with enough resources (like Cloudflare) to fight them for you. Using some “standard” solution (rate limiting, Anubis, etc.) is not going to work: they have enough resources to absorb the cost and/or adapt.
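
For anyone wondering what an ad hoc query-string check could look like, here is a minimal sketch. Only the parameter name notbot comes from the above; the function name, the status codes, and the rule itself (serve the page only when the marker is present) are assumptions made for illustration, not a description of what is actually deployed on our host.

```python
from urllib.parse import urlsplit, parse_qs

# Parameter name taken from the comment above; how it is used here is assumed.
MARKER = "notbot"


def check_request(url: str) -> int:
    """Return a hypothetical HTTP status for a request URL.

    200 if the marker parameter appears in the query string, 403 otherwise.
    """
    # keep_blank_values=True so a bare "?notbot" (no "=value") still counts.
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    return 200 if MARKER in query else 403
```

The links the site generates for itself would carry the marker, so people clicking through are unaffected, while a crawler replaying URLs without it gets refused. It only holds up for as long as nobody bothers to adapt to it, which is the whole point of keeping it site-specific.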