A while back, Phil Shaw asked on the RSS-Dev mailing list whether anyone knew who was responsible for a user agent that was making consistently bad requests:
“I have noticed an RSS client in my server logs that is causing a nuisance and wondered if anyone else had seen it. It only requests URLs of published RSS feeds but adds a trailing underscore on the end, resulting in 404 errors for every request.”
A quick check now:
$ zcat access.log.*.gz | fgrep ' 404 ' | fgrep .rss_ | cut -f1 -d' ' | wc -l
369
$ zcat access.log.*.gz | fgrep .rss_ | cut -f1 -d' ' | sort -u
24.16.107.112
$ zcat access.log.*.gz | fgrep 24.16.107.112 | fgrep -v ' 404 ' | wc -l
0
So that’s a few hundred requests, all from the same IP address, and all of them 404s. It’s not as if it’s a drain on resources, but – why? What is this thing? Do they know it’s not working? Is this a bug? (Could they be using a language with a $_ variable, and have accidentally doubled the underscore?)
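For anyone who wants to run the same check in a single pass, here is a rough Python sketch. It assumes the logs are in Common or Combined Log Format, and the function name, the regular expression and the default `.rss_` suffix are my own choices for illustration, not anything from the original thread. Unlike the fgrep pipeline, which matches `.rss_` anywhere on the line, it only counts requests whose path (ignoring any query string) ends in the suffix.

```python
import re
from collections import Counter

# Assumed layout (Common/Combined Log Format):
#   host ident user [time] "METHOD path PROTOCOL" status size ...
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "\S+ (\S+)[^"]*" (\d{3}) ')


def summarize_suffix_requests(lines, suffix=".rss_"):
    """Tally requests whose path ends in `suffix`, by client and by status."""
    by_client = Counter()
    by_status = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that do not look like log entries
        host, path, status = match.groups()
        # Drop any query string before testing the suffix.
        if path.split("?", 1)[0].endswith(suffix):
            by_client[host] += 1
            by_status[status] += 1
    return by_client, by_status
```

The `lines` argument can be any iterable of strings, so rotated logs could be fed in with something like `gzip.open(name, "rt")` for each file. If `by_client` has a single key and `by_status` contains only `"404"`, you are looking at the same picture as above.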
My favourite Bad User Agent quote: “i’ve been feeding one jerk a variety of 403 and other errors for over two years.” It’s testament to the robustness of the web that this kind of thing is only a problem for the implacably curious....