Underscores, “Not Found”


A while back, Phil Shaw asked, on the RSS-Dev mailing list, if anyone knew who was responsible for a user agent making consistently bad requests:

“I have noticed an RSS client in my server logs that is causing a nuisance and wondered if anyone else had seen it. It only requests URLs of published RSS feeds but adds a trailing underscore on the end, resulting in 404 errors for every request.”

A quick check now:

$ zcat access.log.*.gz | fgrep ' 404 ' | fgrep .rss_ | wc -l
$ zcat access.log.*.gz | fgrep .rss_ | cut -f1 -d' ' | sort -u
$ zcat access.log.*.gz | fgrep .rss_ | fgrep -v ' 404 ' | wc -l

So that’s a few hundred requests, all from the same IP address, and all of them 404s. It’s not as if it’s a drain on resources, but – why? What is this thing? Do they know it’s not working? Is this a bug? (Could they be using a language with a $_ variable, and have accidentally doubled the underscore?)
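For anyone wanting to run the same check on their own logs, the three commands could be folded into a single pass. This is only a rough sketch: the function name rss_underscore_summary is made up for illustration, and it assumes the Common or Combined Log Format, where the first whitespace-separated field is the client IP, the seventh is the request path, and the ninth is the status code.

```shell
# Hypothetical helper: tally requests for ".rss_" URLs by client IP and status.
# Assumes Common/Combined Log Format read from stdin:
#   $1 = client IP, $7 = request path, $9 = HTTP status code.
rss_underscore_summary() {
    awk 'index($7, ".rss_") { count[$1 " " $9]++ }
         END { for (key in count) print count[key], key }' | sort -rn
}

# Usage, with the same log files as the commands above:
#   zcat access.log.*.gz | rss_underscore_summary
```

Each output line is a request count followed by the IP and the status code, so a single line ending in 404 would match what the three separate commands showed.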

My favourite Bad User Agent quote: “i’ve been feeding one jerk a variety of 403 and other errors for over two years.” It’s testament to the robustness of the web that this kind of thing is only a problem for the implacably curious....

(Music: The Misfits, “Skulls”)