At a rate of perhaps a dozen a week, we get errors from our web site about invalid URLs. They always look like this:
Alright, so here’s the funny part. That’s not a page on our site, and there’s no link anywhere on our site to that URL. There’s no Referer header, so my guess is that Squid is stripping it out, but it doesn’t really matter. See, here is the link we actually have:
See the difference? That’s right: no trailing slash. Apparently the Squid authors were so smart that they unilaterally decided that any URL without an extension on the end must obviously be a directory. Obviously. As if a trailing / even necessarily means there’s a directory on a file system involved somewhere (in our case there isn’t, as all of our pages are generated without regard to any files in the filesystem, which is perfectly legal).
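For what it’s worth, if you’re on the receiving end of this and can’t fix the proxy, one defensive workaround is to canonicalize incoming paths yourself before routing. This is just a sketch, not how our site actually works; the function name and the example path are made up for illustration:

```python
# Hypothetical workaround: strip a single spurious trailing slash that an
# intermediary (like a proxy) may have appended to an extensionless URL.
# The root path "/" is deliberately left alone.
def canonicalize_path(path: str) -> str:
    if path != "/" and path.endswith("/"):
        return path.rstrip("/") or "/"
    return path

# A proxy that assumes extensionless URLs are directories would turn
# "/articles/squid" into "/articles/squid/"; this maps it back.
print(canonicalize_path("/articles/squid/"))  # -> /articles/squid
print(canonicalize_path("/"))                 # -> /
```

In a real deployment you would more likely issue a 301 redirect to the canonical URL rather than silently serve both forms, so that caches and search engines converge on one URL.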
Nicely done. This is the kind of stuff that drives me nutty… people who have a much higher opinion of their intellect than they clearly deserve, making “clever” leaps of “logic” and destroying what should be a relatively simple thing: URLs ARE OPAQUE. STOP MESSING WITH THEM.