Remember the good old days of C++, when any system of moderate size
would have at least three distinct string classes, each with its own
assumptions and failings? Converting between them was never fun, but
invariably essential, as each API required a specific type. It’s reassuring
to see these problems repeat: for a very small Java project I’ve brought in a
single class library, and now have four distinct URI classes.
(That’s not a contrived example, and I’m not including internal classes. During
development, that would have pushed the total past eight.)
Of course, they all have limitations:
java.net.URL doesn’t handle opaque URIs;
java.net.URI gets the rules for normalisation, equality and
relative resolution wrong, and breaks some cases that java.net.URL
used to cover. Custom implementations tend to ignore the less interesting
parts of the spec – IPv6 addresses, maybe Unicode – and everyone has
their own opinion about file URIs.
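To make a couple of those complaints concrete, here is a minimal sketch using only the JDK classes named above. The class name UriQuirks, the helper names and the example.com addresses are placeholders of mine, and the cases are simply ones I'd expect to illustrate the point, not an exhaustive list.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

final class UriQuirks {
    /** Returns true if java.net.URL parses the string, false if it throws. */
    @SuppressWarnings("deprecation") // the URL(String) constructor is deprecated in recent JDKs
    static boolean urlAccepts(String s) {
        try {
            new URL(s);
            return true;
        } catch (MalformedURLException e) {
            return false;
        }
    }

    /** Returns true if java.net.URI treats the two strings as equal after normalize(). */
    static boolean uriEquals(String a, String b) {
        return URI.create(a).normalize().equals(URI.create(b).normalize());
    }

    public static void main(String[] args) {
        String urn = "urn:isbn:0451450523";

        // An opaque URI: java.net.URI parses it, java.net.URL has no handler for the scheme.
        System.out.println("URL accepts urn:  " + urlAccepts(urn));
        System.out.println("URI sees opaque:  " + URI.create(urn).isOpaque());

        // Pairs that the URI spec's normalisation rules treat as equivalent
        // (an escaped unreserved character, an explicit default port).
        System.out.println("%7E vs ~ equal:   "
                + uriEquals("http://example.com/%7Euser", "http://example.com/~user"));
        System.out.println(":80 vs none:      "
                + uriEquals("http://example.com:80/", "http://example.com/"));

        // Relative resolution: the spec's reference examples resolve "?y" against
        // this base to http://a/b/c/d;p?y. Compare with what your JDK prints.
        System.out.println("resolve(\"?y\"):    "
                + URI.create("http://a/b/c/d;p?q").resolve("?y"));
    }
}
```

Whether each result counts as a bug or merely a design decision is a matter of opinion; the sketch only shows where the two classes and the spec part company.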
The only real way to convert between them is via strings, throwing away static typing in the process and increasing the risk of falling foul of a substandard implementation. This makes RDF, based as it is around URI equivalence, harder than it needs to be.
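As a rough sketch of what that string round trip looks like in practice: LibraryUri below is a hypothetical stand-in for some third-party URI class (not a real library type), and UriBridge and toJavaUri are names invented for the example. The point is that the compiler can no longer help, so any disagreement between the two parsers only surfaces at runtime.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Optional;

final class UriBridge {
    /** Hypothetical stand-in for a third-party URI class; it does no validation at all. */
    static final class LibraryUri {
        private final String text;

        LibraryUri(String text) {
            this.text = text;
        }

        @Override
        public String toString() {
            return text;
        }
    }

    /** Converts via the string form; empty if java.net.URI refuses to parse it. */
    static Optional<URI> toJavaUri(LibraryUri source) {
        try {
            return Optional.of(new URI(source.toString()));
        } catch (URISyntaxException e) {
            // The source class was happy with this value; the target class is not.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // A tag URI survives the trip; a path with a raw space does not.
        System.out.println(toJavaUri(new LibraryUri("tag:example.com,2004:notes")).isPresent());
        System.out.println(toJavaUri(new LibraryUri("http://example.com/a b")).isPresent());
    }
}
```

Returning an Optional at least forces the caller to notice the failure case, which is about as much safety as a string-based bridge can offer.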
RFC 2396bis is apparently now final, but I doubt that’s going to stop the spider authors who think that URI processing is just concatenation and wishing. The tag URI spec is currently going through channels, and provides one of the best-written foundations for universal addressing that I’ve read.
Fifteen-second exposure:
Move around, you freaks!