Re: Opinions on this Java URL?

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Sat Nov 13 2004 - 21:41:32 CST

Next message: Doug Ewell: "Re: Opinions on this Java URL?"

Previous message: Philippe Verdy: "Re: Opinions on this Java URL?"
In reply to: Doug Ewell: "Re: Opinions on this Java URL?"
Next in thread: Doug Ewell: "Re: Opinions on this Java URL?"
Reply: Doug Ewell: "Re: Opinions on this Java URL?"

From: "Doug Ewell" <dewell@adelphia.net>

> What is a shame is that Unicode published a definition of the defective
> CESU-8 at all.

On that point at least we agree. I wonder why CESU-8 was created, and whether
any applications actually need it.

On the other hand, Java's modified UTF-8 (which is in fact closer to CESU-8)
has proven useful and is widely used, simply because it is compatible with
standard C library functions for null-terminated strings. It is historic and
has coexisted well with Unicode, given the earlier tolerance of legacy UTF-8
decoders. Even today it still conforms to the Unicode rules, since Java does
not claim that this is UTF-8 and does not label the encoded data as UTF-8. It
is used internally in the Java JNI interfaces and in the Java class file
format, which is not plain text; both are part of the JVM specifications and
are not intended for data interchange between distinct hosts or applications.
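
To illustrate the null-byte point, here is a minimal sketch that compares
standard UTF-8 with the modified form written by the JDK's
DataOutputStream.writeUTF. The class and helper names (ModifiedUtf8Demo,
modifiedUtf8, hex) are made up for this example, and the 2-byte length prefix
that writeUTF emits is stripped so that only the encoded payload is shown.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class ModifiedUtf8Demo {
    // Returns the modified UTF-8 payload written by DataOutputStream.writeUTF,
    // without the 2-byte length prefix that precedes it.
    static byte[] modifiedUtf8(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] all = buf.toByteArray();
        return Arrays.copyOfRange(all, 2, all.length);
    }

    // Formats bytes as space-separated uppercase hex pairs.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = "A\u0000B";
        // Standard UTF-8 keeps the NUL as a single zero byte.
        System.out.println(hex(s.getBytes(StandardCharsets.UTF_8))); // 41 00 42
        // Modified UTF-8 never emits a zero byte for U+0000.
        System.out.println(hex(modifiedUtf8(s)));                    // 41 C0 80 42
    }
}
```

Because the second form contains no zero byte, a C function such as strlen
would see the whole string rather than stopping after the "A".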

But the tolerance for non-shortest forms did in fact exist, so that C0,80
would be interpreted safely as NUL (U+0000).

Another way to think about Java's modified UTF-8 is as a transport encoding
syntax for CESU-8. It differs from CESU-8 mostly by escaping null bytes as the
two bytes C0,80 (the lead byte C0 is not used in CESU-8), and by allowing
isolated/unpaired surrogates or other invalid UTF-16 code units in the
encoded string. So why would Sun change anything there? Replacing something
that works with a new API that would create incompatibilities does not look
like a good idea.
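
The two differences described above can be sketched with a small hand-written
encoder. This is an illustration only, not code from the JDK: Cesu8Sketch,
encode and the escapeNul flag are hypothetical names. It encodes every UTF-16
code unit on its own, which is why a supplementary character comes out as two
3-byte sequences, and with escapeNul = false it is only CESU-8-like, since it
does not reject unpaired surrogates.

```java
import java.io.ByteArrayOutputStream;

class Cesu8Sketch {
    // Encodes each UTF-16 code unit independently. With escapeNul set,
    // U+0000 is written as C0 80, following the modified UTF-8 rule.
    static byte[] encode(String s, boolean escapeNul) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < s.length(); i++) {
            int c = s.charAt(i);
            if (c == 0 && escapeNul) {
                out.write(0xC0);
                out.write(0x80);
            } else if (c < 0x80) {
                out.write(c);
            } else if (c < 0x800) {
                out.write(0xC0 | (c >> 6));
                out.write(0x80 | (c & 0x3F));
            } else {
                // Covers surrogates too, paired or not: 3 bytes per code unit.
                out.write(0xE0 | (c >> 12));
                out.write(0x80 | ((c >> 6) & 0x3F));
                out.write(0x80 | (c & 0x3F));
            }
        }
        return out.toByteArray();
    }

    // Formats bytes as space-separated uppercase hex pairs.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // NUL followed by U+10400, held in the String as a surrogate pair.
        String s = "\u0000\uD801\uDC00";
        System.out.println(hex(encode(s, false)));       // 00 ED A0 81 ED B0 80
        System.out.println(hex(encode(s, true)));        // C0 80 ED A0 81 ED B0 80
        // A lone high surrogate is still encoded, as 3 bytes.
        System.out.println(hex(encode("\uD801", true))); // ED A0 81
    }
}
```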

This archive was generated by hypermail 2.1.5 : Sat Nov 13 2004 - 21:43:55 CST