From: Theodore H. Smith (delete@elfdata.com)
Date: Fri Nov 12 2004 - 18:05:20 CST
I take your point that you are well aware of this. However, some of your
users are not so aware, having read your information on "Modified
UTF-8" and thinking "hey, well, if Sun does it, then it must be OK for me
to do it too!"
This thread was inspired by exactly that. Someone pointed me to this
page, using it as "proof" that modified UTF-8 is an acceptable thing to
do.
While you are well aware, the users aren't. I think it would be a good
idea to add a small note saying that this feature is going to be
changed in future versions of Java, or perhaps deprecated, due to its
incompatibility. Just a small note, on that page and similar pages,
with the phrase "This will be deprecated in the future because it
currently contradicts the standard behaviour"... that would make a
*huge* difference.
That aside.
I'm just curious about the \0 thing. What problems would having a \0 in
UTF-8 present that are not already presented by having \0 in ASCII? I can't
see any advantage there.
The only advantage I can imagine would be using UTF-8 for storing \0
in places that previously weren't possible. To me, that sounds like a
strange way to add a feature.
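To make the \0 question concrete, here is a minimal Java sketch comparing
the bytes that standard UTF-8 and DataOutputStream.writeUTF produce for a
string containing U+0000. The class and method names are my own and purely
illustrative, and it assumes a JDK that has
java.nio.charset.StandardCharsets (Java 7 or later).

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of any Sun API.
public class ModifiedUtf8Demo {

    // Standard UTF-8: U+0000 is encoded as the single byte 0x00.
    public static byte[] standardUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Modified UTF-8 as written by DataOutputStream.writeUTF.
    // writeUTF emits a 2-byte length prefix first; it is stripped here
    // so that only the encoded characters are compared.
    public static byte[] modifiedUtf8(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            // Not expected for an in-memory stream and a short string.
            throw new UncheckedIOException(e);
        }
        byte[] all = buf.toByteArray();
        return Arrays.copyOfRange(all, 2, all.length);
    }

    // Render bytes as space-separated lowercase hex.
    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = "a\u0000b";
        System.out.println(hex(standardUtf8(s))); // 61 00 62
        System.out.println(hex(modifiedUtf8(s))); // 61 c0 80 62
    }
}
```

So the modified form never contains a literal 0x00 byte, which presumably
is what lets it pass through C-style NUL-terminated string handling. The
cost is that C0 80 is an overlong sequence, which a strict UTF-8 decoder
should reject.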
On 12 Nov 2004, at 23:58, A. Vine wrote:
> FYI, we are well aware of this shortcoming (modified UTF-8), and with
> each release try to mitigate it even further. The problem is that it
> is so deep in the code (note that it is since Java 1.0) that it is not
> easy to eliminate without breaking a lot of existing stuff, something
> that the Java team strive to avoid.
>
> Theodore H. Smith wrote:
>
>> http://java.sun.com/j2se/1.5.0/docs/api/java/io/DataInput.html#modified-utf-8
>> If only people could sue for suggesting bad coding practices ;o)
>> --
>> Theodore H. Smith - Software Developer.
>> http://www.elfdata.com
>
>