From: Norbert Lindenberg (Norbert.Lindenberg@Sun.COM)
Date: Mon Nov 15 2004 - 12:20:31 CST
Theodore,
Thank you for your feedback. Adding a warning to the description in
DataInput sounds like a good idea. In the meantime, if somebody wants
to use modified UTF-8 outside the Java context, please point them to
http://java.sun.com/developer/technicalArticles/Intl/Supplementary/index.html#Modified_UTF-8
Unfortunately, since this encoding is widely used within the Java
context, we cannot deprecate it.
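
For readers who have not seen the difference in practice, here is a minimal
sketch, not an official sample. It assumes a Java version that has
java.nio.charset.StandardCharsets and java.io.UncheckedIOException, and the
class and helper names (ModifiedUtf8Demo, modifiedUtf8, standardUtf8, hex)
are made up for illustration. It compares the bytes written by
DataOutputStream.writeUTF with standard UTF-8 for U+0000 and for one
supplementary character.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ModifiedUtf8Demo {

    // Hypothetical helper: encode with DataOutputStream.writeUTF (modified
    // UTF-8) and strip the 2-byte length prefix that writeUTF puts in front.
    static byte[] modifiedUtf8(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] all = buf.toByteArray();
        return Arrays.copyOfRange(all, 2, all.length);
    }

    // Hypothetical helper: encode with the standard UTF-8 charset.
    static byte[] standardUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Hypothetical helper: format bytes as space-separated uppercase hex.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String nul = "\0";                                       // U+0000
        String deseret = new String(Character.toChars(0x10400)); // supplementary

        System.out.println("U+0000  standard: " + hex(standardUtf8(nul)));
        System.out.println("U+0000  modified: " + hex(modifiedUtf8(nul)));
        System.out.println("U+10400 standard: " + hex(standardUtf8(deseret)));
        System.out.println("U+10400 modified: " + hex(modifiedUtf8(deseret)));
    }
}
```

The modified form writes U+0000 as the two bytes C0 80, so the output never
contains a zero byte, and it writes each surrogate of a supplementary
character as its own three-byte sequence instead of one four-byte sequence.
Neither form is valid standard UTF-8, which is why it should stay inside the
Java context.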
Best regards,
Norbert
Java Internationalization
java.sun.com/j2se/corejava/intl
Theodore H. Smith wrote:
> I take your point that you are well aware of this. However, some of your
> users are not so aware, having read your information on "Modified
> UTF-8" and thinking "hey, well, if Sun does it, then it must be OK for me
> to do it too!"
>
> This thread was inspired by exactly that. Someone pointed me to this
> page, using it as "proof" that modified UTF-8 is an acceptable thing to
> do.
>
> While you are well aware, the users aren't. I think it would be a good
> idea to add a small note saying that this feature is going to be
> changed in future versions of Java, or perhaps Deprecated, due to its
> incompatibility. Just a small note, on that page and similar pages,
> with the phrase "This will be deprecated in the future because it
> currently contradicts the standard behaviour"... that would make a
> *huge* difference.
>
> That aside.
>
> I'm just curious about the \0 thing. What problems would having a \0 in
> UTF-8 present, that are not presented by having \0 in ASCII? I can't
> see any advantage there.
>
> The only advantage I can imagine would be using UTF-8 for storing \0
> in places where that previously wasn't possible. To me, that sounds like
> a strange way to add a feature.
>
> On 12 Nov 2004, at 23:58, A. Vine wrote:
>
>> FYI, we are well aware of this shortcoming (modified UTF-8), and with
>> each release we try to mitigate it even further. The problem is that it
>> is so deep in the code (note that it has been there since Java 1.0)
>> that it is not easy to eliminate without breaking a lot of existing
>> stuff, something that the Java team strives to avoid.
>>
>> Theodore H. Smith wrote:
>>
>>> http://java.sun.com/j2se/1.5.0/docs/api/java/io/DataInput.html#modified-utf-8
>>> If only people could sue for suggesting bad coding practices ;o)
>>> --
>>> Theodore H. Smith - Software Developer.
>>> http://www.elfdata.com