Re: Java and Unicode

From: Markus Scherer (markus.scherer@jtcsv.com)
Date: Wed Nov 15 2000 - 20:02:45 EST

Next message: Mark Davis: "Re: [idn] Javascript code charts, unicode converter, show-characters"
Previous message: AvaFonts@aol.com: "Fwd: Changes proposed for Tamil"
Maybe in reply to: Jani Kajala: "Java and Unicode"
Next in thread: Elliotte Rusty Harold: "Re: Java and Unicode"

Please let's keep types for single characters and types for strings separate.

ICU used to be in the same situation as Java: all character and string APIs used 16-bit types.
When we extended ICU to support UTF-16, we decided to keep the string base type at 16 bits for very good reasons such as interoperability and memory consumption.
For single characters, ICU changed APIs from 16-bit to 32-bit types.

In the case of Java, the equivalent course of action would be to stick with a 16-bit char as the base type for strings. The int type could be used in _additional_ APIs for single Unicode code points, deprecating the old APIs with char.
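
A minimal sketch of what such additional int-based APIs might look like next to 16-bit strings. The class and method names here (CodePointApiSketch, codePointAt, appendCodePoint) are hypothetical and chosen only for illustration; they are not taken from any Sun or ICU API.

```java
class CodePointApiSketch {
    // Hypothetical int-based API: read the code point that starts at index i
    // of a string whose storage stays 16-bit (UTF-16).
    static int codePointAt(String s, int i) {
        char lead = s.charAt(i);
        if (lead >= 0xD800 && lead <= 0xDBFF && i + 1 < s.length()) {
            char trail = s.charAt(i + 1);
            if (trail >= 0xDC00 && trail <= 0xDFFF) {
                // Combine the surrogate pair into one supplementary code point.
                return ((lead - 0xD800) << 10) + (trail - 0xDC00) + 0x10000;
            }
        }
        // BMP character or unpaired surrogate: return the code unit itself.
        return lead;
    }

    // Hypothetical int-based API: append one code point to 16-bit storage.
    static void appendCodePoint(StringBuilder sb, int cp) {
        if (cp < 0x10000) {
            sb.append((char) cp);
        } else {
            int v = cp - 0x10000;
            sb.append((char) (0xD800 + (v >> 10)));   // lead surrogate
            sb.append((char) (0xDC00 + (v & 0x3FF))); // trail surrogate
        }
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        appendCodePoint(sb, 0x10400);
        System.out.println(sb.length());                                   // 2
        System.out.println(Integer.toHexString(codePointAt(sb.toString(), 0))); // 10400
    }
}
```

The single-character values are plain ints, while the string itself still holds 16-bit code units.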

Whatever Sun decides to do with single characters, it will be most reasonable to keep the string encoding the same and just treat it as UTF-16 where that makes a difference.
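
To illustrate "where that makes a difference", here is a small sketch, assuming nothing beyond String.length() and String.charAt(): length and indexing keep counting 16-bit code units, and only code that cares about whole characters steps over surrogate pairs. The class and method names are made up for the example.

```java
class Utf16IterationSketch {
    // Count code points in a UTF-16 string by stepping over surrogate pairs.
    static int countCodePoints(String s) {
        int count = 0;
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            boolean pair = c >= 0xD800 && c <= 0xDBFF
                    && i + 1 < s.length()
                    && s.charAt(i + 1) >= 0xDC00 && s.charAt(i + 1) <= 0xDFFF;
            i += pair ? 2 : 1; // a well-formed pair occupies two code units
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String s = "a\uD801\uDC00b"; // 'a', U+10400, 'b'
        System.out.println(s.length());         // 4 code units
        System.out.println(countCodePoints(s)); // 3 code points
    }
}
```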

For details, see my presentation at the IUC 17 Unicode conference (2000 September, session B2).
(See http://www.unicode.org/ - I am having some trouble with web access right now, so I cannot give you the URL...)

markus


This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:21:15 EDT