Re: Java and Unicode

From: Michael (michka) Kaplan
Date: Wed Nov 15 2000 - 10:43:40 EST

I do not think they are so theoretical, with both 10646 and Unicode
including them in the very near future (unless you count it as theoretical
when you drop an egg but it has not yet hit the ground!).

In any case, I think that UTF-16 is the answer here.

Many people try to compare this to DBCS, but it really is not the same
thing. Understanding lead bytes and trail bytes in DBCS is *astoundingly*
more complicated than handling surrogate pairs.
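
To illustrate why surrogate pairs are comparatively easy to handle, here is
a minimal sketch in Java. The class and method names (SurrogateSketch,
isHigh, isLow, combine, split, countCharacters) are hypothetical and not
part of any Java API; the sketch only assumes the UTF-16 ranges
U+D800..U+DBFF for the first half of a pair and U+DC00..U+DFFF for the
second. Because those two ranges do not overlap each other or any other
character, a single 16-bit unit can be classified without looking at its
neighbours, which is generally not true of DBCS trail bytes.

```java
public class SurrogateSketch {
    // First half of a pair (high surrogate): U+D800..U+DBFF.
    static boolean isHigh(char c) {
        return c >= 0xD800 && c <= 0xDBFF;
    }

    // Second half of a pair (low surrogate): U+DC00..U+DFFF.
    static boolean isLow(char c) {
        return c >= 0xDC00 && c <= 0xDFFF;
    }

    // Combine a high/low pair into one character value above 0xFFFF.
    static int combine(char high, char low) {
        return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00);
    }

    // Split a character value in 0x10000..0x10FFFF into its two 16-bit units.
    static char[] split(int codePoint) {
        int v = codePoint - 0x10000;
        return new char[] {
            (char) (0xD800 + (v >> 10)),   // top 10 bits
            (char) (0xDC00 + (v & 0x3FF))  // bottom 10 bits
        };
    }

    // Count characters rather than 16-bit units: a well-formed pair counts once.
    static int countCharacters(String s) {
        int count = 0;
        for (int i = 0; i < s.length(); i++) {
            if (isHigh(s.charAt(i)) && i + 1 < s.length() && isLow(s.charAt(i + 1))) {
                i++; // skip the low half of the pair
            }
            count++;
        }
        return count;
    }
}
```

For example, split(0x10400) yields the units 0xD801 and 0xDC00, and
combine on those two units gives back 0x10400. An unpaired surrogate is
simply counted as one character here; real code would have to decide how
to treat such ill-formed input.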


a new book on internationalization in VB at

----- Original Message -----
From: "Elliotte Rusty Harold"
To: "Unicode List"
Sent: Wednesday, November 15, 2000 6:15 AM
Subject: Re: Java and Unicode

> One thing I'm very curious about going forward: Right now character
> values greater than 65535 are purely theoretical. However this will
> change. It seems to me that handling these characters properly is
> going to require redefining the char data type from two bytes to
> four. This is a major incompatible change with existing Java.
> There are a number of possibilities that don't break backwards
> compatibility (making trans-BMP characters require two chars rather
> than one, defining a new wchar primitive data type that is 4-bytes
> long as well as the old 2-byte char type, etc.) but they all make the
> language a lot less clean and obvious. In fact, they all more or less
> make Java feel like C and C++ feel when working with Unicode: like
> something new has been bolted on after the fact, and it doesn't
> really fit the old design.
> Are there any plans for handling this?
> --
> +-----------------------+------------------------+-------------------+
> | Elliotte Rusty Harold | | Writer/Programmer |
> +-----------------------+------------------------+-------------------+
> | The XML Bible (IDG Books, 1999) |
> | |
> | |
> +----------------------------------+---------------------------------+
> | Read Cafe au Lait for Java news: |
> | Read Cafe con Leche for XML news: |
> +----------------------------------+---------------------------------+
