Re: Java char and Unicode 3.0+ (was:Canonical equivalence in rendering: mandatory or recommended?)

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Thu Oct 16 2003 - 07:33:24 CST


From: "John Cowan" <cowan@mercury.ccil.org>

> Philippe Verdy scripsit:
>
> > I am also doubting, but I would not bet on it. After all, when Unicode
> > started, a single plane was considered waaaaaay more than sufficient
> > too.
>
> I not only would bet on it, I actually have a bet on it. Henry Thompson
> of the W3C's Schema WG bet me that we'd outrun the existing planes within
> five years; four left to go and no sign of it, even if Michael Everson
> were to achieve pluripresence and actually get everything accepted into
> the standard that he knows needs to be done.

Just in case it is ever needed, are you keeping an unassigned range in the
BMP (for example, between the Hangul syllables and the existing surrogates)
so that a future extension remains possible while preserving upward
compatibility and support for UTF-16, which is currently the main reason
why only 17 planes are defined?
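For readers who want the arithmetic behind the 17-plane limit, here is a
minimal sketch. The surrogate ranges are those fixed by the Unicode
Standard; the class and method names are only illustrative. It counts the
code points reachable through surrogate pairs and shows that the highest
pair lands exactly on U+10FFFF, so any further extension of UTF-16 would
need additional reserved code units.

```java
public class SurrogateCapacity {
    // Surrogate ranges as fixed by the Unicode Standard.
    static final int HIGH_START = 0xD800, HIGH_END = 0xDBFF;
    static final int LOW_START  = 0xDC00, LOW_END  = 0xDFFF;

    // Number of code points addressable by one high + one low surrogate.
    static int supplementaryCodePoints() {
        int highs = HIGH_END - HIGH_START + 1; // 1024 high surrogates
        int lows  = LOW_END - LOW_START + 1;   // 1024 low surrogates
        return highs * lows;
    }

    // The BMP plus however many 65,536-code-point planes the pairs cover.
    static int totalPlanes() {
        return 1 + supplementaryCodePoints() / 0x10000;
    }

    // Standard UTF-16 decoding of a surrogate pair into a code point.
    static int decode(char high, char low) {
        return 0x10000 + ((high - HIGH_START) << 10) + (low - LOW_START);
    }

    public static void main(String[] args) {
        System.out.println(supplementaryCodePoints()); // 1048576
        System.out.println(totalPlanes());             // 17
        // The highest possible pair decodes to the last code point.
        System.out.printf("U+%X%n", decode((char) HIGH_END, (char) LOW_END)); // U+10FFFF
    }
}
```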

It's OK not to document it officially for now, but a prudent and
conservative policy of keeping such a range available in the BMP for the
future seems needed. Of course, if such an evolution happens, it would
require a later update to the current UTF-8 and UTF-16 conformance rules.

I'm not asking to document it now, but to keep it in mind and not fill
the BMP completely, which would make it impossible to upgrade UTF-16 to
a possible future scheme (such provisions already exist natively in UTF-8
and UTF-32, since UTF-8's origin at X/Open and its initial documentation
in an RFC).
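As a rough illustration of the UTF-8 headroom mentioned above, the sketch
below assumes the remark refers to the original UTF-8 design (as described
in RFC 2279), which allowed sequences of up to six bytes. The class and
method names are hypothetical. It computes how many payload bits each
sequence length carries, which shows that the four-byte forms already cover
U+10FFFF and that the longer forms could represent values up to 31 bits.

```java
public class Utf8Capacity {
    // Payload bits carried by an n-byte sequence in the original
    // six-byte UTF-8 design (assumption: RFC 2279 layout).
    static int payloadBits(int n) {
        if (n < 1 || n > 6) {
            throw new IllegalArgumentException("length must be 1..6");
        }
        if (n == 1) {
            return 7; // 0xxxxxxx
        }
        // Lead byte: n one-bits, a zero bit, then (7 - n) payload bits.
        // Each of the (n - 1) trail bytes (10xxxxxx) carries 6 bits.
        return (7 - n) + 6 * (n - 1);
    }

    // Largest value representable in an n-byte sequence.
    static long maxValue(int n) {
        return (1L << payloadBits(n)) - 1;
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 6; n++) {
            System.out.printf("%d byte(s): %2d bits, max 0x%X%n",
                    n, payloadBits(n), maxValue(n));
        }
    }
}
```

By the same reasoning, a UTF-32 code unit is 32 bits wide while U+10FFFF
needs only 21, so the spare capacity there is a matter of the conformance
rules rather than of the encoding form itself.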


