From: Kenneth Whistler (kenw@sybase.com)
Date: Fri Apr 21 2006 - 17:50:36 CST
> From: "Richard Wordingham" <richard.wordingham@ntlworld.com>
> > I think it would be prudent to reserve a
> > surrogate plane for fifty years if using sequences of three surrogates
> > (high-high-low and high-low-low) to extend UTF-16 is unacceptable.
> From: "Philippe Verdy" <verdy_p@wanadoo.fr>
> I already proposed to keep the few code points that are just
> between Hangul syllables and existing surrogates, unassigned
> until further notice (U+D7B0..U+D7FF), to create new types of
> surrogates, should they ever become necessary in some distant
> future. ...
[ long elaborated speculative hyper-surrogate scheme omitted ]
I'm not sure how many times I will have to debunk this stuff,
but here goes again.
Currently the UTC and WG2, which are the *only* committees that
can add encoded characters to the Unicode Standard and to 10646,
have been adding characters at a rate of roughly 1500
characters per year. There is no reason to believe that they
will suddenly increase the rate at which they add characters
to the standard. True, there are a few remaining known large
repertoires out there, including Egyptian hieroglyphs and
Chinese seal form characters, but work on those is slow, and
the numbers are in the low thousands, in any case, not in the
untold millions.
As of Unicode 5.0, there are 875,441 unassigned, reserved code
points in the standard.
875,441 available code points / 1500 characters encoded per year
= about 583 years
And that is assuming that the committees will continue to encode
1500 characters per year indefinitely, which they won't.
And that is assuming that Unicode itself will last for more than
500 years, which it won't.
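
For anyone who wants to rerun the arithmetic, here is a minimal sketch in
Python. The two input figures (875,441 and 1500) come straight from this
post; the function names, the 50-year horizon taken from the quoted
suggestion, and the cross-check against the fixed ranges of the code space
(surrogates, private use, noncharacters) are illustrative additions, not
anything published by the committees.

```python
# Figures quoted in the post above.
UNASSIGNED = 875_441     # unassigned, reserved code points as of Unicode 5.0
RATE_PER_YEAR = 1_500    # rough encoding rate cited in the post


def years_to_exhaust(unassigned: int, rate_per_year: float) -> float:
    """Years until the reserved code points run out at a constant rate."""
    if rate_per_year <= 0:
        raise ValueError("rate_per_year must be positive")
    return unassigned / rate_per_year


def rate_to_exhaust(unassigned: int, years: float) -> float:
    """Constant yearly rate needed to use up the reserve in `years` years."""
    if years <= 0:
        raise ValueError("years must be positive")
    return unassigned / years


# Fixed ranges of the code space (U+0000..U+10FFFF), used as a sanity check.
TOTAL = 0x110000                                    # 1,114,112 code points
SURROGATES = 0xDFFF - 0xD800 + 1                    # U+D800..U+DFFF
PRIVATE_USE = (0xF8FF - 0xE000 + 1) + 2 * (0xFFFFD - 0xF0000 + 1)
NONCHARACTERS = 32 + 2 * 17                         # U+FDD0..U+FDEF plus two per plane

# What is left over must be the code points already assigned
# (this is derived from the post's figure, not quoted from it).
IMPLIED_ASSIGNED = TOTAL - SURROGATES - PRIVATE_USE - NONCHARACTERS - UNASSIGNED

years = years_to_exhaust(UNASSIGNED, RATE_PER_YEAR)
needed = rate_to_exhaust(UNASSIGNED, 50)            # hypothetical 50-year horizon

print(f"Years at {RATE_PER_YEAR}/year: {years:.1f}")
print(f"Rate needed to run out in 50 years: {needed:.0f}/year "
      f"({needed / RATE_PER_YEAR:.1f}x the current rate)")
print(f"Implied assigned code points: {IMPLIED_ASSIGNED}")
```

The second figure is the one worth looking at: running out within fifty
years would take a sustained encoding rate more than eleven times the
current one.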
Would you *PLEASE* stop worrying about running out of code points
and dreaming up speculative, unnecessary kinds of surrogate
extension mechanisms and spreading them on the lists?
--Ken