From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Fri Nov 07 2003 - 19:32:59 EST
From: "Doug Ewell" <dewell@adelphia.net>
> Philippe Verdy wrote (in rich text):
>
> > Due to that, an application needs to specify whether it will support
> > and comply with the full ISO/IEC 10646-1:2000 character set or to the
> > Unicode subset.
>
> ISO/IEC 10646 has reduced its range to match Unicode's, so this
> distinction is obsolete.
It is not obsolete: Corrigendum #1 for UTF-8 (published in Unicode 4.0)
refers to ISO/IEC 10646-1:2000, not to ISO/IEC 10646:2003, which is the
edition whose character repertoire corresponds to Unicode 4.0.
So that is a reference error in the version of the now-normative Corrigendum
published in Unicode 4.0.
Does it need another Corrigendum to correct this reference in the
Corrigendum?
Well, I still doubt that ISO/IEC 10646 has reduced its character set. It has
just agreed to limit its repertoire of _standardized_ and _interchangeable_
characters to the first 17 planes, so that _these_ characters can remain in
sync with the Unicode repertoire and be encoded identically, with the same
code points. All the other planes are still present in ISO/IEC 10646, and
some of them are still allocated to PUAs that have no equivalents in
Unicode. Code points in those planes are still valid within UTF-8 encoded
data and still conform to ISO/IEC 10646. Even if they are illegal for use in
Unicode 4.0, these sequences are not ill-formed in the way that non-shortest
forms, now forbidden in both standards, are.
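To make the distinction above concrete, here is a rough Python sketch of a
classifier for a single UTF-8-style sequence. It assumes the original
1-to-6-byte UTF-8 layout (code points up to 31 bits) and sorts a sequence
into three groups: ill-formed (for example a non-shortest form), well-formed
but beyond the 17 planes (above U+10FFFF), or within the Unicode range. The
function name and the three labels are hypothetical and only illustrate the
argument made here; they are not a statement of what either standard
requires of a conforming decoder.

```python
ILL_FORMED = "ill-formed"
OUTSIDE_UNICODE = "well-formed, beyond plane 16"
UNICODE = "within Unicode range"

# (lead-byte mask, lead-byte pattern, sequence length, smallest code point
# that genuinely needs that length) for the 1- to 6-byte layout.
FORMS = [
    (0x80, 0x00, 1, 0x0),
    (0xE0, 0xC0, 2, 0x80),
    (0xF0, 0xE0, 3, 0x800),
    (0xF8, 0xF0, 4, 0x10000),
    (0xFC, 0xF8, 5, 0x200000),
    (0xFE, 0xFC, 6, 0x4000000),
]


def classify_utf8_sequence(data: bytes) -> str:
    """Classify exactly one encoded sequence (hypothetical helper)."""
    if not data:
        return ILL_FORMED
    lead = data[0]
    for mask, pattern, length, minimum in FORMS:
        if (lead & mask) == pattern:
            break
    else:
        # Stray continuation byte, or 0xFE / 0xFF.
        return ILL_FORMED
    if len(data) != length:
        return ILL_FORMED
    # Payload bits of the lead byte are the bits not covered by the mask.
    value = lead & (~mask & 0xFF)
    for byte in data[1:]:
        if (byte & 0xC0) != 0x80:
            return ILL_FORMED
        value = (value << 6) | (byte & 0x3F)
    if value < minimum:
        return ILL_FORMED  # non-shortest form
    if 0xD800 <= value <= 0xDFFF:
        return ILL_FORMED  # surrogate code points are not encodable in UTF-8
    if value > 0x10FFFF:
        return OUTSIDE_UNICODE  # planes 17 and up
    return UNICODE


if __name__ == "__main__":
    samples = [b"A", b"\xc0\x80", b"\xf4\x8f\xbf\xbf", b"\xf4\x90\x80\x80"]
    for sample in samples:
        print(sample.hex(), "->", classify_utf8_sequence(sample))
```

Under these assumptions, C0 80 (an overlong encoding of U+0000) is rejected
as ill-formed, while F4 90 80 80 (the first code point of plane 17) decodes
cleanly and is only flagged as lying outside the Unicode range.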