Re: SGML DESCSET for XML, HTML (was: XML and ISO 10646 ...)

From: Kenneth Whistler (kenw@sybase.com)
Date: Tue Aug 12 1997 - 17:11:44 EDT


> -- This DESCSET for UCS-4 fails to record that FFFE and FFFF on
> the non-UTF-16 planes are also non-characters.

By the way, I don't know whose goofy idea it was to proscribe FFFE and
FFFF on planes other than Plane 0 in ISO/IEC 10646. The *only* truly
proscribed values should be U+FFFE and U+FFFF (or U-0000FFFE and
U-0000FFFF). Eliminating those values on other planes creates
the same kind of Swiss-cheese effect in the encoding that we struggled
so hard to eliminate when fighting the battle of the NULLs in the
BMP.

For those of you out there who may still be confusing ISO/IEC 10646
with an EUC encoding, the impact of the proscription of FFFE and FFFF
on the DESCSET declaration for XML should be instructive.
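
To make the fragmentation concrete, here is a rough Python sketch that counts how many contiguous runs of permitted values a declaration would have to enumerate under each rule. It is not SGML DESCSET syntax. The names (`permitted_ranges`, `bmp_only_rule`) are made up for illustration, the 17-plane figure is just the UTF-16-reachable space picked as an example, and surrogates and other reserved areas are ignored entirely.

```python
PLANE_SIZE = 0x10000  # number of code values in one plane


def is_plane_final_pair(cp: int) -> bool:
    """True when the low 16 bits of cp are FFFE or FFFF."""
    return (cp & 0xFFFE) == 0xFFFE


def permitted_ranges(num_planes: int, bmp_only_rule: bool):
    """Return inclusive (first, last) runs of permitted values.

    bmp_only_rule=True  : only U+FFFE and U+FFFF are proscribed.
    bmp_only_rule=False : xxFFFE and xxFFFF are proscribed on every plane.
    Hypothetical helper; it models nothing except these two holes.
    """
    ranges = []
    start = 0
    for plane in range(num_planes):
        base = plane * PLANE_SIZE
        if plane == 0 or not bmp_only_rule:
            # Close the current run just before this plane's FFFE/FFFF pair.
            ranges.append((start, base + 0xFFFD))
            start = base + PLANE_SIZE
    end = num_planes * PLANE_SIZE - 1
    if start <= end:
        # Everything after the last hole is one final run.
        ranges.append((start, end))
    return ranges


if __name__ == "__main__":
    for rule in (True, False):
        runs = permitted_ranges(17, bmp_only_rule=rule)
        print(f"bmp_only_rule={rule}: {len(runs)} run(s)")
        for first, last in runs[:3]:
            print(f"  {first:08X}..{last:08X}")
```

With the holes confined to Plane 0 the space splits into two runs regardless of how many planes are covered. With holes on every plane the number of runs grows with the number of planes, which is the overhead a full UCS-4 declaration has to carry.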

--Ken Whistler

Help stamp out numerologists on standards committees!


