From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Fri Nov 14 2003 - 16:41:16 EST
From: "Kenneth Whistler" <kenw@sybase.com>
> Please disregard Philippe's misleading blatherings on this
> topic.
Thanks for dismissing everything I say, before finally saying the same thing
as me...
What I said is not blathering: I only said that UTR20 is useful only in the
context of text with layout or rendering markup.
When I answered, it was for generic use of XML, which does not imply any
markup related to the text content. So I concluded exactly as you did: what
matters is the effect of canonical normalization, leaving aside the other
"compatibility" characteristics of any Unicode character that has a
non-canonical decomposition.
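
To make the distinction concrete, here is a minimal sketch using Python's
standard unicodedata module. The helper name normalize_both and the three
sample characters are chosen only for illustration; they are not taken from
UTR20 or UAX #15.

```python
import unicodedata

def normalize_both(text: str) -> tuple[str, str]:
    """Return (NFC, NFKC) forms of text.

    NFC applies only canonical mappings; NFKC additionally applies
    compatibility mappings.
    """
    return (
        unicodedata.normalize("NFC", text),
        unicodedata.normalize("NFKC", text),
    )

# Illustrative samples (assumed, not from the original discussion):
samples = {
    "e + COMBINING ACUTE ACCENT": "e\u0301",
    "ANGSTROM SIGN (canonical singleton)": "\u212b",
    "LATIN SMALL LIGATURE FI (compatibility)": "\ufb01",
}

for label, text in samples.items():
    nfc, nfkc = normalize_both(text)
    # Print code points so the difference is visible in any terminal.
    as_hex = lambda s: " ".join(f"U+{ord(c):04X}" for c in s)
    print(f"{label}: NFC={as_hex(nfc)}  NFKC={as_hex(nfkc)}")
```

The ligature is left untouched by NFC and only changes under NFKC, whereas
the two canonical cases give the same result under both forms.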
I still maintain that an accurate list of compatibility characters already
exists, and it is not in UTR20 but in the composition exclusion list data
file (which has comments for singleton canonical decomposition mappings),
used in combination with the decomposition column of the UCD.
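
As a rough sketch of how the decomposition column can be read, the following
Python uses unicodedata.decomposition, which returns that UCD field as a
string. The function classify and its category labels are hypothetical names
for this example. The module does not expose the composition exclusion file
directly, so the sketch only checks whether NFC recomposes the character,
which is an approximation of that list.

```python
import unicodedata

def classify(ch: str) -> str:
    """Classify a single character by its UCD decomposition field."""
    field = unicodedata.decomposition(ch)
    if not field:
        return "none"
    parts = field.split()
    # Compatibility mappings carry a tag such as <compat> or <font>.
    if parts[0].startswith("<"):
        return "compatibility"
    # An untagged mapping to one code point is a canonical singleton.
    if len(parts) == 1:
        return "canonical singleton"
    # Untagged multi-code-point mapping: canonical. If NFC does not
    # rebuild the character, it is excluded from recomposition.
    if unicodedata.normalize("NFC", ch) != ch:
        return "canonical, not recomposed by NFC"
    return "canonical"

for ch in ["a", "\u00e9", "\u212b", "\u0958", "\ufb01"]:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: {classify(ch)}")
```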
If you want a good and _normative_ report, better to look at UAX #15 (Unicode
Normalization Forms) rather than UTR20, which is only _informative_ for both
the XML and Unicode standards and which, for me, is nearly useless outside of
some specific XML schemas like XHTML.
As Alexandre had not specified a text markup context in his question, but
only a generic XML context, UTR20 is not relevant for him...