From: Kent Karlsson (kentk@cs.chalmers.se)
Date: Mon Nov 10 2003 - 09:28:16 EST
Philippe Verdy wrote:
> > The decompositions cannot be changed.
>
> Is it true for compatibility decomposition? When I look at the Unicode
> stability policy, I thought it only meant the canonical mappings, or the
> fact that a canonical mapping cannot be changed to a compatibility
> mapping or the reverse, and that this mapping must remain stable.
>
> Under point #4, we have this sentence:
>
> Particularly in the situation where the Unicode Standard first
> encodes less-well documented characters and scripts, the
> exact character properties and behavior initially may not be
> well known.(...)
>
> This is our case.
And others. I'd really like to add (canonical even) decompositions
of multi-letter Hangul jamos. But we cannot even reinstate the
compatibility decompositions, since that would change the normal
forms.
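
To illustrate (a minimal sketch, using Python's standard unicodedata
module; the choice of U+1101 HANGUL CHOSEONG SSANGKIYEOK as a sample
multi-letter jamo is my own, and the helper name is made up): that jamo
has no decomposition mapping today, so every normalization form leaves
it untouched. Giving it one would alter the normalized form of any
string that contains it.

```python
import unicodedata

# Assumption for illustration: U+1101 stands in for a "multi-letter" jamo.
SSANGKIYEOK = "\u1101"  # HANGUL CHOSEONG SSANGKIYEOK
FORMS = ("NFC", "NFD", "NFKC", "NFKD")


def unchanged_by_normalization(text):
    """Return True if no normalization form alters the given text."""
    return all(unicodedata.normalize(form, text) == text for form in FORMS)


# An empty decomposition field means the character has no mapping at all.
print(repr(unicodedata.decomposition(SSANGKIYEOK)))
print(unchanged_by_normalization(SSANGKIYEOK))
```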
> * Compatibility decomposition tags (e.g. <font> vs. <compat>)
...
> So, as the change in AU length mark does not affect its identity,
> the compatibility decomposition tag may be added.
No. If it had a compatibility decomposition, the *tag* for it may
be changed, but it cannot be removed. Nor can one be added.
The latter two changes would change the normal forms for some
strings. Note that NFKC is used for IDN (for instance), as you mention.
The stability of IDN and other uses of normal forms is the main
reason for the Unicode stability policy regarding decompositions.
(I know the policy has been broken for some CJK characters.)
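
A rough sketch of the distinction, again with Python's unicodedata
module (the sample characters and the helper name are my own choices,
not anything from the stability policy): the tag is only a label on the
mapping, and normalization ignores it. What matters for NFKC/NFKD is
whether a mapping exists at all.

```python
import unicodedata


def decomposition_tag(ch):
    """Return the compatibility tag such as '<compat>', or None if absent."""
    field = unicodedata.decomposition(ch)
    if field.startswith("<"):
        return field.split(" ", 1)[0]
    return None


LIGATURE_FI = "\ufb01"      # LATIN SMALL LIGATURE FI, tagged <compat>
DOUBLE_STRUCK_C = "\u2102"  # DOUBLE-STRUCK CAPITAL C, tagged <font>

for ch in (LIGATURE_FI, DOUBLE_STRUCK_C, "a"):
    # Different tags, same treatment: NFKC applies any compatibility mapping.
    print(decomposition_tag(ch), repr(unicodedata.normalize("NFKC", ch)))
```

Changing <font> to <compat> on U+2102 would leave the NFKC result as it
is; removing the mapping, or adding one to a character such as "a" that
has none, would not.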
/kent k