From: Kenneth Whistler (kenw@sybase.com)
Date: Mon Jul 07 2008 - 15:39:42 CDT
> > U+2135 ALEF SYMBOL has
> > directionality L and U+05D0 HEBREW LETTER ALEF has directionality R.
>
> Allowing normalisation to resolve to a character with different
> directionality seems to me risky. Isn't there a danger of the strong RTL
> directionality of U+05D0 messing up layout if substituted for U+2135 in
> some circumstances?
Of course. Which is one of the reasons why the mathematical Hebrew
symbols were separately encoded in the first place. And anyone
who applies an NFKD/C normalization to a mathematical expression
containing compatibility characters deserves the hash they will
get as a result.
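
To make the hazard concrete, here is a minimal sketch, assuming Python's standard unicodedata module and a made-up expression string (the variable names are purely illustrative). It shows that NFC leaves U+2135 alone, while NFKC replaces it with U+05D0 and so changes the bidirectional class from L to R.

```python
import unicodedata

ALEF_SYMBOL = "\u2135"   # ALEF SYMBOL, bidirectional class L
HEBREW_ALEF = "\u05D0"   # HEBREW LETTER ALEF, bidirectional class R

# Hypothetical mathematical expression, used only for illustration.
expr = "x = " + ALEF_SYMBOL + "0"

# Canonical normalization does not touch the symbol.
nfc = unicodedata.normalize("NFC", expr)

# Compatibility normalization substitutes the Hebrew letter.
nfkc = unicodedata.normalize("NFKC", expr)

for label, text in (("original", expr), ("NFC", nfc), ("NFKC", nfkc)):
    # Print each code point with its bidirectional class.
    summary = [(f"U+{ord(ch):04X}", unicodedata.bidirectional(ch)) for ch in text]
    print(label, summary)
```

After NFKC the string contains a strong right-to-left character, so a bidi-aware renderer may reorder the surrounding neutrals and digits.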
> From a glyph perspective, the design of these two characters
> legitimately differs, since the symbol characters are often harmonised
> to Latin cap-height, while the traditional height of Hebrew text is
> between Latin cap- and x-height.
Another reason for their separate encoding.
> This seems to me a very unwelcome decomposition, but I suppose it is
> frozen thus for all time by stability agreements.
Keep in mind that in the deep prehistory of Unicode, *compatibility*
decompositions were added in part as a kind of poor man's
cross-reference tool and in part as an ideological statement
about Cleanicode by those opposed in principle to the
addition of "unnecessary" variants of "real" characters.
The architectural mistake, IMO, was in defining (much later) a normalization
form based solely on compatibility decompositions that had
much less of a consistent rationale than the canonical
decompositions, and then getting stuck with an uncorrectable
normalization form that people might end up applying in
inappropriate circumstances.
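
The canonical/compatibility distinction is visible in the character database itself. The following is a rough sketch, again assuming Python's unicodedata module; the helper name is_compatibility is hypothetical, and U+00E9 is chosen only as a familiar example of a canonical decomposition.

```python
import unicodedata

def is_compatibility(ch):
    """Return True if ch has a tagged (compatibility) decomposition mapping."""
    # Compatibility mappings carry a tag such as "<compat>" or "<font>";
    # canonical mappings are a bare list of code points.
    return unicodedata.decomposition(ch).startswith("<")

alef_symbol = "\u2135"   # ALEF SYMBOL
e_acute = "\u00E9"       # LATIN SMALL LETTER E WITH ACUTE

print(unicodedata.decomposition(alef_symbol))  # tagged compatibility mapping
print(unicodedata.decomposition(e_acute))      # untagged canonical mapping

# NFD applies canonical mappings only; NFKD applies both kinds.
nfd_alef = unicodedata.normalize("NFD", alef_symbol)
nfkd_alef = unicodedata.normalize("NFKD", alef_symbol)
```

Only the NFKD result differs from the input here, which is why the choice of normalization form matters for text containing such symbols.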
--Ken