From: Kenneth Whistler (kenw@sybase.com)
Date: Fri Jun 27 2003 - 19:15:36 EDT
Philippe Verdy said:
> I understand the frustration: if Unicode had not attempted to define
> combining classes, which were not necessary to Unicode, all
> existing combining characters would have been given a CC=0
> (or all the same 220 or 230 value).
Uh..., no.
Under this scheme, <a, diaeresis, underdot> would be distinct
from <a, underdot, diaeresis>, and the basis for defining a
canonical ordering which would equate them would be missing.
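As a rough illustration of that point, here is a minimal Python sketch. The helper `canonical_reorder` and the class lookups `all_zero` and `all_230` are hypothetical names made up for this example; only `unicodedata.combining` is a real standard-library call. The helper performs just the reordering step (a stable sort of each run of non-starters by combining class), not full normalization.

```python
import unicodedata

DIAERESIS = "\u0308"  # COMBINING DIAERESIS, combining class 230
UNDERDOT = "\u0323"   # COMBINING DOT BELOW, combining class 220


def canonical_reorder(text, ccc=unicodedata.combining):
    """Stable-sort each run of non-starters (ccc != 0) by combining class.

    `ccc` is pluggable so that hypothetical class assignments can be tried.
    """
    out, run = [], []
    for ch in text:
        if ccc(ch) == 0:
            # A starter ends the current run of combining marks.
            out.extend(sorted(run, key=ccc))  # sorted() is stable
            run = []
            out.append(ch)
        else:
            run.append(ch)
    out.extend(sorted(run, key=ccc))
    return "".join(out)


def all_zero(ch):
    """Hypothetical scheme: every character has cc=0."""
    return 0


def all_230(ch):
    """Hypothetical scheme: every combining mark has cc=230."""
    return 230 if unicodedata.combining(ch) else 0


seq_a = "a" + DIAERESIS + UNDERDOT
seq_b = "a" + UNDERDOT + DIAERESIS

# With the real classes (230 vs. 220) both orders reorder to one form.
print(canonical_reorder(seq_a) == canonical_reorder(seq_b))
# With a single shared class nothing can be reordered, so they stay distinct.
print(canonical_reorder(seq_a, all_zero) == canonical_reorder(seq_b, all_zero))
print(canonical_reorder(seq_a, all_230) == canonical_reorder(seq_b, all_230))
```

The stable sort is what matters here: marks that share a class keep their relative order, so a scheme with one class for everything has no way to equate the two spellings.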
It's time to go back and restudy Section 3.11, Canonical
Ordering Behavior, at:
http://www.unicode.org/book/preview/ch03.pdf
> This would have left the
> compatibility with legacy encodings and with Modern Hebrew,
> without breaking Traditional Hebrew.
In this *particular* case, for Hebrew vowel marks, it would
have been sufficient to give most of them (cc=220), as
Peter has suggested in his written proposal on the subject.
The problem is not the definition of combining classes, per
se, but the overenthusiastic assignment of "fixed position
classes", especially for Arabic and Hebrew marks, which
leads, in the Hebrew case, to the loss of vowel order distinctions
that the Biblical Hebrew scholars wish to maintain, even
in text which has been normalized.
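A small Python sketch of the Hebrew case follows. The choice of lamed with patah and hiriq is an example picked for illustration, not one given in this message, and `reorder` and `cc_220` are hypothetical helpers standing in for a cc=220 assignment; they do not reproduce the written proposal itself.

```python
import unicodedata

LAMED = "\u05DC"  # HEBREW LETTER LAMED, a starter (cc=0)
PATAH = "\u05B7"  # HEBREW POINT PATAH, fixed position class 17
HIRIQ = "\u05B4"  # HEBREW POINT HIRIQ, fixed position class 14

patah_first = LAMED + PATAH + HIRIQ
hiriq_first = LAMED + HIRIQ + PATAH

# The distinct fixed position classes force hiriq (14) before patah (17),
# so normalization erases whichever order the text was written in.
nfc_patah_first = unicodedata.normalize("NFC", patah_first)
nfc_hiriq_first = unicodedata.normalize("NFC", hiriq_first)
print(nfc_patah_first == nfc_hiriq_first)


def reorder(text, ccc):
    """Reordering step only: stable sort of each run of non-starters."""
    out, run = [], []
    for ch in text:
        if ccc(ch) == 0:
            out.extend(sorted(run, key=ccc))
            run = []
            out.append(ch)
        else:
            run.append(ch)
    out.extend(sorted(run, key=ccc))
    return "".join(out)


def cc_220(ch):
    """Hypothetical assignment: both vowel points share cc=220."""
    return 220 if ch in (PATAH, HIRIQ) else unicodedata.combining(ch)


# With one shared class the stable sort leaves the written order alone.
print(reorder(patah_first, cc_220) == patah_first)
print(reorder(hiriq_first, cc_220) == hiriq_first)
```

Under the shared class the two vowel orders survive as distinct strings, which is the distinction the fixed position classes take away.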
--Ken