From: Peter_Constable@sil.org
Date: Wed Jul 23 2003 - 09:37:21 EDT
Philippe Verdy wrote on 07/22/2003 09:18:35 PM:
> If there's an agreement about what should have been the best
> combining classes...
Describing what would be the best combining classes can be tricky for RTL
scripts if the canonical ordering is intended not only for purposes of
normalization and string comparison but also as a preferred order for
storage and editing interaction. The reason is that the combining classes
are intentionally based on visual position relative to the base character,
not on logical order. Arbitrarily, a LTR ordering ... < below left < below < below
right < ... is used, meaning that for RTL scripts combinations of marks will
be sequenced in the opposite order to the underlying line direction, and so
not in the logical order in which users will be thinking. As an example using
Hebrew, for a combination of (say) beth with qamats and dehi, preferred
classes according to the visual basis on which classes are defined would be
qamats = 220
dehi = 222
and so you'd get an encoded sequence of < beth, qamats, dehi >. But for the
user, the pre-positive dehi, being to the right of the qamats, would
probably be thought of as occurring before the qamats.
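To make the mismatch concrete, here is a minimal sketch of canonical reordering applied to that example. It assumes the hypothetical classes given above (qamats = 220, dehi = 222), which need not match the values actually assigned in the Unicode Character Database; the table HYPOTHETICAL_CCC and the function canonical_reorder are names made up for illustration, not part of any library.

```python
# Hypothetical combining classes taken from the example in this message.
# These are an assumption for illustration, not the values in the UCD.
HYPOTHETICAL_CCC = {
    "\u05D1": 0,    # HEBREW LETTER BET (base character, class 0)
    "\u05B8": 220,  # HEBREW POINT QAMATS, treated here as "below"
    "\u05AD": 222,  # HEBREW ACCENT DEHI, treated here as "below right"
}


def canonical_reorder(text, ccc=HYPOTHETICAL_CCC):
    """Stably sort each run of non-zero-class marks by combining class."""
    out = []
    run = []  # marks collected since the last class-0 character
    for ch in text:
        if ccc.get(ch, 0) == 0:
            # A class-0 character ends the current run of marks.
            out.extend(sorted(run, key=lambda c: ccc[c]))  # sorted() is stable
            run = []
            out.append(ch)
        else:
            run.append(ch)
    out.extend(sorted(run, key=lambda c: ccc[c]))
    return "".join(out)


BETH, QAMATS, DEHI = "\u05D1", "\u05B8", "\u05AD"

# The order a user reading right to left might type: dehi before qamats.
typed = BETH + DEHI + QAMATS
stored = canonical_reorder(typed)

# Canonical ordering puts the lower class first, so qamats precedes dehi.
print([f"U+{ord(c):04X}" for c in stored])
```

Under these assumed classes the typed sequence < beth, dehi, qamats > is reordered to < beth, qamats, dehi >, which is the visual LTR order rather than the order the user entered.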
Now, I said above that the classes were based arbitrarily on a visual LTR
order. A RTL ordering ... < below right < below < below left < ... could
have been used, but then the same mismatch would exist for LTR scripts. So,
the problem is not with the arbitrary choice of LTR visual ordering for the
classes.
- Peter
---------------------------------------------------------------------------
Peter Constable
Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485