From: Kent Karlsson (kentk@cs.chalmers.se)
Date: Mon Nov 10 2003 - 06:01:59 EST
Peter Kirk wrote:
> But does the Khmer script follow this rule? Please bear in mind that I
> know nothing about this script. But in TUS v4.0 10.4 p.281 I read:
>
> > Ordering of Syllable Components. The standard order of components in
> > an orthographic syllable as expressed in BNF is
> > B {R | C} {S {R}}* {{Z} V} {O} {S}
...
> > Z is the zero width non-joiner
...
> The first example given using ZWNJ, on p.282, starts with ba + ZWNJ +
> triisap + ii, i.e. <1794, ZWNJ, 17CA, 17B8>. 1794 is a base character
> (Lo), but 17CA and 17B8 are class 0 combining characters (Mn). The
> syntax implies that other Mn characters, e.g. robat, 17CC, may occur
> between the base character and the ZWNJ. So here is a case in natural
> language where ZWNJ may be both preceded and followed by combining
> characters, giving a technically defective combining
> sequence. Or have I misunderstood things here?
>
> Note that I am not proposing a change to Khmer, but just a clarification
> of definitions and the consistency of their application, and a good
> reason why what is allowed in Khmer would not be allowed in Hebrew.
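The character properties cited in the quoted text can be checked mechanically. The following is only a minimal sketch, assuming Python's standard `unicodedata` module (whose data reflects whatever Unicode version ships with the interpreter, not necessarily 4.0). The helper names `describe` and `marks_around_zwnj` are hypothetical, introduced here for illustration, and the second sequence (with robat inserted before ZWNJ) is constructed from the possibility raised above, not an example taken from TUS.

```python
import unicodedata

ZWNJ = "\u200C"  # ZERO WIDTH NON-JOINER

# The p.282 example as quoted above: ba + ZWNJ + triisap + ii
quoted_example = "\u1794" + ZWNJ + "\u17CA\u17B8"

# Hypothetical variant (an assumption for illustration): robat (17CC)
# placed between the base character and ZWNJ.
with_robat = "\u1794\u17CC" + ZWNJ + "\u17CA\u17B8"


def describe(text):
    """Return (code point, general category, combining class) per character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.category(ch), unicodedata.combining(ch))
        for ch in text
    ]


def marks_around_zwnj(text):
    """True if some ZWNJ is directly preceded AND directly followed by an Mn character."""
    for i, ch in enumerate(text):
        if ch != ZWNJ or i == 0 or i == len(text) - 1:
            continue
        before = unicodedata.category(text[i - 1])
        after = unicodedata.category(text[i + 1])
        if before == "Mn" and after == "Mn":
            return True
    return False


for row in describe(quoted_example):
    print(row)
print(marks_around_zwnj(quoted_example))
print(marks_around_zwnj(with_robat))
```

In the quoted example the ZWNJ directly follows the base letter, so only the robat variant puts a nonspacing mark on both sides of it. This check looks only at general categories; it does not attempt to implement the Khmer syllable BNF or the standard's definition of a defective combining sequence.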
I would see this use of ZWJ and ZWNJ as a mistake. But the publication
of this use led me to propose making ZWJ and ZWNJ combining
characters. However, that was not accepted, since it would interfere
with the Bidi algorithm. I'm not sure how bad that would be, though.
(I wouldn't be surprised if it would even be beneficial, though it would
be a break in method compared to the current specification.)
/kent k