Re: Tamil glyphs

From: Marco Cimarosti (marco.cimarosti@europe.com)
Date: Tue Sep 12 2000 - 07:26:12 EDT


Please ignore my previous message (subj "unicode@unicode.org", to Antoine,
cc uniglyph@egroups.com). Sorry about that.

Antoine Leca wrote:
> Marco.Cimarosti@icl.com wrote:
> [...]
> > In ordinary cases, a ZW[N]J inside a consonant cluster does not prevent
> > matra reordering. E.g., in Devanagari:
> >
> > U+0915, U+094D, U+200C, U+0915, U+093F (ka, virama, ZWNJ, ka, i matra)
> >
> > is regularly reordered around the cluster:
> >
> > 093F, 0915, 094D, 200C, 0915 (i matra, ka, virama, ZWNJ, ka)
> >
> > and rendered with this sequence of glyphs:
> >
> > i_matra, ka_nominal, virama, ka_nominal
>
> I am not sure this is the only way to interpret the use of ZWNJ here.
> Another way would be to consider the sequence ka+halant to be a separate
> syllable, and then ka+i to be a second syllable. Then, the correct
> rendering would be
> ka_nominal, virama, i_matra, ka_nominal

No, I think this would not be a correct implementation.

If I remember correctly, this behaviour is described in the Devanagari block
chapter of the Unicode book.

Unfortunately I am away from home and don't have the book with me, so I cannot
check. Please don't take my word for it!

ZWJ and ZWNJ should have this special meaning in conjunction with
Devanagari's virama, and any compliant renderer should implement it:

1) <consonant + virama + ZWJ> should render the "half consonant" glyph, if
available, regardless of the context.

2) <consonant + virama + ZWNJ> should render the "nominal" glyph with a
visible combining virama, regardless of the context.

Apart from these special display requirements, both sequences should be
considered an ordinary "dead consonant" (<consonant + virama>) and, if
they precede another consonant, should regularly form a "consonant
cluster". Consequently, the i vowel sign should reorder around the
whole cluster.
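
To make the behaviour described above concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not a real shaping engine: the function names (`reorder_i_matra`, `glyph_names`) and the glyph names are made up for this example, only the basic consonants U+0915..U+0939 and the i matra U+093F are handled, and ordinary conjunct formation (no joiner present) is not modelled at all.

```python
import unicodedata

VIRAMA = "\u094D"   # DEVANAGARI SIGN VIRAMA
ZWNJ = "\u200C"     # ZERO WIDTH NON-JOINER
ZWJ = "\u200D"      # ZERO WIDTH JOINER
I_MATRA = "\u093F"  # DEVANAGARI VOWEL SIGN I


def is_consonant(ch):
    # Assumption for this sketch: only the basic consonants KA..HA.
    return "\u0915" <= ch <= "\u0939"


def reorder_i_matra(text):
    """Move each i matra in front of the consonant cluster it follows.

    <consonant + virama [+ ZWJ/ZWNJ]> followed by a consonant is treated
    as part of one cluster, so the joiner does not stop the reordering.
    """
    out = []
    i, n = 0, len(text)
    while i < n:
        if not is_consonant(text[i]):
            out.append(text[i])
            i += 1
            continue
        j = i + 1  # j ends up one past the last consonant of the cluster
        while True:
            k = j
            if k < n and text[k] == VIRAMA:
                k += 1
                if k < n and text[k] in (ZWJ, ZWNJ):
                    k += 1
                if k < n and is_consonant(text[k]):
                    j = k + 1
                    continue
            break
        cluster = text[i:j]
        if j < n and text[j] == I_MATRA:
            out.append(I_MATRA)  # matra goes before the whole cluster
            j += 1
        out.append(cluster)
        i = j
    return "".join(out)


def glyph_names(visual):
    """Map a visually ordered string to (hypothetical) glyph names."""
    glyphs = []
    i, n = 0, len(visual)
    while i < n:
        ch = visual[i]
        if is_consonant(ch):
            name = unicodedata.name(ch).split()[-1].lower()
            joiner = visual[i + 1:i + 3]
            if joiner == VIRAMA + ZWJ:      # rule 1: half form
                glyphs.append(name + "_half")
                i += 3
                continue
            if joiner == VIRAMA + ZWNJ:     # rule 2: nominal + visible virama
                glyphs += [name + "_nominal", "virama"]
                i += 3
                continue
            glyphs.append(name + "_nominal")
        elif ch == VIRAMA:
            glyphs.append("virama")
        elif ch == I_MATRA:
            glyphs.append("i_matra")
        # Anything else (e.g. a stray joiner) yields no glyph in this sketch.
        i += 1
    return glyphs
```

With this sketch, `glyph_names(reorder_i_matra("\u0915\u094D\u200C\u0915\u093F"))` gives `i_matra, ka_nominal, virama, ka_nominal`, i.e. the matra is placed before the whole cluster even though a ZWNJ sits inside it.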

I don't know how these rules extend to other Indic scripts. I think that #2
is general, while #1 only makes sense for other scripts having "half
consonants" (e.g. Gujarati).

> BTW, Microsoft's Uniscribe chooses the latter way.

I would call it a bug, rather than a choice.

> Similarly (and this is perhaps linked), CDAC's engine chooses the same
> "solution" for the ISCII-91 explicit halant (coded with two consecutive
> halants).

I don't know ISCII; it might be different from Unicode in this respect.

> > It would be nice if the possibility of reordering matras around a ZW[N]J
> > could be generalized, e.g. if:
> >
> > U+0915, U+200C, U+093F (ka, ZWNJ, i matra)
> >
> > would regularly reorder as:
> >
> > 093F, 0915, 200C (i matra, ka, ZWNJ)
> >
> > producing the following sequence of glyphs
> >
> > i_matra, ka
> >
> > The ZWNJ would simply be there to prevent a hypothetical single-glyph
> > sequence:
> >
> > ka_i_matra_ligature
>
> What is the point?
>
> Since (in the general case of a non-fixed typeface where glyphs are not
> always of the same width) the i_matra has to adapt itself to the glyph
> of the consonant, the usual form is merely
>
> i_for_ki_matra, ka
>
> Do you want to prevent this adaptation as well? If yes, what should the
> rendering look like (depending on the font, the "default" i_matra
> varies greatly).

I made a silly example.

However, you caught my point: the ZWNJ would mean to use a "default"
i_matra and a "default" ka.

But, as I said, this *Devanagari* example is totally pointless. I should have
used one of the *Tamil* examples that we were discussing. I didn't do it in
the first place because I am not very familiar with Tamil; but here it is:

U+0BA9, U+200C, U+0BC8 (nnna, ZWNJ, ai matra)

would regularly reorder as:

0BC8, 0BA9, 200C (ai matra, nnna, ZWNJ)

producing the following sequence of glyphs

ai_matra, nnna

The ZWNJ would simply be there to prevent the normal single-glyph sequence:

nnna_ai_matra_ligature

Notice that, unlike the rules for <virama + ZW[N]J> above, this is just my
idea, not part of the standard.

So, I would not suggest using it in encoded text, because the same text would
look different on different implementations!
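
For what it is worth, the Tamil idea above could be sketched like this in Python. This is purely hypothetical behaviour (as said, it is not part of the standard), the function name `shape_nnna_ai` and the glyph names are invented for the example, and only the single pair nnna + ai matra is handled.

```python
ZWNJ = "\u200C"        # ZERO WIDTH NON-JOINER
TAMIL_NNNA = "\u0BA9"  # TAMIL LETTER NNNA
TAMIL_AI = "\u0BC8"    # TAMIL VOWEL SIGN AI


def shape_nnna_ai(text):
    """Return hypothetical glyph names for nnna (+ ZWNJ) + ai matra.

    Without ZWNJ the pair becomes one ligature glyph; with a ZWNJ in
    between, the matra is still reordered before the consonant, but two
    separate "default" glyphs are used instead of the ligature.
    """
    glyphs = []
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if ch == TAMIL_NNNA:
            j = i + 1
            blocked = j < n and text[j] == ZWNJ  # proposed use of ZWNJ
            if blocked:
                j += 1
            if j < n and text[j] == TAMIL_AI:
                if blocked:
                    glyphs += ["ai_matra", "nnna"]
                else:
                    glyphs.append("nnna_ai_matra_ligature")
                i = j + 1
                continue
            glyphs.append("nnna")
        elif ch == TAMIL_AI:
            glyphs.append("ai_matra")
        # A ZWNJ that is not followed by the matra yields no glyph here.
        i += 1
    return glyphs
```

An engine that does not implement this idea might instead leave the matra after the ZWNJ, which is exactly why the same text could look different on different implementations.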

_ Marco

______________________________________________
FREE Personalized Email at Mail.com
Sign up at http://www.mail.com/?sr=signup



This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:21:13 EDT