Dear Mr. Marco,

Sorry for sending an unsolicited mail to you.

I am interested in learning a lot more about the Unicode system. I own a translation agency in India named MULTI-LINGUIST. We have been in the translation business for just about one year, and wish to adopt the Unicode system.

Besides what I have downloaded from the Unicode website, can you help me by clearing up the following doubts (all questions relate to Indian languages):

1) Is typing in Unicode different from that of the regular ttf fonts?

2) Are there special fonts for providing translations in Unicode format?

3) If YES, then how can we obtain those fonts? If NO, then is it that the same ttf fonts are converted to Unicode fonts? In case separate fonts exist, is the keyboard arrangement totally different or the same?

4) Is it possible to convert some text in a WORD doc into Unicode text by using some commands?

5) Is there any program for Unicode?

I am really enthusiastic to know about Unicode in depth. Can you suggest how to go about all this?

Best regards

Paresh Agarwal

----- Original Message -----

From: Marco Cimarosti

To: Unicode List

Cc: uniglyph@egroups.com

Sent: Tuesday, September 12, 2000 4:46 PM

Subject: Re: Tamil glyphs

Please ignore my previous message (subj "mailto:unicode@unicode.org", to Antoine,
cc uniglyph@egroups.com). Sorry about that.

Antoine Leca wrote:
> Marco.Cimarosti@icl.com wrote:
> [...]
> > In ordinary cases, a ZW[N]J inside a consonant cluster does not prevent
> > matra reordering. E.g., in Devanagari:
> >
> >         U+0915, U+094D, U+200C, U+0915, U+093F (ka, virama, ZWNJ, ka, i matra)
> >
> > is regularly reordered around the cluster:
> >
> >         093F, 0915, 094D, 200C, 0915 (i matra, ka, virama, ZWNJ, ka)
> >
> > and rendered with this sequence of glyphs:
> >
> >         i_matra, ka_nominal, virama, ka_nominal
>
> I am not sure this is the only way to interpret the use of ZWNJ here.
> Another way would be to consider the sequence ka+halant to be a separate
> syllable, and then ka+i to be a second syllable. Then, the correct
> rendering would be
>           ka_nominal, virama, i_matra, ka_nominal

No, I think this would not be a correct implementation.

If I remember correctly, this behaviour is described in the Devanagari block
chapter of the Unicode book.

Unfortunately I am away from home and don't have the book with me, so I cannot
check. Please don't take my word for it!

ZWJ and ZWNJ should have this special meaning in conjunction with
Devanagari's virama, and any compliant renderer should implement it:

1) <consonant + virama + ZWJ> should render the "half consonant" glyph, if
available, regardless of the context.

2) <consonant + virama + ZWNJ> should render the "nominal" glyph with a
visible combining virama, regardless of the context.

Apart from these special display requirements, both sequences should be
considered an ordinary "dead consonant" (<consonant + virama>) and, if
they precede another consonant, should form a regular "consonant
cluster". Consequently, the i vowel sign should reorder around the
whole cluster.
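
The two rules and the reordering described above can be sketched as a toy
shaper. This is only an illustration under stated assumptions, not code from
any real renderer: the function name shape_cluster, the glyph labels, and the
choice of a half form when no joiner is present are all hypothetical.

```python
# Toy sketch only: names and glyph labels are hypothetical, not a real shaper API.
VIRAMA, I_MATRA = 0x094D, 0x093F
ZWNJ, ZWJ = 0x200C, 0x200D
NAMES = {0x0915: "ka"}  # illustrative glyph-name table; extend as needed


def is_consonant(cp):
    # Devanagari consonants KA..HA occupy U+0915..U+0939.
    return 0x0915 <= cp <= 0x0939


def shape_cluster(cps):
    """Map one consonant cluster (plus an optional i matra) to glyph names."""
    glyphs, prebase, i = [], [], 0
    while i < len(cps):
        cp = cps[i]
        if is_consonant(cp):
            name = NAMES.get(cp, f"u{cp:04X}")
            if i + 1 < len(cps) and cps[i + 1] == VIRAMA:
                joiner = cps[i + 2] if i + 2 < len(cps) else None
                if joiner == ZWNJ:
                    # Rule 2: nominal glyph with a visible virama.
                    glyphs += [f"{name}_nominal", "virama"]
                    i += 3
                elif joiner == ZWJ:
                    # Rule 1: half-consonant glyph.
                    glyphs.append(f"{name}_half")
                    i += 3
                else:
                    # Assumption: with no joiner the toy font uses a half form.
                    glyphs.append(f"{name}_half")
                    i += 2
            else:
                glyphs.append(f"{name}_nominal")
                i += 1
        elif cp == I_MATRA:
            # The i matra is drawn before the whole cluster,
            # whether or not a joiner appears inside it.
            prebase.append("i_matra")
            i += 1
        else:
            i += 1  # anything else is ignored in this sketch
    return prebase + glyphs


print(shape_cluster([0x0915, 0x094D, 0x200C, 0x0915, 0x093F]))
```

With the ZWNJ sequence from the example quoted earlier, the sketch yields
i_matra, ka_nominal, virama, ka_nominal: the joiner changes only the form of
the dead consonant, not the position of the matra.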

I don't know how these rules extend to other Indic scripts. I think that #2
is general, while #1 only makes sense for other scripts having "half
consonants" (e.g. Gujarati).

> BTW, Microsoft's Uniscribe chooses the latter way.

I would call it a bug, rather than a choice.

> Similarly (and this is perhaps linked), CDAC's engine chooses the same
> "solution" for the ISCII-91 explicit halant (coded with two consecutive
> halants).

I don't know ISCII; it might be different from Unicode in this respect.

> > It would be nice if the possibility of reordering matras around a ZW[N]J
> > could be generalized, e.g. if:
> >
> >         U+0915, U+200C, U+093F (ka, ZWNJ, i matra)
> >
> > would regularly reorder as:
> >
> >         093F, 0915, 200C (i matra, ka, ZWNJ)
> >
> > producing the following sequence of glyphs
> >
> >         i_matra, ka
> >
> > The ZWNJ would simply be there to prevent a hypothetical single-glyph
> > sequence:
> >
> >         ka_i_matra_ligature
>
> What is the point?
>
> Since (in the general case of a non-fixed typeface where glyphs are not
> always of the same width) the i_matra has to adapt itself to the glyph
> of the consonant, the usual form is merely
>
>           i_for_ki_matra, ka
>
> Do you want to prevent this adaptation as well? If yes, what should the
> rendering look like (depending on the font, the "default" i_matra
> varies greatly).

I made a silly example.

However, you caught my point: the ZWNJ would mean to use a "default"
i_matra and a "default" ka.

But, as I said, this *Devanagari* example is totally pointless. I should have
used one of the *Tamil* examples that we were discussing. I didn't do it in
the first place because I am not very familiar with Tamil; but here it is:

U+0BA9, U+200C, U+0BC8 (nnna, ZWNJ, ai matra)

would regularly reorder as:

0BC8, 0BA9, 200C (ai matra, nnna, ZWNJ)

producing the following sequence of glyphs

ai_matra, nnna

The ZWNJ would simply be there to prevent the normal single-glyph sequence:

nnna_ai_matra_ligature

Notice that, unlike the rules for <virama + ZW[N]J> above, this is just my
idea, not part of the standard.

So, I would not suggest using it in encoding, because the same text would
look different on different implementations!
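
To make the proposal concrete, here is a minimal sketch of the non-standard
behaviour described above. Everything in it is an assumption for illustration:
the ligature table, the glyph names, and the function shape_syllable are
hypothetical, and the sketch handles only a single consonant followed by a
single matra.

```python
# Toy sketch of the proposed (non-standard) behaviour; all names are hypothetical.
NNNA, AI_MATRA, ZWNJ = 0x0BA9, 0x0BC8, 0x200C
NAMES = {NNNA: "nnna", AI_MATRA: "ai_matra"}
# Hypothetical font table of consonant + matra ligatures.
LIGATURES = {(NNNA, AI_MATRA): "nnna_ai_matra_ligature"}


def shape_syllable(cps):
    """Shape one consonant + matra pair, optionally separated by ZWNJ."""
    blocked = ZWNJ in cps  # a ZWNJ requests the unligated form
    consonant, matra = [cp for cp in cps if cp != ZWNJ]
    if not blocked and (consonant, matra) in LIGATURES:
        return [LIGATURES[(consonant, matra)]]
    # The ai matra is drawn before its consonant, so it is emitted first.
    return [NAMES[matra], NAMES[consonant]]


print(shape_syllable([NNNA, AI_MATRA]))        # ligated form
print(shape_syllable([NNNA, ZWNJ, AI_MATRA]))  # separate glyphs
```

A renderer that does not implement this idea would treat the ZWNJ differently,
which is exactly why the same text could look different across implementations.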

_ Marco

______________________________________________
FREE Personalized Email at Mail.com
Sign up at http://www.mail.com/?sr=signup
