Re: Re: Regd- ISCII to Unicode Converter!

From: Mark Davis (mark@macchiato.com)
Date: Fri Apr 05 2002 - 21:39:46 EST


Have you checked out http://www.unicode.org/unicode/faq/indic.html#16?

Mark
—————

Γνῶθι σαυτόν — Θαλῆς
[For transliteration, see http://oss.software.ibm.com/cgi-bin/icu/tr]

http://www.macchiato.com

----- Original Message -----
From: "Ram Viswanadha" <ram@jtcsv.com>
To: "Marco Cimarosti" <marco.cimarosti@essetre.it>; "Markus Scherer"
<markus.scherer@jtcsv.com>
Cc: <unicode@unicode.org>; <Federic.Zhang@sun.com>
Sent: Thursday, April 04, 2002 16:43
Subject: Re: Re: Regd- ISCII to Unicode Converter!

> Marco,
>
> > Why do you say that these are not round-trip compatible?
> The point I was trying to make is that the INV->ZWJ conversion can be
> thought of as a kind of fallback: you may be able to round-trip in most
> cases, but not all. I do agree that the conversions you pointed out can be
> round-tripped. But does that mean that if I convert an ISCII stream to
> Unicode, I will be able to render the stream correctly? I think not.
>
> > Does ISCII have
> > VOWEL SIGN VOCALIC L, VOWEL SIGN VOCALIC RR, VOWEL SIGN VOCALIC LL?
>
> Yes, it does, in combination with NUKTA.
>
> 0xAA, 0xE9, /* RI + NUKTA            => 0x0960 Vocalic RR */
> 0xDF, 0xE9, /* Vowel sign RI + NUKTA => 0x0944 Vowel Sign Vocalic RR */
> 0xA6, 0xE9, /* Vowel I + NUKTA       => 0x090C Vowel Vocalic L */
> 0xDB, 0xE9, /* Vowel sign I + NUKTA  => 0x0962 Vowel Sign Vocalic L */
> 0xA7, 0xE9, /* Vowel II + NUKTA      => 0x0961 Vowel Vocalic LL */
> 0xDC, 0xE9, /* Vowel sign II + NUKTA => 0x0963 Vowel Sign Vocalic LL */
> 0xA1, 0xE9, /* Chandrabindu + NUKTA  => 0x0950 Om */
> 0xEA, 0xE9, /* Danda + NUKTA         => 0x093D Avagraha */
>
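
As an illustration only, the NUKTA pairs listed above could be held in a small lookup table. The C sketch below is not the ICU converter code: the names `nukta_pairs` and `iscii_nukta_to_unicode` are hypothetical, and it uses only the byte values and code points quoted in the message.

```c
#include <stddef.h>
#include <stdint.h>

#define ISCII_NUKTA 0xE9

/* Hypothetical table: ISCII byte that precedes NUKTA, and the single
 * Unicode code point the pair maps to (values as quoted in the message). */
static const struct {
    uint8_t  iscii;
    uint16_t unicode;
} nukta_pairs[] = {
    { 0xAA, 0x0960 }, /* RI + NUKTA            => Vocalic RR            */
    { 0xDF, 0x0944 }, /* Vowel sign RI + NUKTA => Vowel Sign Vocalic RR */
    { 0xA6, 0x090C }, /* Vowel I + NUKTA       => Vowel Vocalic L       */
    { 0xDB, 0x0962 }, /* Vowel sign I + NUKTA  => Vowel Sign Vocalic L  */
    { 0xA7, 0x0961 }, /* Vowel II + NUKTA      => Vowel Vocalic LL      */
    { 0xDC, 0x0963 }, /* Vowel sign II + NUKTA => Vowel Sign Vocalic LL */
    { 0xA1, 0x0950 }, /* Chandrabindu + NUKTA  => Om                    */
    { 0xEA, 0x093D }, /* Danda + NUKTA         => Avagraha              */
};

/* Returns the code point for a (lead, NUKTA) pair from the table,
 * or 0 if the second byte is not NUKTA or the pair is not listed. */
uint16_t iscii_nukta_to_unicode(uint8_t lead, uint8_t next)
{
    size_t i;

    if (next != ISCII_NUKTA)
        return 0;
    for (i = 0; i < sizeof nukta_pairs / sizeof nukta_pairs[0]; i++) {
        if (nukta_pairs[i].iscii == lead)
            return nukta_pairs[i].unicode;
    }
    return 0;
}
```

A caller would try this lookup before falling back to a byte-by-byte mapping, since each listed pair collapses two ISCII bytes into one code point.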
>
> > > 4) INV+HALANT+RA => RAsub
> >
> > I think that there is no reason why ZWJ+HALANT+RA alone shouldn't
> > represent RAsub in Unicode as well.
> > Actually, I think that also HALANT+RA alone should be enough to represent
> > RAsub (in Unicode, at least). But ZWJ should not harm, so one may retain
> > it for round-trip compatibility with ISCII's INV.
>
> You are correct if ZWJ is treated like any other consonant, but that is
> unclear from the rendering rules, so applications have a choice: try to
> do the right thing, or do nothing.
> I checked how the combinations below are rendered in Notepad on Win2000
> and in our Layout demo, and they do not render HALANT+RA as RAsub.
>
> ISCII                 Rendered
> =====                 ========
> KA+INV+HALANT+RA      KA |RAsub|      /* RAsub does not combine with KA */
> INV+HALANT+RA         RAsub
>
> Converted to Unicode:
>
> Unicode               Rendered
> =======               ========
> KA+ZWJ+HALANT+RA      KA |HALANT| RA
> ZWJ+HALANT+RA         |HALANT|RA
>
> /* Or even */
> HALANT+RA             |HALANT|RA
>
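
To make the comparison above concrete, here is a minimal C sketch of a byte-for-byte conversion of the first test sequence. It assumes the Devanagari ISCII byte values KA = 0xB3, RA = 0xCF, INV = 0xD9 and HALANT = 0xE8, which are not stated in the thread, and the function names are hypothetical. It shows only the INV->ZWJ fallback mapping under discussion, not a complete converter.

```c
#include <stddef.h>
#include <stdint.h>

/* Maps only the four ISCII bytes used in the examples above.
 * The byte values are assumptions for Devanagari ISCII. */
static uint16_t map_iscii_byte(uint8_t b)
{
    switch (b) {
    case 0xB3: return 0x0915; /* KA                          */
    case 0xCF: return 0x0930; /* RA                          */
    case 0xD9: return 0x200D; /* INV -> ZWJ (fallback)       */
    case 0xE8: return 0x094D; /* HALANT -> DEVANAGARI VIRAMA */
    default:   return 0xFFFD; /* not covered by this sketch  */
    }
}

/* Converts n ISCII bytes to n UTF-16 code units, one for one.
 * The caller must provide room for n code units in out. */
size_t iscii_to_unicode_sketch(const uint8_t *in, size_t n, uint16_t *out)
{
    size_t i;

    for (i = 0; i < n; i++)
        out[i] = map_iscii_byte(in[i]);
    return n;
}
```

Feeding it KA+INV+HALANT+RA yields KA+ZWJ+HALANT+RA, the first Unicode sequence in the table; whether that renders as RAsub is then up to the rendering engine, which is the point in question.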
> Regards,
>
> Ram
> ---------------------------------------------------
> Ram Viswanadha
> International Components For Unicode
> GCoC San Jose
> IBM
>
>
>



This archive was generated by hypermail 2.1.2 : Fri Apr 05 2002 - 22:21:34 EST