Re: Decomposition of Indic vowels

From: Mark Davis (marked@best.com)
Date: Sun Nov 01 1998 - 19:25:18 EST


Due to the structure of Indic scripts, there are multiple models that
could have been used in encoding them. For example:

1. no dependents encoded:
  each <dependent vowel> is coded as <virama> + <independent vowel>
2. no independents encoded:
  each <independent vowel> (except 0905 <A>) is coded as <A> + <dependent
vowel>
3. both are encoded.

In the Unicode Standard, we explicitly chose to follow ISCII and use
model 3. That means that the "alternative" encodings in 1 and 2 are
explicitly not equivalent, and should not render the same as the
explicitly encoded characters.
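
As a rough illustration of that non-equivalence, here is a minimal
sketch using Python's standard unicodedata module. It assumes the
Unicode data shipped with the interpreter (a much later version than
the one under discussion), and the helper name canonically_equivalent
is made up for this example. It compares the model 1 and model 2
spellings of the AA vowel with the explicitly encoded characters.

```python
import unicodedata

A = "\u0905"        # DEVANAGARI LETTER A
AA = "\u0906"       # DEVANAGARI LETTER AA (independent vowel)
SIGN_AA = "\u093E"  # DEVANAGARI VOWEL SIGN AA (dependent vowel)
VIRAMA = "\u094D"   # DEVANAGARI SIGN VIRAMA


def canonically_equivalent(s, t):
    """Hypothetical helper: two strings are canonically equivalent
    when their NFD normalizations are identical."""
    return unicodedata.normalize("NFD", s) == unicodedata.normalize("NFD", t)


# Model 2 spelling: <A> + <dependent vowel> in place of the independent vowel.
model2_spelling = A + SIGN_AA
# Model 1 spelling: <virama> + <independent vowel> in place of the dependent vowel.
model1_spelling = VIRAMA + AA

print(canonically_equivalent(AA, model2_spelling))
print(canonically_equivalent(SIGN_AA, model1_spelling))
# The independent vowel carries no decomposition mapping at all.
print(repr(unicodedata.decomposition(AA)))
```

Both comparisons come out false, and the decomposition field for 0906
is empty, which is what model 3 implies.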

John Cowan wrote:

> I processed Unidata 2.1.5 (using the "ex" editor, plus sorting!)
> to produce a set of properties for the currently undecomposable Indic
> vowels. I excluded Thai, Lao, and Tibetan, which don't seem to
> work on the same principles.
>
> This is not a proposal, just a specimen. I don't have the
> necessary knowledge to make it a proposal.
>
> 093E;DEVANAGARI VOWEL SIGN AA;Mc;0;L;<vowel> 0906;;;;N;;;;;
> 093F;DEVANAGARI VOWEL SIGN I;Mc;0;L;<vowel> 0907;;;;N;;;;;
> 0940;DEVANAGARI VOWEL SIGN II;Mc;0;L;<vowel> 0908;;;;N;;;;;
> 0941;DEVANAGARI VOWEL SIGN U;Mn;38;ON;<vowel> 0909;;;;N;;;;;
> 0942;DEVANAGARI VOWEL SIGN UU;Mn;39;ON;<vowel> 090A;;;;N;;;;;
> 0943;DEVANAGARI VOWEL SIGN VOCALIC R;Mn;40;ON;<vowel> 090B;;;;N;;;;;
> 0944;DEVANAGARI VOWEL SIGN VOCALIC RR;Mn;41;ON;<vowel> 0960;;;;N;;;;;
> 0945;DEVANAGARI VOWEL SIGN CANDRA E;Mn;42;ON;<vowel> 090D;;;;N;;;;;
> 0946;DEVANAGARI VOWEL SIGN SHORT E;Mn;43;ON;<vowel> 090E;;;;N;;;;;
> 0947;DEVANAGARI VOWEL SIGN E;Mn;44;ON;<vowel> 090F;;;;N;;;;;
> 0948;DEVANAGARI VOWEL SIGN AI;Mn;45;ON;<vowel> 0910;;;;N;;;;;
> 0949;DEVANAGARI VOWEL SIGN CANDRA O;Mc;0;L;<vowel> 0911;;;;N;;;;;
> 094A;DEVANAGARI VOWEL SIGN SHORT O;Mc;0;L;<vowel> 0912;;;;N;;;;;
> 094B;DEVANAGARI VOWEL SIGN O;Mc;0;L;<vowel> 0913;;;;N;;;;;
> 094C;DEVANAGARI VOWEL SIGN AU;Mc;0;L;<vowel> 0914;;;;N;;;;;
> 0962;DEVANAGARI VOWEL SIGN VOCALIC L;Mn;48;ON;<vowel> 090C;;;;N;;;;;
> 0963;DEVANAGARI VOWEL SIGN VOCALIC LL;Mn;49;ON;<vowel> 0961;;;;N;;;;;
> 09BE;BENGALI VOWEL SIGN AA;Mc;0;L;<vowel> 0986;;;;N;;;;;
> 09BF;BENGALI VOWEL SIGN I;Mc;0;L;<vowel> 0987;;;;N;;;;;
> 09C0;BENGALI VOWEL SIGN II;Mc;0;L;<vowel> 0988;;;;N;;;;;
> 09C1;BENGALI VOWEL SIGN U;Mn;51;ON;<vowel> 0989;;;;N;;;;;
> 09C2;BENGALI VOWEL SIGN UU;Mn;52;ON;<vowel> 098A;;;;N;;;;;
> 09C3;BENGALI VOWEL SIGN VOCALIC R;Mn;53;ON;<vowel> 098B;;;;N;;;;;
> 09C4;BENGALI VOWEL SIGN VOCALIC RR;Mn;54;ON;<vowel> 09E0;;;;N;;;;;
> 09C7;BENGALI VOWEL SIGN E;Mc;0;L;<vowel> 098F;;;;N;;;;;
> 09C8;BENGALI VOWEL SIGN AI;Mc;0;L;<vowel> 0990;;;;N;;;;;
> 09E2;BENGALI VOWEL SIGN VOCALIC L;Mn;55;ON;<vowel> 098C;;;;N;;;;;
> 09E3;BENGALI VOWEL SIGN VOCALIC LL;Mn;56;ON;<vowel> 09E1;;;;N;;;;;
> 0A3E;GURMUKHI VOWEL SIGN AA;Mc;0;L;<vowel> 0A06;;;;N;;;;;
> 0A3F;GURMUKHI VOWEL SIGN I;Mc;0;L;<vowel> 0A07;;;;N;;;;;
> 0A40;GURMUKHI VOWEL SIGN II;Mc;0;L;<vowel> 0A08;;;;N;;;;;
> 0A41;GURMUKHI VOWEL SIGN U;Mn;58;ON;<vowel> 0A09;;;;N;;;;;
> 0A42;GURMUKHI VOWEL SIGN UU;Mn;59;ON;<vowel> 0A0A;;;;N;;;;;
> 0A47;GURMUKHI VOWEL SIGN EE;Mn;60;ON;<vowel> 0A0F;;;;N;;;;;
> 0A48;GURMUKHI VOWEL SIGN AI;Mn;61;ON;<vowel> 0A10;;;;N;;;;;
> 0A4B;GURMUKHI VOWEL SIGN OO;Mn;62;ON;<vowel> 0A13;;;;N;;;;;
> 0A4C;GURMUKHI VOWEL SIGN AU;Mn;63;ON;<vowel> 0A14;;;;N;;;;;
> 0ABE;GUJARATI VOWEL SIGN AA;Mc;0;L;<vowel> 0A86;;;;N;;;;;
> 0ABF;GUJARATI VOWEL SIGN I;Mc;0;L;<vowel> 0A87;;;;N;;;;;
> 0AC0;GUJARATI VOWEL SIGN II;Mc;0;L;<vowel> 0A88;;;;N;;;;;
> 0AC1;GUJARATI VOWEL SIGN U;Mn;68;ON;<vowel> 0A89;;;;N;;;;;
> 0AC2;GUJARATI VOWEL SIGN UU;Mn;69;ON;<vowel> 0A8A;;;;N;;;;;
> 0AC3;GUJARATI VOWEL SIGN VOCALIC R;Mn;70;ON;<vowel> 0A8B;;;;N;;;;;
> 0AC4;GUJARATI VOWEL SIGN VOCALIC RR;Mn;71;ON;<vowel> 0AE0;;;;N;;;;;
> 0AC5;GUJARATI VOWEL SIGN CANDRA E;Mn;72;ON;<vowel> 0A8D;;;;N;;;;;
> 0AC7;GUJARATI VOWEL SIGN E;Mn;73;ON;<vowel> 0A8F;;;;N;;;;;
> 0AC8;GUJARATI VOWEL SIGN AI;Mn;74;ON;<vowel> 0A90;;;;N;;;;;
> 0AC9;GUJARATI VOWEL SIGN CANDRA O;Mc;0;L;<vowel> 0A91;;;;N;;;;;
> 0ACB;GUJARATI VOWEL SIGN O;Mc;0;L;<vowel> 0A93;;;;N;;;;;
> 0ACC;GUJARATI VOWEL SIGN AU;Mc;0;L;<vowel> 0A94;;;;N;;;;;
> 0B3E;ORIYA VOWEL SIGN AA;Mc;0;L;<vowel> 0B06;;;;N;;;;;
> 0B3F;ORIYA VOWEL SIGN I;Mn;76;ON;<vowel> 0B07;;;;N;;;;;
> 0B40;ORIYA VOWEL SIGN II;Mc;0;L;<vowel> 0B08;;;;N;;;;;
> 0B41;ORIYA VOWEL SIGN U;Mn;77;ON;<vowel> 0B09;;;;N;;;;;
> 0B42;ORIYA VOWEL SIGN UU;Mn;78;ON;<vowel> 0B0A;;;;N;;;;;
> 0B43;ORIYA VOWEL SIGN VOCALIC R;Mn;79;ON;<vowel> 0B0B;;;;N;;;;;
> 0B47;ORIYA VOWEL SIGN E;Mc;0;L;<vowel> 0B0F;;;;N;;;;;
> 0BBE;TAMIL VOWEL SIGN AA;Mc;0;L;<vowel> 0B86;;;;N;;;;;
> 0BBF;TAMIL VOWEL SIGN I;Mc;0;L;<vowel> 0B87;;;;N;;;;;
> 0BC0;TAMIL VOWEL SIGN II;Mn;80;ON;<vowel> 0B88;;;;N;;;;;
> 0BC1;TAMIL VOWEL SIGN U;Mc;0;L;<vowel> 0B89;;;;N;;;;;
> 0BC2;TAMIL VOWEL SIGN UU;Mc;0;L;<vowel> 0B8A;;;;N;;;;;
> 0BC6;TAMIL VOWEL SIGN E;Mc;0;L;<vowel> 0B8E;;;;N;;;;;
> 0BC7;TAMIL VOWEL SIGN EE;Mc;0;L;<vowel> 0B8F;;;;N;;;;;
> 0BC8;TAMIL VOWEL SIGN AI;Mc;0;L;<vowel> 0B90;;;;N;;;;;
> 0C3E;TELUGU VOWEL SIGN AA;Mn;81;ON;<vowel> 0C06;;;;N;;;;;
> 0C3F;TELUGU VOWEL SIGN I;Mn;82;ON;<vowel> 0C07;;;;N;;;;;
> 0C40;TELUGU VOWEL SIGN II;Mn;83;ON;<vowel> 0C08;;;;N;;;;;
> 0C41;TELUGU VOWEL SIGN U;Mc;0;L;<vowel> 0C09;;;;N;;;;;
> 0C42;TELUGU VOWEL SIGN UU;Mc;0;L;<vowel> 0C0A;;;;N;;;;;
> 0C43;TELUGU VOWEL SIGN VOCALIC R;Mc;0;L;<vowel> 0C0B;;;;N;;;;;
> 0C44;TELUGU VOWEL SIGN VOCALIC RR;Mc;0;L;<vowel> 0C60;;;;N;;;;;
> 0C46;TELUGU VOWEL SIGN E;Mn;84;ON;<vowel> 0C0E;;;;N;;;;;
> 0C47;TELUGU VOWEL SIGN EE;Mn;85;ON;<vowel> 0C0F;;;;N;;;;;
> 0C4A;TELUGU VOWEL SIGN O;Mn;87;ON;<vowel> 0C12;;;;N;;;;;
> 0C4B;TELUGU VOWEL SIGN OO;Mn;88;ON;<vowel> 0C13;;;;N;;;;;
> 0C4C;TELUGU VOWEL SIGN AU;Mn;89;ON;<vowel> 0C14;;;;N;;;;;
> 0CBE;KANNADA VOWEL SIGN AA;Mc;0;L;<vowel> 0C86;;;;N;;;;;
> 0CBF;KANNADA VOWEL SIGN I;Mn;92;ON;<vowel> 0C87;;;;N;;;;;
> 0CC1;KANNADA VOWEL SIGN U;Mc;0;L;<vowel> 0C89;;;;N;;;;;
> 0CC2;KANNADA VOWEL SIGN UU;Mc;0;L;<vowel> 0C8A;;;;N;;;;;
> 0CC3;KANNADA VOWEL SIGN VOCALIC R;Mc;0;L;<vowel> 0C8B;;;;N;;;;;
> 0CC4;KANNADA VOWEL SIGN VOCALIC RR;Mc;0;L;<vowel> 0CE0;;;;N;;;;;
> 0CC6;KANNADA VOWEL SIGN E;Mn;93;ON;<vowel> 0C8E;;;;N;;;;;
> 0CCC;KANNADA VOWEL SIGN AU;Mn;94;ON;<vowel> 0C94;;;;N;;;;;
> 0D3E;MALAYALAM VOWEL SIGN AA;Mc;0;L;<vowel> 0D06;;;;N;;;;;
> 0D3F;MALAYALAM VOWEL SIGN I;Mc;0;L;<vowel> 0D07;;;;N;;;;;
> 0D40;MALAYALAM VOWEL SIGN II;Mc;0;L;<vowel> 0D08;;;;N;;;;;
> 0D41;MALAYALAM VOWEL SIGN U;Mn;95;ON;<vowel> 0D09;;;;N;;;;;
> 0D42;MALAYALAM VOWEL SIGN UU;Mn;96;ON;<vowel> 0D0A;;;;N;;;;;
> 0D43;MALAYALAM VOWEL SIGN VOCALIC R;Mn;97;ON;<vowel> 0D0B;;;;N;;;;;
> 0D46;MALAYALAM VOWEL SIGN E;Mc;0;L;<vowel> 0D0E;;;;N;;;;;
> 0D47;MALAYALAM VOWEL SIGN EE;Mc;0;L;<vowel> 0D0F;;;;N;;;;;
> 0D48;MALAYALAM VOWEL SIGN AI;Mc;0;L;<vowel> 0D10;;;;N;;;;;
>
> --
> John Cowan http://www.ccil.org/~cowan cowan@ccil.org
> You tollerday donsk? N. You tolkatiff scowegian? Nn.
> You spigotty anglease? Nnn. You phonio saxo? Nnnn.
> Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)
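
For anyone who wants to process the specimen mechanically, here is a
minimal sketch of a parser for lines in the layout quoted above. It
assumes the semicolon-separated field order shown there (code point,
name, general category, combining class, bidi category, mapping). The
<vowel> tag is part of the specimen, not an existing UnicodeData tag,
and the names VowelRecord and parse_line are made up for this example.

```python
from typing import List, NamedTuple, Optional


class VowelRecord(NamedTuple):
    code_point: int          # the dependent vowel sign
    name: str
    category: str            # general category, e.g. Mc or Mn
    tag: Optional[str]       # mapping tag without angle brackets, if any
    mapping: List[int]       # code points listed after the tag


def parse_line(line: str) -> VowelRecord:
    """Parse one specimen line, tolerating a leading '> ' quote prefix."""
    fields = line.lstrip("> ").strip().split(";")
    parts = fields[5].split()
    tag = None
    if parts and parts[0].startswith("<") and parts[0].endswith(">"):
        tag = parts[0][1:-1]
        parts = parts[1:]
    return VowelRecord(
        code_point=int(fields[0], 16),
        name=fields[1],
        category=fields[2],
        tag=tag,
        mapping=[int(p, 16) for p in parts],
    )


record = parse_line("> 093E;DEVANAGARI VOWEL SIGN AA;Mc;0;L;<vowel> 0906;;;;N;;;;;")
print(f"{record.code_point:04X} -> {record.tag} {record.mapping[0]:04X}")
```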

--
business: medavis2@us.ibm.com, mark@unicode.org
personal: mark@macchiato.com, http://www.macchiato.com
--


