RE: Eastern Arabic-Indic Digits & Marathi Allographs

From: Kenneth Whistler (kenw@sybase.com)
Date: Mon Oct 02 2006 - 17:41:18 CST


Jarkko Ahonen asked (last week):

> Is Unicode going to have separate Unicode values for the Farsi (Persian)
> and Urdu digits as they now have same values but with glyph variation
> (digits 4, 6 and 7)?

The answer to this has been documented in the standard for some
time. See:

http://www.unicode.org/versions/Unicode4.0.0/ch08.pdf

and look at Table 8-2, Glyph Variation in Eastern Arabic-Indic Digits.

The variation in form for the digits 4, 6, and 7 between Persian,
Sindhi, and Urdu is considered *glyph* variation within the
range of Eastern Arabic-Indic digits. It is comparable, for
example, to the range of glyph shapes found for the ASCII digits
in different parts of the world.
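
As a small illustration of that point (a sketch only, not anything
taken from the standard), the Python below assumes the character data
shipped in the standard library's unicodedata module. It shows that
4, 6, and 7 in the U+06F0..U+06F9 range each have exactly one code
point, whatever shape a Persian, Sindhi, or Urdu font draws for it.
The helper name describe() is made up for this example.

```python
import unicodedata

# Digits 4, 6, and 7 from the U+06F0..U+06F9 range. Persian, Sindhi,
# and Urdu text all use these same code points; only the font differs.
VARIANT_DIGITS = "\u06F4\u06F6\u06F7"


def describe(text):
    """Return (code point, character name, decimal value) per character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.decimal(ch))
        for ch in text
    ]


for row in describe(VARIANT_DIGITS):
    # The formal character names say EXTENDED ARABIC-INDIC DIGIT ...
    print(row)

# int() accepts any Unicode decimal digit, so the string parses as 467
# regardless of which language's glyph shapes would be used to show it.
print(int(VARIANT_DIGITS))
```

Choosing the Persian, Sindhi, or Urdu shape is then a matter for the
font or for language tagging in the rendering layer, not for the
character codes.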

In fact, the main reason for distinguishing the range of Arabic-Indic
digits U+0660..U+0669 from the range of Eastern Arabic-Indic
digits U+06F0..U+06F9 in the standard at all is not the variation
in glyph forms for 4, 5, 6, and 7, but rather the distinction
in bidirectional character properties: bc=AN (Arabic Number) versus
bc=EN (European Number), relevant to several rules in the
Bidirectional Algorithm.
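
The property difference can be checked directly. The following is a
minimal Python sketch, again assuming the data in the standard
library's unicodedata module; the helper names bidi_classes() and
digit_values() are hypothetical. It reads the Bidi_Class and the
numeric value of each digit in the two ranges.

```python
import unicodedata

# The two digit ranges discussed above.
ARABIC_INDIC = [chr(cp) for cp in range(0x0660, 0x066A)]
EASTERN_ARABIC_INDIC = [chr(cp) for cp in range(0x06F0, 0x06FA)]


def bidi_classes(chars):
    """Return the set of Bidi_Class values found among the characters."""
    return {unicodedata.bidirectional(ch) for ch in chars}


def digit_values(chars):
    """Return the numeric digit value of each character, in order."""
    return [unicodedata.digit(ch) for ch in chars]


# Both ranges carry the same numeric values 0 through 9 ...
print(digit_values(ARABIC_INDIC) == digit_values(EASTERN_ARABIC_INDIC))

# ... but they differ in Bidi_Class: AN for U+0660..U+0669,
# EN for U+06F0..U+06F9.
print(bidi_classes(ARABIC_INDIC))
print(bidi_classes(EASTERN_ARABIC_INDIC))
```

So the two ranges are interchangeable as numbers, but a bidi
implementation will treat them differently in the rules that handle
AN and EN.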

> How about the Marathi allographs of LA (U+0932) and SHA (U+0936)?

They are allographs, as documented -- hence treated as glyph
variants of those code points. There is no intention of creating
separate encoded characters for them.

--Ken

