Re: No proper representation of Devnagari in Unicode

From: James E. Agenbroad (jage@loc.gov)
Date: Thu Nov 01 2001 - 15:50:15 EST


                                           Thursday, November 1, 2001
The following quotation from "Computer graphics in India: an architecture
for shaping Indic texts", by S.P.Mudur, Niranjan Nayak, Shrinath Shanbhag,
R.K.Soshj [or Joshi?] (Computers & Graphics 23 (1999)), may be helpful:
     3. Requirements for Indian language enabling in software
        3.1 Character encoding
     Indian text input differs from that in English. The most
significant difference of these is that in English, each keystroke maps
directly to a letter. Each letter has a unique code. A *Syllable* - the
Indian language equivalent unit of writing letter, however is composed of
one or more characters entered through the keyboard or any other input
mechanism. There are far too many syllables to be encoded separately.
     The syllable is composed of vowels, consonants, modifiers and other
special graphics signs. These are encoded, just as roman alphabets
are. The user types in a sequence of vowels, consonants, modifiers and the
graphic signs. The machine then composes syllables at run time based on
language dependent rules. Every syllable is thus represented in the
machine as a unique sequence of vowels, consonants and modifiers. In a
text sequence, these characters are stored in logical (phonetic) order.
     End of quotation. This is the phonetic encoding approach taken by
ISCII (Indian Standard 13194:1991), ISO/IEC 10646 and Unicode.
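
     As a rough illustration of the logical-order storage described in
the quotation (my own sketch, not taken from the paper), the Python
snippet below lists the code points of one Devanagari syllable. The
syllable chosen (kshi) and the helper name describe() are assumptions made
for the example; only the standard unicodedata module is used.

```python
import unicodedata

# One Devanagari syllable ("kshi"), spelled with escapes so that the
# stored code points are explicit: KA, VIRAMA, SSA, VOWEL SIGN I.
SYLLABLE = "\u0915\u094D\u0937\u093F"


def describe(text):
    """Return (code point, Unicode name) pairs in stored (logical) order.

    Hypothetical helper for this example only.
    """
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


# Print the sequence exactly as it is held in memory.
for code_point, name in describe(SYLLABLE):
    print(code_point, name)
```

     The vowel sign I comes last in the stored sequence because it is
pronounced last, even though a renderer draws it to the left of the
consonant cluster. The single displayed syllable is thus held as four
separately encoded characters rather than as one precomposed code.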
     Regards,
          Jim Agenbroad ( jage@LOC.gov )
     The above are purely personal opinions, not necessarily the official
views of any government or any agency of any.
Phone: 202 707-9612; Fax: 202 707-0955; US mail: I.T.S. Dev.Gp.4, Library
of Congress, 101 Independence Ave. SE, Washington, D.C. 20540-9334 U.S.A.


