From: Mark Davis (mark.davis@jtcsv.com)
Date: Thu Apr 28 2005 - 17:42:18 CST
I had sent the following a few days ago, but was having some email problems
so it didn't make it through.
----

Yes, there are many different transliteration schemes. ICU follows ISO 15919
mostly (we had to fill in a few holes where the standard transcribed instead
of transliterated). If you want to see an example, go to

http://ibm.com/software/globalization/icu/demo/transform

In the Input box, paste in:

यूनिकोड क्या है? यूनिकोड प्रत्येक अक्षर के लिए एक विशेष नम्बर प्रदान करता है, चाहे कोई भी प्लैटफॉर्म हो, चाहे कोई भी प्रोग्राम हो, चाहे कोई भी भाषा हो।

Set Source 1 to Any, and Target 1 to Latin, and hit the Transform button.
You'll get in Output 2 the following:

yūnikōḍa kyā hai? yūnikōḍa pratyēka akṣara kē li'ē ēka viśēṣa nambara pradāna karatā hai, cāhē kō'ī bhī plaiṭaphŏrma hō, cāhē kō'ī bhī prōgrāma hō, cāhē kō'ī bhī bhāṣā hō.

If you set Source 2 to Any, and Target 2 to Latin, and hit Transform, then
you'll get the text transformed back. (Or you can pick other targets.) How
well this all renders is up to your browser and available fonts.

Mark

----- Original Message -----
From: "John Hudson" <tiro@tiro.com>
To: "Chetan Pandey" <chetanpandey@yahoo.com>
Cc: <unicode@unicode.org>
Sent: Monday, April 25, 2005 22:06
Subject: Re: Transliterator

> Chetan Pandey wrote:
>
> > [a + BAR ABOVE] for "aa" as in balm,
> > [i + BAR ABOVE] for "ii" as in meat,
> > [u + BAR ABOVE] for "uu" as in boot,
> > [a + BAR ABOVE] for "aa" as in balm,
> > [m + DOT ABOVE } for M as in saMgiita
>
> > If someone can pls tell me what this Scheme is called and where it is
> > represented in Unicode, I will be very grateful.
>
> There are two Latin transliteration systems for Hindi that use these characters, ISO 15919
> (2001) and the United Nations standard (1977). These systems are very similar, but there
> are differences in the transliteration of a few vowels and a couple of consonants. For
> more information see this PDF:
>
> http://transliteration.eki.ee/pdf/Hindi-Marathi-Nepali.pdf
>
>
> Not all of the diacritics used in these transliteration systems are encoded in Unicode as
> combined letter + mark combinations. For some of them you will need to use sequences of
> base letters and combining marks.
>
> John Hudson
>
> --
>
> Tiro Typeworks www.tiro.com
> Vancouver, BC tiro@tiro.com
>
> Currently reading:
> A century of philosophy, by Hans Georg Gadamer
> Q, by 'Luther Blissett'

Mark

----- Original Message -----
From: "Markus Scherer" <markus.icu@gmail.com>
To: <unicode@unicode.org>
Sent: Thursday, April 28, 2005 15:24
Subject: Re: Transliterator

> On 4/25/05, Chetan Pandey <chetanpandey@yahoo.com> wrote:
> > I am trying to build a Java program that will convert Devanagari Input into
> > the English Transliteration System...
>
> You might be able to use ICU, which has built-in transliteration
> between all Indic scripts and Latin. If you need different rules, you
> can supply your own rule set to ICU's Transliterator API.
>
> Try the Transform demo on
> http://www-306.ibm.com/software/globalization/icu/chartsdemostools.jsp
>
> with Source 1 = Devanagari and Target 1 = Latin.
>
> Best regards,
> markus
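
John Hudson's point in the quoted message above, that some transliteration
letters exist in Unicode as single precomposed characters while others must be
written as a base letter plus a combining mark, can be checked with a small
sketch. This is only an illustration, not part of ICU or of either standard:
it uses Python's standard unicodedata module, and the helper name
has_precomposed and the choice of sample letters are my own assumptions. The
r + ring below entry is included because, to my understanding, ISO 15919 uses
it for vocalic r.

```python
import unicodedata

# Sample transliteration letters written as base letter + combining mark.
# The selection is illustrative only, not a full ISO 15919 inventory.
SEQUENCES = {
    "a + macron": "a\u0304",
    "i + macron": "i\u0304",
    "u + macron": "u\u0304",
    "m + dot above": "m\u0307",
    "r + ring below": "r\u0325",
}


def has_precomposed(seq: str) -> bool:
    """Return True if NFC normalization folds the sequence into one code point."""
    return len(unicodedata.normalize("NFC", seq)) == 1


for label, seq in SEQUENCES.items():
    composed = unicodedata.normalize("NFC", seq)
    code_points = " ".join(f"U+{ord(ch):04X}" for ch in composed)
    status = "precomposed" if has_precomposed(seq) else "needs combining mark"
    print(f"{label}: {composed} ({code_points}) {status}")
```

Sequences that stay two code points long after NFC normalization are the ones
that have to be stored and rendered as base letter + combining mark, which is
where font and browser support matters most.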