For those interested in transliteration, the newest version of ICU has some
facilities you may want to take a look at. You can see an online example of
this with Locale Explorer (for links, see below). For example, if you look
at the locale data for Greece, you'll see the day and month names in the
Greek script. If you scroll to the bottom and turn transliteration on, then
you will see the data transliterated into Latin.
The transliterators are generally rule-based, and thus can be customized
for different environments. The current rule sets are initial versions and
will be refined over time. Just for testing, we have a simple rule set for
ideographic characters that uses the Unihan.txt English definitions; for
example, with this rule set the word for "Japanese" (日本語) comes out as
"[sun][root][language]".
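As a rough illustration of what "rule-based" and "customizable" mean here, the following is a minimal Python sketch of a longest-match rule table. It is not ICU's API and does not use ICU's actual rule sets: the function name make_transliterator, the tiny Greek table, and the hand-written glosses are all hypothetical, chosen only to reproduce the kind of output described above. In particular, the glosses are hard-coded rather than read from Unihan.txt.

```python
# Hypothetical sketch only: not ICU's API and not ICU's real rule sets.

def make_transliterator(rules):
    """Return a function that rewrites text using longest-match-first rules.

    rules maps source substrings to replacement strings. Characters that
    match no rule are copied through unchanged.
    """
    # Try longer source strings first so a digraph rule beats single letters.
    keys = sorted(rules, key=len, reverse=True)

    def transliterate(text):
        out = []
        i = 0
        while i < len(text):
            for key in keys:
                if text.startswith(key, i):
                    out.append(rules[key])
                    i += len(key)
                    break
            else:
                # No rule applies at this position: keep the character as is.
                out.append(text[i])
                i += 1
        return "".join(out)

    return transliterate


# Toy Greek-to-Latin table (assumed, far from complete).
GREEK_RULES = {
    "ΟΥ": "OU",  # digraph rule, must win over "Ο" followed by "Υ"
    "Ι": "I",
    "Ο": "O",
    "Υ": "Y",
    "Λ": "L",
    "Σ": "S",
}
to_latin = make_transliterator(GREEK_RULES)

# Customizing for a different environment: override a single rule.
custom = make_transliterator({**GREEK_RULES, "ΟΥ": "U"})

# Toy gloss table for ideographs, written by hand for this example.
GLOSS_RULES = {"日": "[sun]", "本": "[root]", "語": "[language]"}
gloss = make_transliterator(GLOSS_RULES)

print(to_latin("ΙΟΥΛΙΟΣ"))  # IOULIOS
print(custom("ΙΟΥΛΙΟΣ"))    # IULIOS
print(gloss("日本語"))       # [sun][root][language]
```

Because the rules are plain data, swapping or overriding entries changes the output without touching the matching code, which is the property that makes this style of transliterator easy to adapt.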
Links
Locale Explorer:
http://www-4.ibm.com/software/developer/features/localeexplorer.html
Greek Data:
http://oss.software.ibm.com/developerworks/opensource/icu/localeexplorer/en_US/?_=el_GR&
Notes:
http://oss.software.ibm.com/developerworks/opensource/icu/project/localeexplorer/transliteration.html
Mark
___
Mark Davis, IBM Center for Java Technology, Cupertino
(408) 777-5850 [fax: 5891], mark.davis@us.ibm.com, president@unicode.org
http://maps.yahoo.com/py/maps.py?Pyt=Tmap&addr=10275+N.+De+Anza&csz=95014