Re: transliteration in java

From: Mark Davis (mark.davis@jtcsv.com)
Date: Sat Oct 25 2003 - 15:22:36 CST


Check out ICU4J (http://oss.software.ibm.com/icu4j/). There is a demo of transliteration at http://oss.software.ibm.com/cgi-bin/icu/tr. For Cyrillic, we currently only do an ISO-based transliteration, but you can write your own custom rules.

(The demo stores custom rules that people have devised. I see that there are a couple of Cyrillic ones, as well as several that we don't have in the stock ICU, such as American/Canadian Indian transliterators.)

Mark
__________________________________
http://www.macchiato.com
► शिष्यादिच्छेत्पराजयम् ◄
 
  ----- Original Message -----
  From: Dennis N. Stetsenko
  To: unicode@unicode.org
  Sent: Sat, 2003 Oct 25 11:25
  Subject: transliteration in java

  Hello

  My apologies if this kind of question is too silly, but I browsed quickly through the resources and FAQ and did not find anything useful…

  I have a bunch of files in a Cyrillic charset, and I need to transfer them to a device that cannot display that charset (it doesn’t have an appropriate font).

  So I’ve decided to provide a transliteration mechanism, i.e. to convert characters from Cyrillic to Latin. The language I’m going to use is Java.

  Can you guys point me to a useful resource for this, or give me a recommendation?

  =================

  I’ve done some preliminary prototyping, and the results are weird.

  1 I provide a mapping from a char (let’s say Cyrillic) to its Latin equivalent, in the sense of transliteration

  2 I take the flat file and process it (convert from Cyrillic to Latin)
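
Steps 1 and 2 above could be sketched roughly as follows. This is only an illustration under stated assumptions: the class name `SimpleCyrillicToLatin`, the partial table, and the Latin spellings chosen (for example "zh") are hypothetical and do not follow any particular transliteration standard. The Cyrillic letters are written as Unicode escapes so that the result does not depend on the encoding of the source file.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of the JDK or ICU4J.
class SimpleCyrillicToLatin {
    // Lowercase Cyrillic letters (partial alphabet, short i omitted),
    // written as Unicode escapes so the source-file encoding does not matter.
    private static final String CYRILLIC =
        "\u0430\u0431\u0432\u0433\u0434\u0435\u0436\u0437\u0438"
      + "\u043a\u043b\u043c\u043d\u043e\u043f\u0440\u0441\u0442\u0443\u0444";

    // Assumed Latin spellings, in the same order as CYRILLIC.
    private static final String[] LATIN = {
        "a", "b", "v", "g", "d", "e", "zh", "z", "i",
        "k", "l", "m", "n", "o", "p", "r", "s", "t", "u", "f"
    };

    private static final Map<Character, String> TABLE = new HashMap<>();
    static {
        for (int i = 0; i < LATIN.length; i++) {
            TABLE.put(CYRILLIC.charAt(i), LATIN[i]);
        }
    }

    // Replaces every mapped character; anything else is copied unchanged.
    static String transliterate(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            String latin = TABLE.get(Character.toLowerCase(c));
            if (latin == null) {
                out.append(c); // not in the table
            } else if (Character.isUpperCase(c)) {
                // Capitalize only the first Latin letter of the replacement.
                out.append(Character.toUpperCase(latin.charAt(0)))
                   .append(latin.substring(1));
            } else {
                out.append(latin);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The escapes spell the Russian name of Moscow.
        System.out.println(transliterate("\u041c\u043e\u0441\u043a\u0432\u0430")); // Moskva
    }
}
```

A table like this only behaves as expected if the input string really contains Cyrillic code points, which depends on how the file was decoded.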

  Sometimes it works, sometimes it doesn’t…

  Apparently, when I run simple things from my IDE it works fine, but when I try to do the same in standalone mode, it skips processing.

  I was hunting down the problem and this is the difference I see:

  When I make a call like Character.UnicodeBlock.of(toProcess) for the next char to transliterate, it shows

  From IDE - CYRILLIC

  Standalone - LATIN_1_SUPPLEMENT

  So I guess the way the flat file is read makes a big difference… I’m inclined to blame some difference in system property settings for the different results of such calls…
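
One plausible explanation for the CYRILLIC versus LATIN_1_SUPPLEMENT difference is that the file is decoded with the platform default charset, which can differ between the IDE and a standalone launch. The sketch below shows the same bytes decoded two ways. It assumes the file is in windows-1251, which the message does not state; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration only.
class ExplicitCharsetRead {
    // Decodes raw bytes with a named charset instead of the platform default.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    // Reads a whole file and decodes it with the given charset.
    static String readFile(Path path, Charset charset) throws IOException {
        return decode(Files.readAllBytes(path), charset);
    }

    public static void main(String[] args) {
        // Two bytes that are Cyrillic letters in windows-1251 (assumed encoding).
        byte[] raw = {(byte) 0xCC, (byte) 0xEE};

        String asCyrillic = decode(raw, Charset.forName("windows-1251"));
        String asLatin1 = decode(raw, StandardCharsets.ISO_8859_1);

        // Same bytes, different code points, hence different Unicode blocks.
        System.out.println(Character.UnicodeBlock.of(asCyrillic.charAt(0))); // CYRILLIC
        System.out.println(Character.UnicodeBlock.of(asLatin1.charAt(0)));   // LATIN_1_SUPPLEMENT
    }
}
```

If this is the cause, passing the charset explicitly wherever the file is opened should make the IDE and standalone runs agree, whatever the `file.encoding` system property happens to be.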

  Can you give me some pointers on making it work the way it should?

  Thanks, Dennis



This archive was generated by hypermail 2.1.5 : Thu Jan 18 2007 - 15:54:24 CST