Re: Unicode transliterations (and other operations)

From: Markus Scherer (markus.scherer@jtcsv.com)
Date: Tue Jul 03 2001 - 13:00:25 EDT

Next message: Mark Davis: "Re: Unicode transliterations (and other operations)"
Previous message: Rick McGowan: "Re: status of Jindai scripts?"
In reply to: jarkko.hietaniemi@nokia.com: "RE: Unicode transliterations (and other operations)"
Next in thread: Mark Davis: "Re: Unicode transliterations (and other operations)"
Reply: Mark Davis: "Re: Unicode transliterations (and other operations)"
Reply: Vladimir Weinstein: "Re: Unicode transliterations (and other operations)"

> Looks interesting. How are you approaching the complication that transliteration is between pairs of languages?

I know what you mean: Gorbachev is Gorbatschow in German.

I think that the rules that we have in ICU are probably English-centric where it makes a difference.
Note that some of the transliterator functions, like uppercasing and any-name, are just wrappers around Unicode functions, and so are not language-dependent.
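To illustrate what "just wrappers around Unicode functions" can look like, here is a minimal sketch in Python rather than ICU. The helper names `any_name` and `to_upper` are hypothetical, and the `\N{...}` output form is an assumption about how an any-name transform might render its result; the only library calls used are `unicodedata.name` and `str.upper` from the Python standard library.

```python
import unicodedata


def any_name(text):
    """Replace each character that has a Unicode name with \\N{NAME}.

    Characters without a name (for example control characters) are
    passed through unchanged.
    """
    parts = []
    for ch in text:
        name = unicodedata.name(ch, None)  # None when the character has no name
        if name is None:
            parts.append(ch)
        else:
            parts.append("\\N{" + name + "}")
    return "".join(parts)


def to_upper(text):
    """Thin wrapper over the built-in, locale-independent uppercasing."""
    return text.upper()
```

Neither function consults a source or target language; both only look up character properties, which is why no per-language rule set is involved.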

The strength of the API is that you can roll your own rules at runtime and at compile-time. If you have different rules for Finnish as a target language for transliteration, then you can modify the ICU rules or supply an entirely different set of your own.
The rules are written somewhat similarly to regular expressions.
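As a rough sketch of the idea of swappable rule sets per target language, here is a toy example in plain Python. It does not use ICU or its rule syntax: the `transliterate` function and the tiny rule tables are hypothetical, cover only the letters needed for one name, and are not taken from any ICU data or transliteration standard.

```python
# Toy rule tables (illustrative assumptions, not ICU rules).
# Letters that map the same way for both target languages:
BASE_RULES = {"Г": "G", "о": "o", "р": "r", "б": "b", "а": "a"}

# Letters whose conventional rendering depends on the target language:
ENGLISH_RULES = {**BASE_RULES, "ч": "ch", "ё": "e", "в": "v"}
GERMAN_RULES = {**BASE_RULES, "ч": "tsch", "ё": "o", "в": "w"}


def transliterate(text, rules):
    """Apply the rule table left to right, trying longer source keys first.

    Characters with no matching rule are copied through unchanged.
    """
    keys = sorted(rules, key=len, reverse=True)
    out = []
    i = 0
    while i < len(text):
        for key in keys:
            if text.startswith(key, i):
                out.append(rules[key])
                i += len(key)
                break
        else:
            # No rule matched at this position.
            out.append(text[i])
            i += 1
    return "".join(out)


print(transliterate("Горбачёв", ENGLISH_RULES))
print(transliterate("Горбачёв", GERMAN_RULES))
```

With these tables the same Cyrillic input comes out as "Gorbachev" for English and "Gorbatschow" for German; supporting another target language such as Finnish would mean supplying a third table rather than changing the function.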

See the (draft, somewhat outdated) user guide chapter: http://oss.software.ibm.com/icu/userguide/Transliteration.html
and the API references: http://oss.software.ibm.com/icu/apiref/class_Transliterator.html and http://oss.software.ibm.com/icu/apiref/utrans_h.html

markus


This archive was generated by hypermail 2.1.2 : Fri Jul 06 2001 - 13:48:07 EDT