RE: Unicode transliterations (and other operations)

From: jarkko.hietaniemi@nokia.com
Date: Tue Jul 03 2001 - 14:56:06 EDT


> I know what you mean: Gorbachev is Gorbatschow in German.

Gorbatšov in Finnish transliteration; the "ch" would be very unwieldy
for a Finnish mouth. (The "š" is used solely in transliteration, not
in Finnish proper.)

> I think that the rules that we have in ICU are probably
> English-centric where it makes a difference.
> Note that some of the transliterator functions like
> uppercasing and any-name are just wrappers around Unicode
> functions, and so not language-dependent.
>
> The strength of the API is that you can roll your own rules
> at runtime and at compile-time. If you have different rules
> for Finnish as a target language for transliteration, then
> you can modify the ICU rules or supply a whole different set
> for your own.
> The rules are written somewhat similarly to regular expressions.
>
> See the (draft, somewhat outdated) user guide chapter:
> http://oss.software.ibm.com/icu/userguide/Transliteration.html

One thing you could update in this page is the very first line :-)
where it is claimed that transliteration is between scripts...
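
To make the point about target-language-dependent rules concrete, here is a minimal sketch in plain Python. It does not use ICU or its rule syntax; the RULES tables and the transliterate() helper are hypothetical names made up for illustration, and the tables cover only the letters needed for this one name (the Finnish table uses the usual "tš" for "ч").

```python
# Hypothetical per-target-language rule tables (illustration only,
# NOT ICU's rules). Only the letters of one example name are covered.
COMMON = {"г": "g", "о": "o", "р": "r", "б": "b", "а": "a"}

RULES = {
    "en": {**COMMON, "ч": "ch", "ё": "e", "в": "v"},
    "de": {**COMMON, "ч": "tsch", "ё": "o", "в": "w"},
    "fi": {**COMMON, "ч": "tš", "ё": "o", "в": "v"},
}


def transliterate(text, target):
    """Map each Cyrillic letter through the table for the target language.

    Letters without a rule are passed through unchanged; an uppercase
    source letter capitalizes its Latin replacement.
    """
    table = RULES[target]
    out = []
    for ch in text:
        latin = table.get(ch.lower())
        if latin is None:
            out.append(ch)
        elif ch.isupper():
            out.append(latin.capitalize())
        else:
            out.append(latin)
    return "".join(out)


if __name__ == "__main__":
    for lang in ("en", "de", "fi"):
        print(lang, transliterate("Горбачёв", lang))
```

The same source string yields a different Latin form per target language, which is the reason a single English-centric rule set is not enough.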


