From: Mark Davis (mark.davis@jtcsv.com)
Date: Sat Jul 03 2004 - 08:23:35 CDT
You might take a look at what we have in ICU for doing transliteration. It
is rule-based, where each of the rules can take the context of surrounding
letters into account.
For information, see
http://oss.software.ibm.com/icu/userguide/Transform.html
http://oss.software.ibm.com/icu/userguide/TransformRule.html
You can try out the rules with an interactive demo at
http://oss.software.ibm.com/cgi-bin/icu/tr
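To make "rule-based with context" concrete, here is a minimal Python sketch of the general idea. It is not ICU's rule syntax or API: the `Rule` class, the `transliterate` function, and the tiny Latin-to-Greek rule table are all hypothetical and only illustrate how a rule can depend on the neighbouring letter (here, sigma takes its final form at the end of a word).

```python
from dataclasses import dataclass
from typing import List, Optional

BOUNDARY = " "  # assumption: the start and end of the text behave like a space


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: rewrite `source` as `target` in a given context."""
    source: str
    target: str
    before: Optional[str] = None  # characters allowed just before; None = any
    after: Optional[str] = None   # characters allowed just after; None = any


def rule_matches(rule: Rule, text: str, i: int) -> bool:
    """Check whether `rule` applies at position `i`, including its context."""
    if not text.startswith(rule.source, i):
        return False
    end = i + len(rule.source)
    prev_ch = text[i - 1] if i > 0 else BOUNDARY
    next_ch = text[end] if end < len(text) else BOUNDARY
    if rule.before is not None and prev_ch not in rule.before:
        return False
    if rule.after is not None and next_ch not in rule.after:
        return False
    return True


def transliterate(text: str, rules: List[Rule]) -> str:
    """Scan left to right; the first matching rule wins, else copy the char."""
    out = []
    i = 0
    while i < len(text):
        for rule in rules:
            if rule_matches(rule, text, i):
                out.append(rule.target)
                i += len(rule.source)
                break
        else:
            out.append(text[i])  # no rule applies: pass the character through
            i += 1
    return "".join(out)


# Toy rule table (illustrative only). Order matters: the digraph and the
# context-restricted rule come before the general single-letter rules.
TOY_RULES = [
    Rule("th", "θ"),
    Rule("s", "ς", after=" .,;!?"),  # word-final sigma
    Rule("s", "σ"),
    Rule("a", "α"),
    Rule("t", "τ"),
    Rule("i", "ι"),
]

print(transliterate("stasis is", TOY_RULES))
```

The same shape of rule (source, target, optional before/after context) is what would let a language-specific table handle spellings that do not map one-to-one between two orthographies; a real table for a pair such as Boko and Ajami would need to come from people who know that language.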
Μаrk
----- Original Message -----
From: "Donald Z. Osborn" <dzo@bisharat.net>
To: <unicode@unicode.org>
Cc: <a12n-collaboration@bisharat.net>
Sent: Friday, July 02, 2004 21:52
Subject: Hausa: Boko<->Ajami? (RE: Looking for transcription or
transliteration standards latin->arabic)
> I've read selected messages in this thread (on Unicode list) and some
> messages bring to mind the thought of developing routines or standards
> to permit toggling back and forth between standard Latin and Arabic
> transcriptions for the same language, such as between the Boko and
> Ajami writing of Hausa. (Same applies to any two or three transcription
> systems used for particular languages.)
>
> One of the benefits of ICT is, theoretically anyway, that one can have
> text both (all) ways. Which would mean that the user has options,
> people using alternative systems are not excluded, and the society does
> not have to debate a decision of which writing system to use, etc.
>
> Because there is generally not a 1-to-1 character correspondence in
> spellings in different transcriptions, I wonder if you don't end up
> having to consider something that operates a bit like machine
> translation, analyzing the context of words in cases where
> transcription of a word in one system could be transliterated into
> something misspelled or taken as more than one word in the other
> system. Necessarily, I think, such routines would have to be
> language-specific.
>
> Any feedback would be appreciated. TIA...
>
> Don Osborn
> Bisharat.net
This archive was generated by hypermail 2.1.5 : Sat Jul 03 2004 - 08:25:02 CDT