Re: Unicode Transliteration Guidelines released

From: Mark Davis (mark.davis@icu-project.org)
Date: Sun Jan 27 2008 - 14:19:40 CST

    David, Jony,

    These are not made out of whole cloth. The goal of the transliteration
    schemes is to follow established sources, deviating only where
    necessary for reversibility. In both of these cases, the source is
    the UN (specifically the UNGEGN tables referenced below).
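
To illustrate why reversibility can force a deviation from a source, here is a toy sketch in Python. It is not the CLDR data or rule syntax: the two-letter tables and the helper names are made up for this example, and it assumes every output is a single character. If a romanization writes two different letters with the same Latin letter, the table cannot be inverted. Giving one of them a distinct form makes the round trip possible.

```python
# Toy tables for illustration only; these are NOT the CLDR or UNGEGN tables.
# Keys are the Hebrew letters tet (U+05D8) and tav (U+05EA).
LOSSY = {"\u05D8": "t", "\u05EA": "t"}            # both letters collapse to "t"
REVERSIBLE = {"\u05D8": "\u1E6D", "\u05EA": "t"}  # tet gets t-with-dot-below (assumed choice)


def is_reversible(table):
    """A one-to-one table can be inverted; a many-to-one table cannot."""
    return len(set(table.values())) == len(table)


def invert(table):
    """Build the reverse table, refusing tables that lose information."""
    if not is_reversible(table):
        raise ValueError("table is not one-to-one")
    return {target: source for source, target in table.items()}


def transliterate(text, table):
    """Map each character through the table, leaving unknown characters alone."""
    return "".join(table.get(ch, ch) for ch in text)


word = "\u05D8\u05EA"
latin = transliterate(word, REVERSIBLE)
round_trip = transliterate(latin, invert(REVERSIBLE))
```

With these toy tables, `is_reversible(LOSSY)` is false and `round_trip` equals `word`. Real rule sets also have to handle multi-character outputs and context, which this sketch ignores.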

    The sources are generally described in the comments in the source
    file. So for Arabic, you'd look in:

    http://www.unicode.org/cldr/data/common/transforms/Arabic-Latin.xml

    and find a reference to the UNGEGN tables: http://www.eki.ee/wgrs/rom1_ar.pdf

    Similarly for Hebrew, which also follows UNGEGN:
    http://www.eki.ee/wgrs/rom1_he.pdf

    Now of course, there may be problems in the data. If you find any, you
    can file a bug requesting a change, as described in the document. Or
    if you would like to see some alternate methods added, you are free to
    propose them (as described earlier in this thread).

    Mark

    On Jan 24, 2008 12:10 PM, David Weinberg <davidweinb@googlemail.com> wrote:
    > It seems to me that nobody is responsible for the Transliteration Guidelines.
    >
    > I have neither seen an author "defending" his stance against Richard Ishida,
    > nor anyone answering my questions:
    >
    > Does the Latin Transcription of Arabic ( http://www.unicode.org/cldr/data/charts/transforms/Latin-Arabic.html ) follow any established scheme or who invented it?
    >
    > Why did you choose to diverge radically from ISO 233:1984, "Transliteration of Arabic characters into Latin characters" (1984-12-15; http://en.wikipedia.org/wiki/ISO_233 )?
    >
    > David
    >
    >

    ===============

    On Jan 27, 2008 10:38 AM, Jony Rosenne <jr@qsm.co.il> wrote:
    >
    > I don't understand the Hebrew tables and cannot see any practical use for them. What does "Hebrew - Latin" mean? How does one pronounce a Latin w? I can understand Hebrew – English, Hebrew – French, Hebrew – German, but the proposal is an absurd mixture of all of them.
    >
    > With the advent of Unicode, there is no reason to want a reversible transliteration to another script. If that is what one needs, one could just use the Unicode code points in whatever representation suits him best. What is needed is a pronounceable transliteration, and this is language based rather than script based.
    >
    > Jony
    >
    > From: cldr-users-bounce@unicode.org [mailto:cldr-users-bounce@unicode.org] On Behalf Of Mark Davis
    > Sent: Sunday, January 27, 2008 7:47 PM
    > To: cfynn@gmx.net
    > Cc: David Germano; cldr-users@unicode.org
    > Subject: Re: Unicode Transliteration Guidelines released
    >
    > The format for rules is specified in http://www.unicode.org/reports/tr35/#Transform_Rules
    >
    > The XML is just a series of rules and comments. You can see what is in CLDR in:
    >
    > http://www.unicode.org/cldr/data/common/transforms/
    >
    > For example, for Hebrew:
    >
    > http://www.unicode.org/cldr/data/common/transforms/Hebrew-Latin.xml
    >
    > Hope this helps,
    >
    > Mark
    >
    >
    > On Jan 27, 2008 3:34 AM, Christopher Fynn <cfynn@gmx.net> wrote:
    >
    > Hi Mark
    >
    > Where can I find the correct XML format for submitting the data? (Right now I'm
    > only interested in what applies to transliteration.) And what is the URL for the
    > online demo which can be used for testing?
    >
    > Neither of these things is clear to me from looking at
    > <http://www.unicode.org/cldr/transliteration_guidelines.html>
    > or tr35.
    >
    > - Chris
    >
    > Mark Davis wrote:
    > > That would be useful. For submission to CLDR, we'd need to get the data
    > > in the correct XML format. Best is if the results are tested using the
    > > online demo first, since if the data doesn't validate it would not be
    > > incorporated. We can take multiple transliterations for the same
    > > script/languages, so even if one is only used in certain countries or
    > > contexts, it would be useful to have.
    > >
    > > Mark
    > >
    >
    > --
    > Mark

    -- 
    Mark


    This archive was generated by hypermail 2.1.5 : Sun Jan 27 2008 - 14:22:41 CST