Re: Name Mixup Behind Air France Groundings

From: Tom Emerson (tree@basistech.com)
Date: Sat Jan 03 2004 - 19:23:46 EST


    Frank Yung-Fong Tang writes:
    > The agent probably just heard the name over a tapped phone.
    > It probably does not matter who FBI store the name after
    > that. It could be an Arabic to French transliteration read by
    > someone familiar with Arabic to English transliteration system.

    Or it could be a name read by an Egyptian or a Libyan or a Saudi ---
    all of whom will pronounce it differently: Gaddhafi vs. Geddafi vs. Qathafi
    vs. Kazzafi... dialectal differences between speakers can make name
    matching difficult. Even advanced name matchers like
    First Logic's cannot handle all of the Arabic transliterations that
    are available. It requires more advanced technology than simple fuzzy
    string comparisons or soundex-like algorithms.
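
    To make the Soundex point concrete, here is a minimal sketch of
    classic American Soundex in Python (the function name and the
    table layout are my own, not taken from any particular product).
    Run over the four spellings above, it yields three different
    codes, so an exact match on the code would miss most of the pairs.

```python
# Classic American Soundex: keep the first letter, encode the remaining
# consonants as digits, skip vowels, and pad or truncate to four characters.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(name):
    letters = [c for c in name.upper() if c.isascii() and c.isalpha()]
    if not letters:
        return ""
    out = [letters[0]]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate two letters with the same code
        code = CODES.get(ch, "")  # vowels (and Y) have no code
        if code and code != prev:
            out.append(code)
        prev = code  # a vowel resets prev, so a repeated code is kept
    return ("".join(out) + "000")[:4]


for spelling in ("Gaddhafi", "Geddafi", "Qathafi", "Kazzafi"):
    print(spelling, soundex(spelling))
```

    Only Gaddhafi and Geddafi collide; Qathafi and Kazzafi each get
    their own code because Soundex keeps the first letter verbatim
    and files "th" and "zz" under different digits.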

    You run into these problems outside of names. If you are into Arabic
    music and try to find it on the web, you're in for some hard
    times... the different transliterations are manifold! For
    example, some of the songs from Nawal Al Zoghbi's recent album
    demonstrate:

    Elli Tmannayto === Elli Tmanetoh
    7abeeb Dialli === Habib Dialy
    Ya 7abeebi Ana === Ya Habibi Ana
    Trekni Rou7 === Trikni Rouh
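
    As a rough illustration of how far naive normalization gets with
    these titles, here is a sketch that maps the chat-alphabet digit
    "7" to "h", collapses doubled letters, and drops vowels after the
    first letter of each word. The names skeleton and title_key are
    made up for this example, and the rules are an assumption of
    mine, not an established scheme.

```python
import re

DIGIT_LETTERS = {"7": "h"}   # only the digit that appears in the titles above
VOWELS = set("aeiouy")       # treating "y" as a vowel is a simplification


def skeleton(word):
    """Crude consonant skeleton of one word."""
    word = "".join(DIGIT_LETTERS.get(c, c) for c in word.lower())
    word = re.sub(r"(.)\1+", r"\1", word)  # collapse doubled letters
    if not word:
        return ""
    # Keep the first letter even if it is a vowel, drop later vowels.
    return word[0] + "".join(c for c in word[1:] if c not in VOWELS)


def title_key(title):
    return " ".join(skeleton(w) for w in title.split())


pairs = [
    ("Elli Tmannayto", "Elli Tmanetoh"),
    ("7abeeb Dialli", "Habib Dialy"),
    ("Ya 7abeebi Ana", "Ya Habibi Ana"),
    ("Trekni Rou7", "Trikni Rouh"),
]
for a, b in pairs:
    print(title_key(a) == title_key(b), title_key(a), "|", title_key(b))
```

    Three of the four pairs end up with the same key; the first does
    not, because one spelling carries a final "h" that the other
    lacks. Loosening the rules enough to catch that one starts to
    conflate titles that really are different.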

    The US Government knows about these issues and is quite willing to
    take advantage of them: in a recent post on another mailing list I'm on,
    an immigration lawyer requested help from Arabists. One of his
    clients was going to be deported because their last passport used "el"
    while their birth certificate (which contained their name in Arabic
    script and Latin script) used "al". The contention was that "el" and
    "al" were different words and therefore the documents did not
    match. Yet the USG knows that "el" and "al" are merely orthographic
    variants of the Arabic definite article alif-lam and wants them
    treated the same. <sigh>
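
    Treating the two spellings alike before comparing documents takes
    very little code. The sketch below is only an assumption about
    how one might do it: the function name and the sample names are
    hypothetical, and it handles only a detached "el"/"al" followed
    by a space or hyphen.

```python
import re

# A standalone "al" or "el" followed by whitespace or a hyphen.
ARTICLE = re.compile(r"\b(?:al|el)[\s-]+", re.IGNORECASE)


def canonical_article(name):
    """Lowercase the name and rewrite a detached el/al article as 'al-'."""
    return ARTICLE.sub("al-", name.strip().lower())


# Hypothetical spellings of one name on two documents.
passport = "Omar El Sayed"
birth_certificate = "Omar al-Sayed"
print(canonical_article(passport) == canonical_article(birth_certificate))
```

    It will also rewrite a given name such as "Al" in an English
    name, and it ignores attached forms like "Elsayed", so it is a
    starting point rather than a matcher.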

    > Unicode do not solve "transliteration" issue at all. There are
    > multiple Arabic transliteration system available. Even the
    > ISO standard Arabic transliteration system is not 100% adopted by
    > some Arabic speaking country.

    I would be surprised if the ISO system is used by anyone but ISO.

    > Remember, all the airline still use ASCII only for name these
    > day on our boarding pass. The problem could be in the airline
    > side instead of the FBI side.

    FBI, CIA, NSA, DIA, Immigration, LoC... they're all having problems
    with this. And the agencies rarely share information, and only now are
    they starting to define a common (though non-reversible)
    transliteration scheme: each agency has its own, and often different
    parts of the _same_ agency will have their own. And this is true for
    other languages (esp. Farsi and Pashto) as well.

        -tree

    -- 
    Tom Emerson                                          Basis Technology Corp.
    Software Architect                                 http://www.basistech.com
      "Beware the lollipop of mediocrity: lick it once and you suck forever"
    

