RE: ISO 639-3 beta input form (was: Questions re ISO-639-1,2,3)

From: Peter Constable (petercon@microsoft.com)
Date: Thu Aug 25 2005 - 07:56:13 CDT

    > From: Philippe Verdy [mailto:verdy_p@wanadoo.fr]

    > I see also in the ISO 639-3 list of languages that the input form for
    > selecting language names lacks an option for languages that don't
    > start with a letter

    Thanks for that feedback.

    > They all have Scope=I (Individual language), but Type=L (Living) or
    > Type=E (Extinct). Is that because they are still aliases, or still not
    > specified completely (notably their standard English names)? If not,
    > then there should be an "Other" option in the form input selector.

    ??? They are living or recently-extinct individual languages. What is
    unclear about that?

    > I'd like to know if these "!" or "/" or "=" or "'" are used to replace
    > unencoded characters or diacritics, or whether it's a technical issue
    > on the Ethnologue.com and SIL.org web sites...

    It's a limitation of ASCII -- the database involved has been around for
    a long time -- something like 30 years.

    > I suspect that "/" means the combining slash overlay, and "//" the
    > combining double-slash overlay. I suspect the quote to mean the
    > apostrophe letter, but what does the equal sign mean? I also suspect
    > that those languages don't have known orthographies (only spoken for
    > now)...

    I can't comment on the phonemes that underlie those characters. I suspect
    that the slashes do not mean combining overlays. It is by no means safe
    to assume that none of these languages have orthographies; one would
    need to investigate to ascertain that.

    > Are there projects to include in ISO 639-3 the alias names listed by
    > Ethnologue?

    ISO 639-3 requires only that a unique reference name be listed for each
    language. It gives the registration authority freedom to document other
    names.

    > Are there projects to list the ISO 639-3 codes of individual languages
    > referred to by languages with Scope=C (collective languages, such as
    > "Afro-Asiatic (Other)" whose 639-2 code is "afa", or "Bihari" whose
    > 639-1 and 639-2 codes are "bh" and "bih", but that won't have ISO 639-3
    > codes)?
    > Same question for Scope=M (macrolanguages)?

    Macrolanguages are listed in the data table for part 3. Identifiers for
    collections of languages will be the focus of ISO 639-5.

    > Or instead to include this reference within the meta-data associated
    > with each individual language (and so avoid changing long lists of
    > codes in the meta-data associated with collective languages)?

    Collections will not be understood as remainders; they will include
    languages that have their own ID.

    > Finally, is ISO 639-3 meant to be used for tagging more precisely the
    > various written or spoken texts or other localized data? What will be
    > the relation of ISO 639-3 with BCP 47 (notably, will the ISO 639-1 and
    > -2 codes, when they exist, still be preferable to the ISO 639-3 codes?
    > I think they should, so that existing documents and localized data
    > won't need to be updated with new language codes)

    That is the plan.
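
    A minimal sketch of that preference, assuming a small hypothetical
    lookup table (the table name, the function name, and the three sample
    entries are illustrative only, and the ISO 639-2 step is left out for
    brevity): use the two-letter ISO 639-1 code when one exists, and fall
    back to the three-letter ISO 639-3 identifier otherwise.

```python
from typing import Dict, Optional

# Hypothetical sample table: ISO 639-3 identifier -> ISO 639-1 code,
# or None when the language has no two-letter code.
ISO_639_1_BY_639_3: Dict[str, Optional[str]] = {
    "eng": "en",   # English
    "fra": "fr",   # French
    "haw": None,   # Hawaiian: no ISO 639-1 code
}


def preferred_code(iso639_3: str) -> str:
    """Return the ISO 639-1 code if one is known, else the 639-3 identifier."""
    short = ISO_639_1_BY_639_3.get(iso639_3)
    return short if short else iso639_3


if __name__ == "__main__":
    for code in ("eng", "haw"):
        print(code, "->", preferred_code(code))
```

    With a rule like this, data already tagged with an existing two-letter
    code keeps its tag, and only languages lacking a shorter code pick up a
    three-letter identifier.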

     
    > I also hope that there's no conflict between 3-letter ISO 639-2 codes
    > and 3-letter ISO 639-3 codes

    No, there is not.

    Peter Constable


