Re: New RFC 4645-4647 (language tags)

From: Doug Ewell (dewell@adelphia.net)
Date: Mon Sep 11 2006 - 08:43:54 CDT


    Philippe Verdy <verdy underscore p at wanadoo dot fr> wrote:

    > RFC 4646 is really a magnifical construction, given the complexity of
    > preserving the compatibility with the legacy, and the role of the
    > various registration agencies). I had already seen the first beta
    > version of the ILSR on the IANA website, but now this model brings a
    > clear understanding about how to manage language tags, and alone, it
    > solves most of the problems caused by "equivalent" codes and how to
    > canonicalize them.

    IANA never hosted a beta version of the Registry. The RFC 4646 process
    has actually been "live" since last November, and the official Registry
    has undergone several normal changes since then, but without a formal
    RFC number and without the "BCP 47" mapping having been formally
    transferred from RFC 3066.

    > A very long reading, with many subtle details. Let's hope that the
    > IANA registry will now be fully completed (notably the
    > "Suppress-Script:" field which is still often missing for many obvious
    > languages, or "Description:" whose value should match the other
    > standards, or "Preferred-Value:" mappings for equivalences)

    Not every language will have a Suppress-Script. This mechanism is
    intended for languages that are overwhelmingly written in a particular
    script ("the majority of the time" is not enough), so that taggers can
    avoid "obvious" combinations like fr-Latn and hi-Deva that may not be
    understood by RFC 3066-aware, 4646-unaware parsers.
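    To make the Suppress-Script idea concrete, here is a minimal sketch in
    Python. The table holds only the two pairs mentioned above (fr/Latn and
    hi/Deva), the function name is hypothetical, and the code assumes the
    script subtag, if present, comes right after the language subtag (that
    is, no extended language subtag in between). It is an illustration,
    not a complete RFC 4646 processor.

```python
# Hypothetical excerpt of Suppress-Script data: only the two
# language/script pairs mentioned in this message.
SUPPRESS_SCRIPT = {"fr": "Latn", "hi": "Deva"}


def drop_suppressed_script(tag):
    """Remove the script subtag when it is the language's Suppress-Script.

    Assumption: the script subtag, if any, is the second subtag
    (no extended language subtag precedes it).
    """
    subtags = tag.split("-")
    if len(subtags) < 2:
        return tag
    language = subtags[0].lower()
    candidate = subtags[1]
    # Script subtags are exactly four letters.
    is_script = len(candidate) == 4 and candidate.isalpha()
    suppressed = SUPPRESS_SCRIPT.get(language, "")
    if is_script and suppressed.lower() == candidate.lower():
        del subtags[1]
    return "-".join(subtags)


if __name__ == "__main__":
    for tag in ("fr-Latn", "hi-Deva-IN", "fr-Cyrl", "sr-Latn"):
        print(tag, "->", drop_suppressed_script(tag))
```

    Tags whose script is not the suppressed one (fr-Cyrl), or whose
    language has no entry in the table (sr-Latn here), pass through
    unchanged, so the script information is kept wherever it is actually
    distinguishing.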

    Requests to add more Suppress-Script values should be sent to the
    ietf-languages mailing list:
    http://www.alvestrand.no/mailman/listinfo/ietf-languages
    Note that this list is a URL redirect of the ietf-languages@iana.org
    list mentioned in RFC 4646.

    Descriptions almost always do match the core ISO standards (639, 3166,
    15924). Please send me (OFF-LIST) a list of those that you feel do not.

    Preferred-Value is present only to map deprecated subtags to their
    replacements. I'm not sure what other "equivalences" you have in mind.
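    As a rough illustration of how a tool might use Preferred-Value, the
    Python sketch below parses a few abbreviated records written in the
    Registry's "%%"-separated field layout and replaces a deprecated
    primary language subtag. The sample records are shortened for
    illustration (check the live Registry for the actual contents), the
    function names are hypothetical, and folded continuation lines are
    not handled.

```python
# Abbreviated, illustrative records in the Registry's field layout.
# Not a verbatim copy of the Registry; consult the live file.
SAMPLE = """\
Type: language
Subtag: iw
Description: Hebrew
Preferred-Value: he
%%
Type: language
Subtag: in
Description: Indonesian
Preferred-Value: id
%%
Type: language
Subtag: fr
Description: French
Suppress-Script: Latn
"""


def parse_records(text):
    """Split '%%'-separated records into dicts of field name -> value.

    Simplification: folded (continued) lines are not supported.
    """
    records = []
    for chunk in text.split("%%"):
        record = {}
        for line in chunk.strip().splitlines():
            name, _, value = line.partition(":")
            record[name.strip()] = value.strip()
        if record:
            records.append(record)
    return records


def preferred_values(records, subtag_type="language"):
    """Map each deprecated subtag of one type to its replacement."""
    return {
        r["Subtag"].lower(): r["Preferred-Value"]
        for r in records
        if r.get("Type") == subtag_type and "Preferred-Value" in r
    }


def replace_deprecated_language(tag, mapping):
    """Replace only the primary language subtag, if it is deprecated."""
    subtags = tag.split("-")
    subtags[0] = mapping.get(subtags[0].lower(), subtags[0])
    return "-".join(subtags)


if __name__ == "__main__":
    mapping = preferred_values(parse_records(SAMPLE))
    for tag in ("iw-IL", "in", "fr-FR"):
        print(tag, "->", replace_deprecated_language(tag, mapping))
```

    Records without a Preferred-Value field (fr in the sample) contribute
    nothing to the mapping, which matches the point above: the field
    exists only on deprecated subtags.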

    > What is surprising me is that RFC 4646 has defined reserves for future
    > ISO 639 extensions, but only with 4-letter codes:
    > * 3 letter-codes are explicitly restricted _only_ to ISO 639-2,
    > * but not for ISO-639-3 (that extends the set of 3 letter codes, in
    > such a way that most of the new 3-letter ISO 639-3 codes won't be
    > usable as primary language subtags).
    > So RFC4646 is already almost deprecating the now very advanced ISO
    > 639-3 ongoing work (whose core text was already adopted long before
    > RFC4646, even though the associated database is still in beta stage),
    > just a few months before it gets finalized (and applications of ISO
    > 639-3 are already being developed and deployed: how will those
    > applications be compatible now with the new RFC 4646 ???

    ISO 639-3 is not a formally published standard yet, so the RFC cannot
    support it. But we are anticipating that it will be published by the
    end of this year or early next year, and we are already preparing an
    updated RFC that will add support for 639-3. And we will add almost
    7,000 languages from 639-3 into the Registry.

    This effort is taking place in the Language Tag Registry Update (LTRU)
    working group of the IETF. For more information, see:
    http://www.ietf.org/html.charters/ltru-charter.html

    > I thought that there should have been provisions kept in RFC4646 for
    > compatibility with ISO 639-3. But with the current RFC text, the new
    > 3-letter ISO-639-3 codes will be usable with RFC 4646 only as language
    > extension subtags (after another generic ISO639-2 or ISO 639-1
    > language subtag), unless these new ISO 639-3 codes are later imported
    > into ISO 639-2!

    Be patient, please. RFCs are not allowed to specify the use of draft
    standards. If anyone else is deploying 639-3 before its release, they
    are not conformant either.

    --
    Doug Ewell
    Fullerton, California, USA
    http://users.adelphia.net/~dewell/
    RFC 4645  *  UTN #14
    

