/|/|ike Ayers <Mike_Ayers@bmc.com> wrote:
> BTW, I've gotten confused during this thread over the naming of
> country codes, etc.  There are ISO specs, RFCs, POSIX specs (and
> more?)...  Is this information conveniently summarized  anywhere so
> that I may enlighten myself?
Here's a convenient, if perhaps oversimplified, summary.
The standard for two-letter language codes is ISO 639-1.  There is also
an ISO 639-2 that specifies three-letter language codes (actually, it
comes in two variants, bibliographic and terminologic).
The standard for two-letter country codes is ISO 3166-1, which also
specifies collections of three-letter and numeric country codes.  ISO
3166-2 specifies political subdivisions within a country.
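
To make the code families above concrete, here is a small Python sketch with a few sample rows.  The table layout and the names LANGUAGES, COUNTRIES and describe() are my own, purely for illustration; the rows are only a tiny sample and not a substitute for the standards themselves.

```python
# Illustrative sample rows only; the real standards list far more entries.

# ISO 639-1 two-letter code -> (ISO 639-2 bibliographic, ISO 639-2 terminologic)
LANGUAGES = {
    "en": ("eng", "eng"),
    "de": ("ger", "deu"),
    "fr": ("fre", "fra"),
}

# ISO 3166-1 two-letter code -> (three-letter code, numeric code as a string)
COUNTRIES = {
    "US": ("USA", "840"),
    "DE": ("DEU", "276"),
    "FR": ("FRA", "250"),
}


def describe(language, country):
    """Return the alternative codes for a language/country pair (hypothetical helper)."""
    bibliographic, terminologic = LANGUAGES[language.lower()]
    alpha3, numeric = COUNTRIES[country.upper()]
    return {
        "language_639_2": sorted({bibliographic, terminologic}),
        "country_alpha3": alpha3,
        "country_numeric": numeric,
    }
```

Note that the two 639-2 variants differ for only a handful of languages (German and French among them), which is why the helper collapses them into a set.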
RFC 1766 describes a way to use ISO 639-1 and 3166-1 to create language
tags for use on the Internet (e.g. in mail messages).  A lowercase 639-1
language code can be followed by a hyphen and an uppercase 3166-1 country
code to represent the concept of "language X as spoken in country Y."
Unicode Technical Report #7, "Plane 14 Characters for Language Tags,"
recommends a slight adaptation of the RFC 1766 approach (both codes are
lowercase).
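
As a rough illustration of the two conventions just described, here is a minimal Python sketch.  The function names rfc1766_tag() and plane14_tag() are hypothetical, and the sketch assumes only the simple "language" or "language-COUNTRY" shapes built from two-letter codes; RFC 1766 permits other forms that are not handled here.  The Plane 14 encoding assumes the usual scheme of U+E0001 LANGUAGE TAG followed by tag characters that mirror ASCII at an offset of 0xE0000.

```python
import re

# Assumption: only two-letter language and country codes are accepted.
_SIMPLE_TAG = re.compile(r"^[a-z]{2}(?:-[A-Z]{2})?$")


def rfc1766_tag(language, country=None):
    """Build a tag such as 'en-US' from two-letter codes (hypothetical helper)."""
    tag = language.lower()
    if country:
        tag += "-" + country.upper()
    if not _SIMPLE_TAG.match(tag):
        raise ValueError("not a simple two-letter language/country tag: %r" % tag)
    return tag


def plane14_tag(tag):
    """Encode a tag as Plane 14 tag characters, lowercased first.

    U+E0001 (LANGUAGE TAG) introduces the tag; each ASCII character is
    then shifted up by 0xE0000 into the tag-character range.
    """
    return "\U000E0001" + "".join(chr(0xE0000 + ord(c)) for c in tag.lower())
```

For example, rfc1766_tag("EN", "us") gives "en-US", and plane14_tag("en-US") produces six invisible code points: the introducer followed by the tag characters for "en-us".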
RFC 1766 is currently being revised to allow three-letter (639-2), as
well as two-letter (639-1), language codes.  This will permit the use
of language tags for hundreds of less-common languages that have no two-
letter code.  The revision will also provide ways to use 3166-2 country-
subdivision codes and (draft) ISO 15924 script codes in language tags.
Naturally, the revised version will not be called RFC 1766, but will be
assigned a new number.  I don't know if UTR #7 will be updated to refer
to the new RFC when it is published (I think it should be).
POSIX locale names are also formed from 639-1 language codes and 3166-1
country codes.  Unlike in RFC 1766, the elements are separated by an
underscore rather than a hyphen.  POSIX uses this language/country code
to represent not only the language and local dialect, but all the
attributes of a locale setting, such as decimal separator, thousands
separator, currency symbol, default date format, etc.  A two-part code
of this kind is widely regarded as inadequate for covering even a
reasonable subset of locale possibilities.
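
The hyphen/underscore difference is easy to show in code.  The following Python sketch converts between the two spellings; tag_to_posix() and posix_to_tag() are hypothetical names, and only simple language/country pairs are handled.  The stripping of ".codeset" and "@modifier" suffixes goes beyond what is described above; it is included on the assumption that real-world locale names often carry them.

```python
def tag_to_posix(tag):
    """Turn 'en-us' or 'en-US' into the POSIX-style name 'en_US'."""
    parts = tag.split("-")
    if len(parts) == 1:
        return parts[0].lower()
    if len(parts) == 2:
        return parts[0].lower() + "_" + parts[1].upper()
    raise ValueError("only language or language-country tags are handled: %r" % tag)


def posix_to_tag(name):
    """Turn 'de_DE' (optionally with suffixes) into the tag 'de-DE'.

    Assumption: anything after '.' (codeset) or '@' (modifier) is dropped,
    since a simple language tag has no place for it.
    """
    base = name.split(".", 1)[0].split("@", 1)[0]
    return base.replace("_", "-")
```

Special locale names such as "C" and "POSIX" are not language codes at all, so a real converter would need to treat them separately.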
There are other standards for language and country codes, but for our
purposes these are by far the most common.
-Doug Ewell
 Fullerton, California