From: Doug Ewell (dewell@adelphia.net)
Date: Tue Dec 19 2006 - 23:21:43 CST
Arne Götje (高盛華) <arne at linux dot org dot tw> wrote:
> According to http://www.ethnologue.com/show_country.asp?name=Taiwan
> each of the 26 languages (22 of them living, 15 of them important)
> have all 3 character language codes, for example 'nan' for Minnan,
> 'hak' for Hakka, 'ami' for Amis, ...
>
> Can we use these language codes to form new locales, like 'nan_TW',
> 'hak_TW', 'ami_TW', etc.? Or does anything speak against this
> practice?
Normally it's recommended to wait until ISO 639-3 is published and then
use those codes instead of the Ethnologue codes (which might not match
100%). In the case of these languages, there happens to be 100%
correlation.
IETF language tags (not the same thing as locales) will most likely
implement Hakka, Mandarin, Min Nan, and Taiwan Sign Language as
"extended language subtags," meaning (for example) that Min Nan will be
encoded in language tags as "zh-nan" instead of "nan." As a designer of
locale information, you are probably free to use either the ISO 639-3
code directly or the language tag, at least until a consensus develops
to use one or the other.
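To make the two options concrete, here is a rough sketch in Python. It is only an illustration, not an implementation of any standard: the table and the function names are hypothetical, the "zh" prefix for Hakka is assumed by analogy with the "zh-nan" example above, and the extended-subtag form reflects what is expected rather than anything final.

```python
# Hypothetical table: ISO 639-3 code -> expected macrolanguage prefix.
# None means the code is assumed to stand alone in a language tag.
MACROLANGUAGE_PREFIX = {
    "nan": "zh",   # Min Nan
    "hak": "zh",   # Hakka (assumed to follow the same pattern as "zh-nan")
    "ami": None,   # Amis
}


def locale_name(code, region="TW"):
    """Build a POSIX-style locale name such as 'nan_TW'."""
    return f"{code}_{region}"


def language_tag(code, region="TW", use_extlang=True):
    """Build a language tag, optionally using the extended-subtag form."""
    prefix = MACROLANGUAGE_PREFIX.get(code)
    if use_extlang and prefix:
        parts = [prefix, code]
    else:
        parts = [code]
    parts.append(region)
    return "-".join(parts)


print(locale_name("nan"))                      # nan_TW
print(language_tag("nan"))                     # zh-nan-TW
print(language_tag("nan", use_extlang=False))  # nan-TW
print(language_tag("ami"))                     # ami-TW
```

Keeping the prefix in a lookup table means a locale designer could switch between the two forms later without renaming anything by hand.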
Michael Maxwell replied:
> This is presumably ISO 639-2, which had many such problems.
Considering the various Han Chinese languages as a single "Chinese
language" is by no means a "problem" unique to ISO 639-1 and 639-2.
Many people, including the majority of Chinese, share this view.
Since ISO 639-3 incorporates all of the non-collective codes from ISO
639-2, it includes "zho" as a "macrolanguage" encompassing the
individual Han Chinese languages, while still retaining the status of a
language itself. The question of whether Chinese is one language or
several is a complex one, and usually not best understood by dismissing
one view or the other as a "problem."
What other "problems" of this sort are supposed to be present in ISO
639-3?
> ISO 639-3 is based on the Ethnologue codes (with some modifications),
> plus codes for long extinct and made-up languages (including
> everyone's favorite, Klingon).
1. Encoding extinct languages is a design goal for ISO 639-3, not an
error.
2. All languages are "made-up"; they are human inventions and do not
occur in nature. Constructed languages such as Esperanto and Ido are
also present in ISO 639-1 and -2.
3. Klingon is also present in ISO 639-2. It has more speakers than
Thao.
--
Doug Ewell * Fullerton, California, USA * RFC 4645 * UTN #14
http://users.adelphia.net/~dewell/
http://www1.ietf.org/html.charters/ltru-charter.html
http://www.alvestrand.no/mailman/listinfo/ietf-languages