From: Doug Ewell (dewell@adelphia.net)
Date: Wed Dec 20 2006 - 10:18:02 CST
RE: Question about new locale language tags

Michael Maxwell wrote:
>> What other "problems" of this sort are supposed to be present in ISO
>> 639-3?
>
> There's a long list of cases where 639-2 (not 639-3) had a code for
> something that wasn't a language by a linguistic definition, but
> rather a group of languages (linguistically motivated or not), or
> which was vague, at http://www.sil.org/iso639-3/macrolanguages.asp.
> 'Arabic', for example, is not a single language, but rather a group of
> things ranging from non-mutually intelligible to maybe mutually
> intelligible, together with Modern Standard Arabic (MSA), which is no
> one's native language, but which is understood and spoken by educated
> people across the region. (MSA is also the only standardized written
> form of Arabic, which makes it relevant to tagging text. You can find
> "dialectal" Arabic written, but there is no standard.)

That's not a "long list." There are only 56 macrolanguages out of 7,595
total entries in ISO/FDIS 639-3, and they all follow a similar pattern:
in some contexts they are considered a single language -- even by native
speakers who can't understand each other's "dialect" -- and in others
they are considered a group of languages. The "linguistic definition"
just isn't as black-and-white as we would all like.

This is largely off-topic for Unicode; please see the "LTRU" or
"ietf-languages" URLs in my signature block for better venues.
--
Doug Ewell * Fullerton, California, USA * RFC 4645 * UTN #14
http://users.adelphia.net/~dewell/
http://www1.ietf.org/html.charters/ltru-charter.html
http://www.alvestrand.no/mailman/listinfo/ietf-languages