From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Wed Aug 24 2005 - 11:05:57 CDT
I also see in the ISO 639-3 list of languages that the input form for
selecting language names lacks an option for languages whose names don't
start with a letter (and that also don't have ISO 639-1 or -2 codes), notably:
* [oun] !O!ung (Scope=I, Type=L)
* [nmn] !Xóõ (Scope=I, Type=L)
* [alu] 'Are'are (Scope=I, Type=L)
* [kud] 'Auhelawa (Scope=I, Type=L)
* [hnh] //Ani (Scope=I, Type=L)
* [gnk] //Gana (Scope=I, Type=L)
* [xeg] //Xegwi (Scope=I, Type=E)
* [gwj] /Gwi (Scope=I, Type=L)
* [xam] /Xam (Scope=I, Type=E)
* [huc] =/Hua (Scope=I, Type=L)
* [aue] =/Kx'au//'ein (Scope=I, Type=L)
They all have Scope=I (Individual language), with either Type=L (Living) or
Type=E (Extinct). Is that because they are still aliases, or still not
completely specified (notably their standard English names)? If not, then
there should be an "Other" option in the form's input selector.
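To make the suggestion concrete, here is a minimal sketch in Python of how
such a selector could bucket names by initial letter while keeping an "Other"
bucket for everything else. The function name index_bucket and the sample
list are made up for illustration; I don't know how the actual form is
implemented.

```python
import unicodedata

def index_bucket(name):
    """Return the A-Z bucket for a language name, or "Other" (hypothetical helper)."""
    first = name[:1]
    # Decompose so that an accented initial such as "É" files under "E".
    base = unicodedata.normalize("NFD", first)[:1].upper()
    if len(base) == 1 and "A" <= base <= "Z":
        return base
    return "Other"

# Sample names taken from the list above, plus two ordinary ones.
names = ["!O!ung", "!Xóõ", "'Are'are", "//Ani", "=/Hua", "Abkhazian", "Zulu"]

buckets = {}
for n in names:
    buckets.setdefault(index_bucket(n), []).append(n)
```

With this grouping, the five names that start with punctuation end up under
"Other" instead of being unreachable from an A-Z selector.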
Also, "!O!ung" is listed in Ethnologue as a living language (in the Khoisan
family) spoken by a small community in Angola, and Ethnologue gives another
alias for it (!O!kung). I'd like to know whether these "!", "/", "=" and "'"
characters replace unencoded characters or diacritics, or whether this is a
technical issue on the Ethnologue.com and SIL.org web sites...
I suspect that "/" means the combining slash overlay and "//" the combining
double-slash overlay, and that the quote stands for the apostrophe letter,
but what does the equal sign mean? I also suspect that those languages don't
have known orthographies (they are only spoken for now)...
Are there plans to include in ISO 639-3 the alias names listed by
Ethnologue?
Are there plans to list the ISO 639-3 codes of the individual languages
referred to by entries with Scope=C (collective languages, such as
"Afro-Asiatic (Other)", whose 639-2 code is "afa", or "Bihari", whose 639-1
and 639-2 codes are "bh" and "bih", but which won't have ISO 639-3 codes)?
Same question for Scope=M (macrolanguages)?
Or will this reference instead be included in the metadata associated with
each individual language (thus avoiding changes to the long lists of codes in
the metadata associated with collective languages)?
Finally, is ISO 639-3 meant to be used for tagging more precisely the
various written or spoken texts or other localized data? What will be the
relation of ISO 639-3 to BCP 47? Notably, will the ISO 639-1 and -2 codes,
when they exist, still be preferred over the ISO 639-3 codes? I think they
should be, so that existing documents and localized data won't need to be
updated with new language codes.
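The preference I have in mind could be sketched as follows in Python. This is
only an illustration under my own assumptions: the table is a tiny
hand-written excerpt, not an official mapping, and the function name
preferred_subtag is hypothetical.

```python
# Illustrative excerpt only: 3-letter code -> ISO 639-1 code where one exists.
ALPHA3_TO_ALPHA2 = {
    "heb": "he",  # Hebrew
    "ind": "id",  # Indonesian
    "fra": "fr",  # French
}

def preferred_subtag(code_639_3):
    """Prefer the existing 2-letter code; otherwise keep the 3-letter code."""
    code = code_639_3.lower()
    return ALPHA3_TO_ALPHA2.get(code, code)
```

Under this rule, documents already tagged "he" or "id" stay valid, and only
languages without a shorter code (such as [oun]) get a 3-letter tag.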
I also hope that there's no conflict between 3-letter ISO 639-2 codes and
3-letter ISO 639-3 codes, and that there's already an agreed policy to use
the same codes where possible (except for legacy alias codes in ISO 639-2).
Let's not repeat the difficulties found in ISO 639-2 between terminology and
bibliographic codes, or with languages that have had their ISO 639-1 code
changed, such as Hebrew [iw=>he] or Indonesian [in=>id].
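As a rough sketch of the kind of consistency check I mean (in Python, with
small made-up excerpts standing in for the real code tables, and a
hypothetical helper name), one could look for 3-letter codes that appear in
both parts but name different languages:

```python
def find_conflicts(part2, part3):
    """Return codes present in both tables that name different languages."""
    shared = part2.keys() & part3.keys()
    return sorted(c for c in shared if part2[c] != part3[c])

# Small excerpts for illustration; not the complete code tables.
part2 = {"heb": "Hebrew", "ind": "Indonesian", "afa": "Afro-Asiatic (Other)"}
part3 = {"heb": "Hebrew", "ind": "Indonesian", "oun": "!O!ung"}

# An invented clash, only to show what the check would report.
part3_with_clash = dict(part3, afa="Some individual language")

no_conflicts = find_conflicts(part2, part3)
one_conflict = find_conflicts(part2, part3_with_clash)
```

Here no_conflicts is empty, while the invented clash on "afa" is reported in
one_conflict.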