From: Peter Constable (petercon@microsoft.com)
Date: Wed Mar 01 2006 - 12:53:49 CST
> From: africa-bounce@unicode.org [mailto:africa-bounce@unicode.org] On Behalf Of
> Donald Z. Osborn
> Quoting John Cowan <cowan@ccil.org>:
Hmmm... evidently some msgs from this list aren't getting past our spam
filters.
> Also I note that the locale form needs language code and country code. Not
> trying to make arguments here, but to understand how best to use the system
> and all the various codes.
Keep in mind that a locale is different from a language. Don't confuse
the need to use a country ID to reflect a regional dialect or spelling
differences ("language" identification) with the need to include a
country ID to reflect processing parameters associated with a country
such as default currency (locale identification). Language distinctions
are always part of a locale, so when a country ID is needed for language
distinctions the language ID can look the same as a locale ID. But
there's a logical distinction: locale IDs generally include a country ID
since locales generally have some country-based data, but not all
language tags require a country ID.
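
As a rough illustration of that distinction (a sketch only: the class names
and the currency field are hypothetical and not taken from any standard or
from CLDR), a language tag can be modelled with an optional country ID, while
a locale carries a country because it holds country-based data:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LanguageTag:
    """Identifies a language; the country ID is present only when it is
    needed to mark a dialect or spelling distinction."""
    language: str
    region: Optional[str] = None

    def __str__(self) -> str:
        if self.region is None:
            return self.language
        return f"{self.language}-{self.region}"


@dataclass(frozen=True)
class Locale:
    """Identifies processing parameters; in this sketch the country is
    required because it selects country-based data such as currency."""
    language_tag: LanguageTag
    country: str
    currency: str  # hypothetical example of country-based data

    def __str__(self) -> str:
        return f"{self.language_tag.language}-{self.country}"


plain_arabic = LanguageTag("ar")        # a language tag with no country ID
us_english = LanguageTag("en", "US")    # country ID marks a spelling variety
egypt_locale = Locale(LanguageTag("ar"), country="EG", currency="EGP")

print(plain_arabic, us_english, egypt_locale)
```

Note that str(us_english) and the ID of a hypothetical en-US locale would look
identical, even though the two objects answer different questions.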
> > Work on RFC 3066ter, which will incorporate ISO 639-3 tags, has not yet
> > formally begun. The intention of most of the various players, however, is
> > to use a design in which a language encompassed by a 639-3 macrolanguage
> > will have a two-part language subtag, of the form zh-yue (Cantonese).
> > So 639-3 code elements for languages that are *not* macrolanguages will
> > be added directly, but code elements like yue will not: yue will only
> > exist in Internet language tags as part of the compound subtag zh-yue.
>
> Thanks for this clarification. Actually the "nesting" of the '3 codes under
> a '1 or a '2 code makes a lot of sense. Two questions:
> 1) Can one file a locale before 3/15 using this format "ff-ffm-ML" even
> though the design is not yet official?
If you mean file a locale into CLDR, that's a question for the CLDR
list, not this list.
> Beyond that I see that there may be a lot of discussion on the roles and use
> of the different codes in the case of different (macro)languages. In the case
> of Arabic, for example, would a simple ar-EG be enough or would you need (or
> alternatively want to rule out) ar-arz-EG (arz=Egyptian spoken Arabic), while
> at the same time allowing perhaps that less widely spoken dialects in the
> country be noted?
Standard Arabic is used across Arabic-speaking countries and is
generally the preferred variety for text. This is what would almost
certainly be used in Arabic locale data. Thus, ar-EG is probably the
most appropriate for this case. If someone is specifically using a
locale for creating and working with content or resources in arz, then
ar-arz-EG might be an appropriate locale -- but note, it would be a
different locale than ar-EG.
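
To make the "different locale" point concrete, here is a minimal sketch that
splits tags of the proposed form language[-extlang][-COUNTRY]. It assumes the
two-part design described in the quoted text above, which is not yet official,
and parse_tag and ParsedTag are hypothetical names rather than any existing
API. It handles only these three subtag kinds and rejects everything else.

```python
from typing import NamedTuple, Optional


class ParsedTag(NamedTuple):
    language: str
    extlang: Optional[str]   # e.g. "arz" nested under "ar"
    region: Optional[str]    # two-letter country ID


def parse_tag(tag: str) -> ParsedTag:
    """Split a tag such as 'ar-arz-EG' into its parts.

    Simplifying assumption: a three-letter subtag after the first one is an
    extended language subtag, and a two-letter subtag is a country ID.
    """
    parts = tag.split("-")
    language = parts[0].lower()
    extlang: Optional[str] = None
    region: Optional[str] = None
    for part in parts[1:]:
        if len(part) == 3 and part.isalpha() and extlang is None and region is None:
            extlang = part.lower()
        elif len(part) == 2 and part.isalpha() and region is None:
            region = part.upper()
        else:
            raise ValueError(f"unsupported subtag {part!r} in {tag!r}")
    return ParsedTag(language, extlang, region)


standard = parse_tag("ar-EG")
egyptian = parse_tag("ar-arz-EG")

# Same primary language and country, but the two IDs are not the same locale.
print(standard, egyptian, standard == egyptian)
```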
> But today, if we were filing two locales for Kpelle, what would be the best
> coding? I'm assuming that kpe-LR and kpe-GN would be the best (or least bad)
> choices even if later the xpe and gkp have to be added?
Again, a question for the CLDR list.
> So another question (sorry these are accumulating) is what kpe-xpe-LR and
> kpe-gkp-GN locales would offer to a group localizing for Kpelle "kpe" as a
> transborder, multidialect (macro)language?
At this point, I think that's a question for the language communities to
decide, not us.
> 4. Going back to ISO-639 in general (I know this subject has been
> discussed before but please bear with me), is there going to be any kind
> of feedback between the processes of developing locales and localization
> on one hand and amending the list of ISO-639 codes on the other? I
> recall there being some mention of a block on new ISO-639-1 and 2 codes,
> or that a 1 code will not be given where there is a 2 code, but that
> *maybe* a new 1 and 2 code could be given (Runyakitara might be a
> candidate for the latter). Also mention of possible additional ISO-639
> codes beyond the three ranges already. What is the latest on all this?
> >> 4. Going back to ISO-639 in general [...]
> >> What is the
> >> latest on all this?
> >
> > I think, but I am not sure, that no new 639-1 codes can be added after
> > 639-3 goes into effect. (In principle, a language missed by 639-3 could
> > be added simultaneously to -1, -2, and -3, but the chance that such a
> > language both has been missed and meets the criteria for -1 is small.)
The JAC loosely committed not to add something to -1 that was already in
-2. (I say "loosely" meaning that they did not rule out the possibility
that circumstances might change in the future mandating a need for a new
alpha-2 where an alpha-3 already exists in -2.) The JAC has never made
a similar commitment wrt -3. But, we were just recently discussing the
future of -1, and while this specific concern wrt -3 didn't come up, we
were thinking that we should further constrain -1 so that requests to
add alpha-2 would no longer be accepted from anybody but could only come
from an ISO member body. This would really reduce the number of requests
we get for alpha-2 IDs.
> > Any 639-3 language could be added to 639-2, using the same code element
> > for it in both parts of the standard.
639-2 will become a subset of the union of 639-3 and 639-5 (the latter
for collections); there will be a single alpha-3 code space. The
criteria for inclusion in 639-2 are likely to get further constrained
from what it is now. In effect, 639-2 will become a profile of alpha-3
of interest to a particular user community; the TC46 reps to the JAC
will be working on a proposal for how we define that user community.
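
The intended relationship between the parts can be sketched with sets. The
code lists below are tiny hand-picked samples chosen for illustration, not the
real code tables, and the variable names are hypothetical:

```python
# Small illustrative samples only; the real parts contain far more entries.
ISO_639_3_SAMPLE = {"ara", "arz", "zho", "yue", "kpe", "xpe", "gkp", "ful", "ffm"}
ISO_639_5_SAMPLE = {"afa", "nic", "bnt"}  # collections (language groups)
ISO_639_2_SAMPLE = {"ara", "zho", "kpe", "ful", "afa", "nic", "bnt"}

# A single alpha-3 code space: the union of individual languages and collections.
alpha3_space = ISO_639_3_SAMPLE | ISO_639_5_SAMPLE

# In one code space, no code element may mean two different things.
no_clashes = ISO_639_3_SAMPLE.isdisjoint(ISO_639_5_SAMPLE)

# 639-2 as a profile: every one of its code elements comes from that space.
is_profile = ISO_639_2_SAMPLE <= alpha3_space

# Code elements in the space that this sample profile leaves out.
outside_profile = sorted(alpha3_space - ISO_639_2_SAMPLE)

print(no_clashes, is_profile, outside_profile)
```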
> I'm thinking that language change, planning, and engineering would call
> for some flexibility on this...
There's no question that the plane of language varieties will change,
especially in developing nations as language planning and development
activities bring greater standardization and stabilization of languages.
This will be one of the challenges we face in language coding, and
perhaps also in software implementations. One thing to keep in mind is
that something like a software localization has potential to be a
significant factor in how the sociolinguistic scenery evolves.
Peter Constable