
CLDR Ticket #9300 (accepted data)

Opened 2 years ago

Last modified 2 months ago

Reduce the number of supported BCP47 variants

Reported by: mark
Owned by: mark
Component: bcp47
Data Locale:
Phase: dsub
Review:
Weeks:
Data Xpath:

Description (last modified by mark)

Very few BCP47 variants are actually in use, and unfortunately the ietf-languages@… list has followed a strategy that causes a pointless proliferation of variants. So I suggest that we winnow the variants down to a small number that we think are in reasonably widespread use, or that have CLDR locales, and archive the rest of the data.


The beauty of languages is that a combination of two words can be used to denote a narrower concept than either one of them alone. We can say "red book", "red apple", "red heart" without having to have multiple words for "red": "redbk book", "redappl apple", "redhrt heart". Unfortunately, the ietf-languages@… list has decided that the latter approach is necessary for lstr variants, which is especially silly given that variant subtags are always used in conjunction with a primary language subtag.

Variants like "eastern" and "western" could have been defined as reusable variants with independent meaning, and then used generatively: *hy-eastern for Eastern Armenian (rather than hy-arevela), but also *yi-eastern for Eastern Yiddish, and many others. Instead, we get single-use terms like hy-arevela and hy-arevmda, which require special translations in CLDR rather than simply "Armenian (Eastern)". (After all, where a language has a single term for a combination of subtags, CLDR does support having such translations.)
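
The contrast can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `eastern` and `western` subtags are hypothetical (hence the asterisks above), and the tables and the `display_name` helper are made-up names, not CLDR data or a CLDR API.

```python
# Illustrative tables only; these are not real CLDR data files.
LANGUAGE_NAMES = {"hy": "Armenian", "yi": "Yiddish"}

# Hypothetical reusable variants: one translation each, usable with any language.
GENERATIVE_VARIANTS = {"eastern": "Eastern", "western": "Western"}

# Single-use variants: each one needs its own dedicated translation.
SINGLE_USE_VARIANTS = {
    "arevela": "Eastern Armenian",
    "arevmda": "Western Armenian",
}


def display_name(tag: str) -> str:
    """Compose a display name for a simple 'language' or 'language-variant' tag."""
    language, _, variant = tag.partition("-")
    name = LANGUAGE_NAMES[language]
    if not variant:
        return name
    if variant in GENERATIVE_VARIANTS:
        # The same variant entry serves every language it is combined with.
        return f"{name} ({GENERATIVE_VARIANTS[variant]})"
    # A single-use variant only makes sense with the one language it was coined for.
    return f"{name} ({SINGLE_USE_VARIANTS[variant]})"
```

With the generative table, two entries cover `hy-eastern`, `yi-eastern`, `hy-western`, and any later combination; the single-use table needs a new entry, and a new translation in every locale, for each language-specific variant.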

Given the current ietf-languages@… policies, it does not make sense for us to support non-generative variants, except in the limited circumstances outlined above: variants in reasonably widespread use, or with CLDR locales.


Change History

comment:1 Changed 2 years ago by mark

  • Description modified

comment:2 Changed 2 years ago by emmons

  • Status changed from new to accepted
  • Component changed from unknown to bcp47
  • Priority changed from assess to medium
  • Milestone changed from UNSCH to 30
  • Owner changed from anybody to mark
  • Type changed from unknown to data

comment:3 Changed 2 years ago by mark

The committee agreed to drop all the translations for variants that don't correspond to CLDR locales. The paths will be dropped from CODE_FALLBACK, so translators won't be able to add them even in comprehensive.

We will document that they have been dropped, but people could recover them from v29 if necessary.
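
A minimal sketch of the pruning rule, assuming variant translations are held in a plain dictionary keyed by variant code and that locale IDs carry no extensions. The function names and the sample data are hypothetical, not the actual CLDR tooling.

```python
def is_variant(subtag: str) -> bool:
    """BCP47 variant shape: 5-8 alphanumerics, or 4 characters starting with a digit."""
    if not (subtag.isascii() and subtag.isalnum()):
        return False
    if 5 <= len(subtag) <= 8:
        return True
    return len(subtag) == 4 and subtag[0].isdigit()


def variants_in_locales(locale_ids):
    """Collect the variant subtags (uppercased) used by the given locale IDs."""
    found = set()
    for locale_id in locale_ids:
        subtags = locale_id.replace("-", "_").split("_")
        # Skip the first subtag: it is always the language.
        for subtag in subtags[1:]:
            if is_variant(subtag):
                found.add(subtag.upper())
    return found


def prune_variant_names(variant_names, locale_ids):
    """Keep only the translations whose variant appears in some locale ID."""
    keep = variants_in_locales(locale_ids)
    return {
        code: name
        for code, name in variant_names.items()
        if code.upper() in keep
    }
```

For example, given the sample locale IDs `ca_ES_VALENCIA`, `en_US_POSIX`, and `hy`, a translation table with entries for `VALENCIA`, `POSIX`, and `AREVELA` would keep the first two and drop `AREVELA`.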

comment:4 Changed 20 months ago by mark

  • Milestone changed from 30 to 31

comment:5 Changed 16 months ago by mark

  • Phase changed from dsub to rc

comment:6 Changed 15 months ago by mark

  • Cc pedberg added
  • Phase changed from rc to dsub
  • Milestone changed from 31 to 32

Bumping to 32, since this does not make a functional change. Nice to remove the surplus data, but not a priority for this release.

comment:7 Changed 8 months ago by mark

  • Milestone changed from 32 to 33

comment:8 Changed 2 months ago by mark

  • Milestone changed from 33 to 34

This needs to be done in DSUB for v34.

