CLDR Ticket #6901 (closed enhancement: fixed)
Document recommended replacements with multiples
|Reported by:||mark||Owned by:||mark|
Description (last modified by mark)
(split from ticket:6785)
In any event, the text in http://www.unicode.org/reports/tr35/#BCP_47_Language_Tag_Conversion must be fixed to give the right replacement algorithm, so that it handles 'sh' properly. The key is that while the original base language will always be changed by languageAlias, any other subtags already present will not; subtags supplied by the alias only fill in what is missing. So "sh-TR" => "sr-Latn-TR", but "sh-Cyrl" => "sr-Cyrl".
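The 'sh' behaviour described above can be sketched roughly as follows. This is only an illustration, not the text proposed for the spec: the `LANGUAGE_ALIASES` table, `split_tag`, and `replace_language` are hypothetical names, the alias table holds just the one entry discussed here, and the tag parsing handles only simple language-script-region tags.

```python
# Hypothetical one-entry alias table; the real data lives in CLDR's languageAlias elements.
LANGUAGE_ALIASES = {"sh": "sr-Latn"}


def split_tag(tag):
    """Split a simple tag into (language, script, region, remaining subtags)."""
    parts = tag.replace("_", "-").split("-")
    language, script, region = parts[0].lower(), None, None
    i = 1
    # A script subtag is four letters.
    if i < len(parts) and len(parts[i]) == 4 and parts[i].isalpha():
        script = parts[i].title()
        i += 1
    # A region subtag is two letters or three digits.
    if i < len(parts) and (
        (len(parts[i]) == 2 and parts[i].isalpha())
        or (len(parts[i]) == 3 and parts[i].isdigit())
    ):
        region = parts[i].upper()
        i += 1
    return language, script, region, parts[i:]


def replace_language(tag):
    """Apply a language alias: always swap the base language, keep existing subtags."""
    language, script, region, rest = split_tag(tag)
    alias = LANGUAGE_ALIASES.get(language)
    if alias is None:
        return tag  # no alias for this language: leave the tag untouched
    new_language, new_script, new_region, _ = split_tag(alias)
    # Subtags from the alias are used only where the original tag has none.
    script = script or new_script
    region = region or new_region
    return "-".join([new_language] + [s for s in (script, region) if s] + rest)


print(replace_language("sh-TR"))    # sr-Latn-TR
print(replace_language("sh-Cyrl"))  # sr-Cyrl
```

The existing script "Cyrl" wins over the alias's "Latn", while "sh-TR" has no script and so picks up "Latn" from the alias.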
We sometimes have multiple replacements:
<territoryAlias type="SU" replacement="RU AM AZ BY EE GE KZ KG LV LT MD TJ TM UA UZ" reason="deprecated"/> <!-- Union of Soviet Socialist Republics -->
Right now, the first one listed is taken as the most likely, in the absence of other information. However, we could specify a slightly more sophisticated replacement algorithm for territory replacement.
- If there is a single territory in the replacement, use it.
- Otherwise, look up the most likely territory for the base language code (and script, if there is one).
  - If that likely territory is in the list, use it.
  - Otherwise, use the first territory in the list.
Thus, for example "hy-SU" (Armenian as used in the Soviet Union) becomes "hy-AM" (Armenian as used in Armenia).
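A minimal sketch of the territory steps above, under stated assumptions: `TERRITORY_ALIASES` contains only the SU entry quoted in this ticket, `LIKELY_TERRITORY` is a small hypothetical stand-in for the likely-subtags data, and `replace_territory` is an illustrative name rather than an existing API.

```python
# The SU entry from the territoryAlias element quoted in this ticket.
TERRITORY_ALIASES = {
    "SU": "RU AM AZ BY EE GE KZ KG LV LT MD TJ TM UA UZ".split(),
}

# Hypothetical excerpt of likely-territory data, keyed by (language, script).
LIKELY_TERRITORY = {
    ("hy", None): "AM",
    ("ru", None): "RU",
    ("uk", None): "UA",
}


def replace_territory(language, script, territory):
    """Pick a replacement for a deprecated territory code."""
    replacements = TERRITORY_ALIASES.get(territory)
    if not replacements:
        return territory  # not deprecated: nothing to do
    if len(replacements) == 1:
        return replacements[0]
    # Look up the most likely territory for the language (and script, if any),
    # falling back to the language alone when there is no script-specific entry.
    likely = LIKELY_TERRITORY.get((language, script)) or LIKELY_TERRITORY.get(
        (language, None)
    )
    if likely in replacements:
        return likely
    # No usable hint: fall back to the first territory listed.
    return replacements[0]


print(replace_territory("hy", None, "SU"))  # AM
print(replace_territory("en", None, "SU"))  # RU
```

With no likely-territory entry for "en" in this sketch, "en-SU" falls through to the first listed replacement, which matches the current behaviour described above.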
This is not a high-priority fix, but that could change.