CLDR Ticket #8137 (accepted data)
Better mechanism for handling short zone abbreviations
|Reported by:||mark||Owned by:||emmons|
Description (last modified by mark)
The way we have been handling short zone abbreviations has become infeasible: it involves suppressing the parent list, and adding other lists, in a great many regional locales for the "world" languages, such as English and Spanish, that have wide geographical distribution.
That involves duplicated effort and is inevitably fragile.
I suggest the following mechanism for handling them instead.
In the parent locale, add every zone and metazone abbreviation that is reasonable for the language in any of its sublocales. So, en would contain:
    <zone type="Pacific/Honolulu">
        <short>
            <generic>HST</generic>
            <standard>HST</standard>
            <daylight>HDT</daylight>
        </short>
    </zone>
    ...
    <zone type="Europe/London">
        <short>
            <daylight>BST</daylight>
        </short>
        <exemplarCity>London</exemplarCity>
    </zone>
    ...
    <metazone type="Europe_Central">
        <short>
            <generic>CET</generic>
            <standard>CET</standard>
            <daylight>CEST</daylight>
        </short>
    </metazone>
Add a new element to <timeZoneNames>, namely:
    <retainShortNames regions="US CA"/>
The interpretation is:
- Get the union of all the zones for the regions, based on the tzdata*. In this case, it would include those for the US and for Canada, e.g. Pacific/Honolulu and America/Los_Angeles.
- Suppress any zone short names outside of that list.
- Get the union of all the metazones for the regions. In this case, that would include Alaska all the way to Atlantic.
- Suppress any metazone short names outside of that list.
Note that the list could contain a macroregion, like 154 for Northern Europe. In that case, use the containment relations to get the specific regions before generating the unions.
* Using <mapTimezones type="metazones">.
Also, we can always add the region for this locale (e.g. CA for en_CA) automatically, so in the above case we wouldn't actually need to add CA explicitly. If the locale has no explicit region, we add the region of its default content locale, e.g. ES for es.
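The interpretation steps above can be sketched roughly as follows. This is only an illustration of the proposal, not an existing CLDR tool: the function names are hypothetical, and the three lookup tables are tiny, abridged stand-ins for the real territory containment data, the tzdata region-to-zone list, and the mapTimezones metazone mapping (which in reality is also time-dependent).

```python
# Hypothetical, heavily abridged stand-ins for real CLDR/tzdata tables.
CONTAINMENT = {"154": ["GB", "IE", "DK"]}          # macroregion -> regions
REGION_ZONES = {                                   # region -> tz zones
    "US": ["Pacific/Honolulu", "America/Los_Angeles"],
    "CA": ["America/Vancouver"],
    "GB": ["Europe/London"],
}
ZONE_METAZONE = {                                  # zone -> metazone
    "Pacific/Honolulu": "Hawaii_Aleutian",
    "America/Los_Angeles": "America_Pacific",
    "America/Vancouver": "America_Pacific",
    "Europe/London": "GMT",
}


def expand_regions(codes):
    """Replace each macroregion by the regions it contains (recursively)."""
    result = set()
    for code in codes:
        if code in CONTAINMENT:
            result |= expand_regions(CONTAINMENT[code])
        else:
            result.add(code)
    return result


def effective_regions(attr_value, locale_region):
    """Regions named in the attribute, plus the locale's own region."""
    regions = expand_regions(attr_value.split())
    regions.add(locale_region)  # added automatically, per the proposal
    return regions


def retained(regions):
    """Return the union of zones, and of metazones, for the regions."""
    zones = {z for r in regions for z in REGION_ZONES.get(r, [])}
    metazones = {ZONE_METAZONE[z] for z in zones if z in ZONE_METAZONE}
    return zones, metazones


def filter_short_names(short_names, keep):
    """Suppress every short name whose key is outside the retained set."""
    return {key: names for key, names in short_names.items() if key in keep}


# Example: en_CA with regions="US"; CA is implied by the locale itself.
parent_zone_names = {
    "Pacific/Honolulu": {"standard": "HST", "daylight": "HDT"},
    "Europe/London": {"daylight": "BST"},
}
parent_metazone_names = {
    "America_Pacific": {"standard": "PST", "daylight": "PDT"},
    "Europe_Central": {"standard": "CET", "daylight": "CEST"},
}
zones, metazones = retained(effective_regions("US", "CA"))
en_ca_zone_names = filter_short_names(parent_zone_names, zones)
en_ca_metazone_names = filter_short_names(parent_metazone_names, metazones)
```

With these sample tables, en_CA keeps the Pacific/Honolulu and America_Pacific abbreviations inherited from en, while the Europe/London and Europe_Central ones are suppressed.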
With this new element, we can avoid duplicating data across regions, and it becomes simple to add or change the short names that are in use. We wouldn't need any retainShortNames elements for languages whose regions are clustered, like de or da.
We could have two special values, continent and subcontinent, intended for inheritance, that could be included in the regions attribute value. Those would indicate all the regions in that particular locale's continent or subcontinent, respectively. Thus if en_001 had <retainShortNames regions="subcontinent GB">, then for en_IN that would automatically produce 034 + GB, which would turn into "AF BD BT IN IR LK MV NP PK GB".
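As a rough sketch of how an inherited regions value could be resolved for a particular sublocale: the function name and the lookup tables below are hypothetical and abridged to the en_IN example; a real implementation would read them from the CLDR territory containment data.

```python
# Hypothetical, abridged containment data (UN M.49 codes as used in CLDR).
SUBCONTINENT_OF = {"IN": "034", "GB": "154"}   # region -> subcontinent
CONTINENT_OF = {"IN": "142", "GB": "150"}      # region -> continent
CONTAINS = {                                   # macroregion -> regions
    "034": ["AF", "BD", "BT", "IN", "IR", "LK", "MV", "NP", "PK"],
}


def resolve_regions(attr_value, locale_region):
    """Resolve a regions attribute value for the given locale region.

    The special tokens are replaced by the locale's own continent or
    subcontinent, and macroregions found in CONTAINS are expanded
    (one level only, since the table here is abridged).
    """
    resolved = []
    for token in attr_value.split():
        if token == "subcontinent":
            token = SUBCONTINENT_OF[locale_region]
        elif token == "continent":
            token = CONTINENT_OF[locale_region]
        for region in CONTAINS.get(token, [token]):
            if region not in resolved:  # keep order, drop duplicates
                resolved.append(region)
    return resolved


# en_IN inheriting regions="subcontinent GB" from en_001:
en_in_regions = " ".join(resolve_regions("subcontinent GB", "IN"))
```

For en_IN this yields the same list as in the description, "AF BD BT IN IR LK MV NP PK GB"; a locale in another subcontinent would resolve the same inherited attribute to a different list without any extra data.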
- Owner changed from anybody to emmons
- Priority changed from assess to major
- Status changed from new to assigned
- Component changed from unknown to data-main
- Milestone changed from UNSCH to 28
- Phase changed from dsub to rc
- Milestone changed from 28 to 29