Unicode Common Locale Data Repository: Bug Tracking

CLDR Ticket #8137 (accepted)

Opened 4 years ago

Last modified 5 months ago

Better mechanism for handling short zone abbreviations

Reported by: mark
Owned by: emmons
Component: to-assess
Data Locale:
Phase: rc
Review:
Weeks:
Data Xpath:

Description (last modified by mark)

It has become infeasible to handle short zone abbreviations the way we have been, because that approach involves suppressing the parent list, and adding other lists, in a great many regional locales for "world" languages such as English and Spanish that have wide geographical distribution.

That involves duplicated effort and is inevitably fragile.

I suggest the following mechanism for handling them instead.

In the parent locale, add all the abbreviations that are reasonable for the language in any of the sublocales for zones and metazones. So, en would contain:

			<zone type="Pacific/Honolulu">
			<zone type="Europe/London">
			<metazone type="Europe_Central">

Add a new element to <timeZoneNames>, namely:

<retainShortNames regions="US CA">

The interpretation is:

  1. Get the union of all the zones for the regions, based on the tzdata*. In this case, that would include the zones for the US and for Canada, e.g. Pacific/Honolulu, America/Los_Angeles, etc.
  2. Suppress any zone short names outside of that list.
  3. Get the union of all the metazones for the regions. In this case, that would include Alaska all the way to Atlantic.
  4. Suppress any metazone short names outside of that list.
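
The four steps above can be sketched roughly as follows. This is only an illustration, not an implementation from the ticket: the function name `retain_short_names`, the two lookup tables, and the short-name values are all assumptions made for the example, and the tables are small hand-written subsets standing in for data that would really come from tzdata and the CLDR metazone mappings.

```python
# Illustrative subsets only (assumed data); real values would be derived
# from tzdata and from the CLDR metazone mappings.
ZONES_BY_REGION = {
    "US": {"Pacific/Honolulu", "America/Los_Angeles", "America/New_York"},
    "CA": {"America/Vancouver", "America/Toronto", "America/Halifax"},
    "GB": {"Europe/London"},
}
METAZONES_BY_REGION = {
    "US": {"Hawaii_Aleutian", "Alaska", "America_Pacific",
           "America_Mountain", "America_Central", "America_Eastern"},
    "CA": {"America_Pacific", "America_Mountain", "America_Central",
           "America_Eastern", "Atlantic"},
    "GB": {"GMT"},
}


def retain_short_names(regions, zone_short, metazone_short):
    """Hypothetical helper: keep only short names usable in `regions`."""
    # Steps 1 and 3: union of zones and of metazones over all listed regions.
    kept_zones = set().union(*(ZONES_BY_REGION.get(r, set()) for r in regions))
    kept_meta = set().union(*(METAZONES_BY_REGION.get(r, set()) for r in regions))
    # Steps 2 and 4: suppress short names outside those unions.
    zones = {z: n for z, n in zone_short.items() if z in kept_zones}
    metas = {m: n for m, n in metazone_short.items() if m in kept_meta}
    return zones, metas


# Short names as they might appear in the parent locale (illustrative values).
parent_zone_short = {"Pacific/Honolulu": "HST", "Europe/London": "BST"}
parent_metazone_short = {"Europe_Central": "CET", "America_Pacific": "PT"}

zones, metas = retain_short_names(["US", "CA"], parent_zone_short,
                                  parent_metazone_short)
print(zones)  # only the Honolulu entry survives
print(metas)  # only the America_Pacific entry survives
```

With regions="US CA", the Europe/London and Europe_Central short names are suppressed, while the entries that fall inside the US and Canada unions are kept.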

Note that the list could contain a macroregion, like 154 for Northern Europe. In that case, use the containment relations to get the specific regions before generating the unions.
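
The macroregion expansion could look something like the sketch below. The `CONTAINMENT` table is a hand-written subset used only for illustration (the real containment data would come from the CLDR supplemental data), and `expand_regions` is a hypothetical name.

```python
# Assumed, partial containment data for illustration only.
CONTAINMENT = {
    "150": ["154", "155"],                              # Europe (subset)
    "154": ["DK", "FI", "GB", "IE", "IS", "NO", "SE"],  # Northern Europe (subset)
    "155": ["AT", "BE", "CH", "DE", "FR", "NL"],        # Western Europe (subset)
}


def expand_regions(codes):
    """Replace each macroregion with the specific regions it contains."""
    result = []
    for code in codes:
        children = CONTAINMENT.get(code)
        # A code with no containment entry is treated as a specific region.
        leaves = [code] if children is None else expand_regions(children)
        for leaf in leaves:
            if leaf not in result:  # keep first-seen order, drop duplicates
                result.append(leaf)
    return result


print(expand_regions(["154", "US"]))
```

The expanded list is what would then feed the zone and metazone unions.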

  * Using <mapTimezones type="metazones">.

Also, we can always add the region for this locale (e.g. CA for en_CA) automatically, so in the above case we wouldn't actually need to list CA explicitly. If the locale has no explicit region, we add the region of its default content locale, e.g. for es we add ES.
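
As a rough sketch of the automatic region rule: the helper names and the `DEFAULT_CONTENT` table below are assumptions for the example, and the sketch only recognizes two-letter region subtags, since the description does not say how a locale such as en_001 should be treated.

```python
import re

# Assumed, partial default-content mapping for illustration only.
DEFAULT_CONTENT = {"en": "en_US", "es": "es_ES"}

_REGION = re.compile(r"^[A-Z]{2}$")  # two-letter region subtags only


def locale_region(locale):
    """Return the explicit two-letter region of a locale ID, or None."""
    for subtag in locale.split("_")[1:]:
        if _REGION.match(subtag):
            return subtag
    return None


def effective_regions(locale, explicit):
    """Explicit regions plus the locale's own (or default content) region."""
    region = locale_region(locale)
    if region is None:
        # No explicit region: fall back to the default content locale.
        region = locale_region(DEFAULT_CONTENT.get(locale, ""))
    regions = list(explicit)
    if region is not None and region not in regions:
        regions.append(region)
    return regions


print(effective_regions("en_CA", ["US"]))
print(effective_regions("es", []))
```

Under these assumptions, en_CA with regions="US" ends up with both US and CA, and es with an empty list picks up ES.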

With this new element, we can avoid duplicating data across regions, and it becomes simple to add or change the short names that are in use. We wouldn't need any retainShortNames elements for languages whose regions are clustered, like de or da.


We could have 2 special values intended for inheritance that could be included in the regions attribute value:

  1. continent
  2. subcontinent

Those would indicate all the regions in that particular locale's continent or subcontinent, respectively. Thus, if en_001 had <retainShortNames regions="subcontinent GB">, then for en_IN that would automatically produce 034 + GB, which would expand to "AF BD BT IN IR LK MV NP PK GB".
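
The en_IN example can be traced with a small sketch. The containment table is a subset covering only Southern Asia, the function names are hypothetical, and the code assumes a simple two-level continent/subcontinent hierarchy, which may not hold for every region in the real containment data.

```python
# Assumed, partial containment data for illustration only.
CONTAINMENT = {
    "142": ["034"],  # Asia (only Southern Asia listed here)
    "034": ["AF", "BD", "BT", "IN", "IR", "LK", "MV", "NP", "PK"],
}


def parent_of(code):
    """Return the group that directly contains `code`, or None."""
    for group, members in CONTAINMENT.items():
        if code in members:
            return group
    return None


def leaves(code):
    """Expand a group into its specific regions; a region maps to itself."""
    members = CONTAINMENT.get(code)
    if members is None:
        return [code]
    out = []
    for member in members:
        out.extend(leaves(member))
    return out


def resolve_regions(regions_attr, locale_region):
    """Resolve 'continent'/'subcontinent' relative to the locale's region."""
    sub = parent_of(locale_region)
    cont = parent_of(sub) if sub is not None else None
    special = {"subcontinent": sub, "continent": cont}
    out = []
    for token in regions_attr.split():
        code = special.get(token, token)
        if code is None:  # special value with no known group: skip it
            continue
        for leaf in leaves(code):
            if leaf not in out:
                out.append(leaf)
    return out


print(" ".join(resolve_regions("subcontinent GB", "IN")))
# AF BD BT IN IR LK MV NP PK GB
```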


Change History

comment:1 Changed 4 years ago by mark

  • Description modified

comment:2 Changed 4 years ago by emmons

  • Owner changed from anybody to emmons
  • Priority changed from assess to major
  • Status changed from new to assigned
  • Component changed from unknown to data-main
  • Milestone changed from UNSCH to 28

comment:3 Changed 4 years ago by emmons

  • Phase changed from dsub to rc
  • Milestone changed from 28 to 29

Pushing to 29 - and I'm still not entirely sure I'm OK with the proposal, as it seems a bit more complicated than I would like.

comment:4 Changed 4 years ago by markus

  • type set to data

comment:5 Changed 4 years ago by srl

  • Status changed from assigned to accepted

comment:6 Changed 4 years ago by emmons

  • Milestone changed from 29 to upcoming

Auto move of all 29 -> upcoming

comment:7 Changed 2 years ago by emmons

  • Milestone changed from upcoming to UNSCH

comment:8 Changed 6 months ago by mark

  • Component changed from main to other

comment:9 Changed 5 months ago by mark

  • Component changed from other to to-assess
