
CLDR Ticket #8067 (closed: fixed)

Opened 4 years ago

Last modified 4 years ago

Language Matching documentation

Reported by: mark
Owned by: mark
Component: xxx-spec
Data Locale:
Phase: final
Review: pedberg
Weeks:
Data Xpath:



We should say to canonicalize the locale IDs before processing.
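
As a rough illustration of what "canonicalize before processing" could mean at its simplest, here is a minimal sketch that only normalizes separators and subtag casing. The function name `canonicalize_casing` is hypothetical, and full canonicalization also involves replacing deprecated subtags using alias data, which is not shown here.

```python
def canonicalize_casing(locale_id: str) -> str:
    """Normalize separators and subtag casing (only one part of canonicalization)."""
    subtags = locale_id.replace("_", "-").split("-")
    result = [subtags[0].lower()]          # language subtag: lowercase
    in_extension = False
    for subtag in subtags[1:]:
        if len(subtag) == 1:               # a singleton such as "u" starts an extension
            in_extension = True
        if in_extension:
            result.append(subtag.lower())  # extension subtags: lowercase
        elif len(subtag) == 4 and subtag.isalpha():
            result.append(subtag.title())  # script: Titlecase
        elif len(subtag) == 2 and subtag.isalpha():
            result.append(subtag.upper())  # region: UPPERCASE
        else:
            result.append(subtag.lower())  # variants, numeric regions, and so on
    return "-".join(result)


print(canonicalize_casing("EN_latn_gb"))
print(canonicalize_casing("zh_tw"))
```

With this normalization in place, IDs that differ only in case or separator compare equal before any lookup is attempted.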

Also, we should rewrite http://www.unicode.org/reports/tr35/#Bundle_vs_Item_Lookup to make it clear that the recommended methodology for Bundle lookup is to use Language Matching.

It would also be clearer if we changed "Resource item lookup" to "Inherited item lookup".

Cf. an email reply:

I also want to be clear that there are two closely-related but very different tasks.

  1. Inherited item lookup. Given a CLDR resource bundle with inheritance, where do I go to get inherited items?

That is specified by CLDR by means of the parentLocale + truncation algorithm, plus the alias element. (There are a few cases where we have "Lateral Inheritance" where the specification is in the text of LDML, such as when looking for an alt variant.)
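
A minimal sketch of the parentLocale + truncation step, assuming a small hand-written table in place of the real parentLocale data. The name `inheritance_chain` and the table `EXAMPLE_PARENTS` are hypothetical, the single table entry is illustrative and should be checked against the CLDR supplemental data, and neither the alias element nor lateral inheritance is modeled.

```python
from typing import Dict, List, Optional

# Illustrative stand-in for CLDR parentLocale data (assumption: verify against
# the supplemental data before relying on any entry).
EXAMPLE_PARENTS: Dict[str, str] = {"es-MX": "es-419"}


def inheritance_chain(bundle_id: str,
                      explicit_parents: Optional[Dict[str, str]] = None) -> List[str]:
    """Return the bundles consulted, in order, ending at root."""
    parents = explicit_parents or {}
    chain = [bundle_id]
    current = bundle_id
    while current != "root":
        if current in parents:
            current = parents[current]           # explicit parent wins
        elif "-" in current:
            current = current.rsplit("-", 1)[0]  # truncate the last subtag
        else:
            current = "root"                     # a bare language falls back to root
        chain.append(current)
    return chain


print(" => ".join(inheritance_chain("en-US")))
print(" => ".join(inheritance_chain("es-MX", EXAMPLE_PARENTS)))
```

The input is assumed to be the ID of an existing bundle with any extension (such as a -u- sequence) already removed, since extensions customize services rather than select items.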

So back to Rafael's original question:

  1. en-Latn-GB and zh-TW are not CLDR bundles, so this doesn't apply to them.
  2. en-US-u-nu-usd: the u-nu-usd doesn't select within a bundle, but rather customizes a service that uses information in the bundle. The item lookup (used by the currency formatting service) would be en-US => en => root.
  2. Bundle lookup. Given a locale ID, where do I get the best matching CLDR bundle?

My application has a set of supported locales, and the user comes in with a set of desired locales. What is the best bundle for that user?

Here we are not as clear as we should be. The recommended process is in http://www.unicode.org/reports/tr35/#LanguageMatching

So back to Rafael's original question:

  1. en-Latn-GB, and zh-TW. When these are looked up with Language Matching, assuming that all the CLDR locales are available, they would return, respectively, en-GB and zh-Hant-TW.
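
One way to see why these two results come out as they do is to fill in likely subtags on both sides and compare the results. The sketch below covers only that exact-match case and is not the Language Matching algorithm itself, which also ranks near matches. The names `maximize` and `best_bundle` are hypothetical, and `LIKELY_SUBTAGS` is a tiny hand-copied excerpt standing in for the CLDR likely-subtags data.

```python
from typing import Dict, List, Optional, Tuple

# Tiny excerpt standing in for CLDR likely-subtags data (illustrative only).
LIKELY_SUBTAGS: Dict[str, str] = {
    "en": "en-Latn-US",
    "zh": "zh-Hans-CN",
    "zh-TW": "zh-Hant-TW",
}


def parse(locale_id: str) -> Tuple[str, str, str]:
    """Split a simple ID into (language, script, region); other subtags are ignored."""
    parts = locale_id.split("-")
    script = region = ""
    for part in parts[1:]:
        if len(part) == 4 and part.isalpha():
            script = part
        elif (len(part) == 2 and part.isalpha()) or (len(part) == 3 and part.isdigit()):
            region = part
    return parts[0], script, region


def maximize(locale_id: str) -> Tuple[str, str, str]:
    """Fill in a missing script or region from the likely-subtags excerpt."""
    language, script, region = parse(locale_id)
    keys = ([f"{language}-{region}"] if region else []) + [language]
    for key in keys:  # most specific key first; language-script keys are omitted here
        if key in LIKELY_SUBTAGS:
            _, likely_script, likely_region = parse(LIKELY_SUBTAGS[key])
            return language, script or likely_script, region or likely_region
    return language, script, region


def best_bundle(desired: str, supported: List[str]) -> Optional[str]:
    """Return the supported bundle whose maximized form equals that of `desired`."""
    target = maximize(desired)
    for candidate in supported:
        if maximize(candidate) == target:
            return candidate
    return None


supported = ["en", "en-GB", "zh", "zh-Hant-TW"]
print(best_bundle("en-Latn-GB", supported))
print(best_bundle("zh-TW", supported))
```

Under these assumptions en-Latn-GB and en-GB maximize to the same triple, which is the sense in which such IDs are "identical" on a deep level.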

That being said, people often don't understand language matching, and so we are in the process of adding more information so that there is a direct mapping between locale IDs that are always considered to be "identical" on a deep level, like en-GB and en-Latn-GB.


Change History

comment:1 Changed 4 years ago by mark

  • Summary changed from Language Matching to Language Matching documentation

comment:2 Changed 4 years ago by mark

  • Owner changed from anybody to mark
  • Priority changed from assess to major
  • Status changed from new to accepted
  • Component changed from unknown to spec
  • Milestone changed from UNSCH to 27

comment:3 Changed 4 years ago by mark

  • Phase changed from dsub to final

comment:4 Changed 4 years ago by mark

  • Status changed from accepted to reviewing
  • Review set to pedberg

comment:5 Changed 4 years ago by pedberg

  • Status changed from reviewing to closed
  • Resolution set to fixed
