
CLDR Ticket #5089 (closed defect: fixed)

Opened 3 years ago

Last modified 21 months ago

UTS 35 unsuited as a reference for BCP 47 extensions

Reported by: norbert
Owned by: mark
Component: xxx-spec
Review: yoshito
Xref: ticket:5101

Description

My perspective is that of the editor of a specification that normatively relies on BCP 47, including the Unicode Locale Extension, but cannot normatively rely on CLDR because some implementors of the specification just can't use CLDR.

From this perspective, UTS 35 is a total mess.

Start with the entry point: RFC 6067, BCP 47 Extension U, points to section 3 of UTS 35. That section is entitled “Unicode Language and Locale Identifiers”. Unicode language and locale identifiers are not BCP 47 language tags; they are traditional ICU locale identifiers with some BCP 47-style enhancements. The section serves primarily to define Unicode identifiers, with various annotations on how they are similar to or different from BCP 47 language tags.

In the midst of this section are two parts that actually seem relevant to RFC 6067:

1) The key/type definitions table. But again, this table is an amalgamation of information relevant to RFC 6067 (references to XML files in the bcp47 directory, keys, types), information relevant only to Unicode identifiers (old key names, old type names), and information whose status is unclear (references to various other sections of UTS 35).

2) Subsection 3.2.1, which defines the canonicalization of Unicode locale extension sequences.

Both of these reference Appendix Q, Unicode BCP 47 Extension Data, which is clearly also relevant to BCP 47, except for the alias attributes, which belong to the world of Unicode locale identifiers and add no value to BCP 47.
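
To make concrete what canonicalizing a Unicode locale extension sequence involves, here is a minimal Python sketch. It is an illustration under stated assumptions, not text from UTS 35 or RFC 6067: it assumes canonicalization means lowercasing the extension subtags and sorting keywords alphabetically by key, it leaves attributes in their original order, and it does not validate keys or types against the bcp47 data files. The function name canonicalize_u_extension is hypothetical.

```python
def canonicalize_u_extension(tag):
    """Lowercase the -u- extension of a language tag and sort its keywords by key.

    Assumption: attributes (3-8 character subtags before the first key) keep
    their original order. Subtags outside the extension are left untouched.
    """
    subtags = tag.split("-")
    lowered = [s.lower() for s in subtags]

    # Find the "u" singleton; anything after an "x" singleton is private use.
    start = None
    for i in range(1, len(lowered)):
        if lowered[i] == "x":
            break
        if lowered[i] == "u":
            start = i
            break
    if start is None:
        return tag  # no Unicode locale extension present

    # The extension ends at the next single-character subtag, or at the end.
    end = len(lowered)
    for i in range(start + 1, len(lowered)):
        if len(lowered[i]) == 1:
            end = i
            break

    attributes = []
    keywords = []  # list of (key, [type subtags])
    for subtag in lowered[start + 1:end]:
        if len(subtag) == 2:
            keywords.append((subtag, []))       # a two-character key starts a keyword
        elif keywords:
            keywords[-1][1].append(subtag)      # type subtag of the current key
        else:
            attributes.append(subtag)           # attribute before any key

    keywords.sort(key=lambda kw: kw[0])         # stable sort by key

    rebuilt = ["u"] + attributes
    for key, types in keywords:
        rebuilt.append(key)
        rebuilt.extend(types)
    return "-".join(subtags[:start] + rebuilt + subtags[end:])


print(canonicalize_u_extension("de-DE-u-co-phonebk-CA-gregory"))
```

Note that nothing in this sketch needs the alias attributes or the old key and type names; those only matter when converting to or from the older Unicode locale identifier syntax.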

Maybe there is more information relevant to the BCP 47 Unicode Locale Extension that I missed. I wouldn't be surprised.

I think it was a mistake to use UTS 35 as the specification for the two BCP 47 extensions; two entirely separate documents should have been created to define them. But splitting UTS 35 into three separate documents now would require updating two RFCs, which is probably more pain than it’s worth.

Therefore, I’d like to propose to restructure section 3 as follows:

  • Start with an explanatory statement: “This section consists of three parts: Subsection 3.1 specifies the subtags of the BCP 47 Extension U, Unicode Locale (RFC 6067). Subsection 3.2 specifies the subtags of the BCP 47 Extension T, Transformed Content (RFC 6497). Subsection 3.3 specifies the Unicode language and locale identifiers used in CLDR, as well as their relationship to BCP 47.”
  • Then reorganize the section into the three subsections as described above. Start each subsection again with a statement “This subsection specifies …”.
  • Move the key/type definitions table into subsection 3.1, but remove the old key names and old type names, as well as any other material that’s not relevant for BCP 47. Basically, the table should say what keys and types mean, but not how one might implement them using CLDR.
  • Create a new table in subsection 3.3 that maps between BCP 47 extension subtags and their old Unicode locale identifier equivalents.
  • Move the content of subsection 3.2.1 into the new subsections 3.1 and 3.2 as appropriate.
  • Include normative references to Appendix Q in subsections 3.1 and 3.2.
  • Remove the alias attributes from the bcp47 files.
  • Move any other information related to BCP 47 extensions U and T, except that in Appendix Q, into subsections 3.1 and 3.2.

Change History

comment:1 Changed 3 years ago by mark

  • Owner changed from anybody to mark
  • Priority changed from assess to major
  • Status changed from new to assigned
  • Milestone changed from UNSCH to 22

We are planning a major reorganization of the spec (with migration section to maintain old links). These spec changes sound good for that. If you're interested, we can talk offline.

comment:2 Changed 3 years ago by mark

  • Xref set to 5101
  • Milestone changed from 22 to 22.1

comment:3 Changed 3 years ago by mark

  • Milestone changed from 22.1 to 23dsub

We did not have time during the release to do the major reorganization that we'd planned, and are moving that to the first part of the next cycle (2012Q4).

comment:4 Changed 2 years ago by emmons

  • Milestone changed from 23dsub to 23dres

comment:5 Changed 2 years ago by mark

  • Milestone changed from 23dres to 23

comment:6 Changed 2 years ago by mark

  • Milestone changed from 23 to 23aux

While the situation should have improved with the reorganization of what is now Part 1, there is further work to do, so I am leaving this open for further fixes.

comment:7 Changed 2 years ago by emmons

  • Milestone changed from 23aux to 24

comment:8 Changed 21 months ago by mark

  • Review set to yoshito

Everything that is necessary for reference should be in Section 3.7:

http://unicode.org/repos/cldr/trunk/specs/ldml/tr35.html#Locale_Extension_Key_and_Type_Data

Also fixed http://cldr.unicode.org/index/bcp47-extension

If there are any further changes that need to be made, please file a separate ticket.

comment:9 Changed 21 months ago by yoshito

  • Status changed from assigned to closed
  • Resolution set to fixed

We may still want to improve the organization, but for now I'll close this ticket.
