CLDR Ticket #4177 (closed defect: fixed)

Opened 4 years ago

Last modified 5 months ago

Non-standard numbering system types taml and tamldec

Reported by: norbert
Owned by: mark
Component: xxx-spec
Data Locale:
Phase:
Review: emmons
Weeks: 0.05
Data Xpath:
Xref: ticket:4437

Description

The description of numbering systems in UTR 35 says "Four-letter types indicate the decimal numbering system using digits [:GeneralCategory=Nd:] for the script represented in Unicode."

This means that "taml" should indicate the decimal numbering system using Tamil digits, while any non-decimal numbering system for Tamil should use a different type value.

However, further down in the same table, "tamldec" is introduced as "modern Tamil decimal digits", and the CLDR file supplemental/numberingSystems.xml defines "taml" as an algorithmic (non-decimal) numbering system. This seems the opposite of what it should be.
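
To make the distinction concrete, here is a minimal sketch (not part of CLDR or UTR 35) that checks whether a set of characters qualifies as decimal digits in the [:GeneralCategory=Nd:] sense. The helper name is_decimal_digit_set is made up for illustration; it relies only on Python's standard unicodedata module.

```python
import unicodedata


def is_decimal_digit_set(digits: str) -> bool:
    """Return True if `digits` is ten General_Category=Nd characters valued 0..9 in order.

    Hypothetical helper for illustration; not an API defined by CLDR or LDML.
    """
    return (
        len(digits) == 10
        and all(unicodedata.category(ch) == "Nd" for ch in digits)
        and [unicodedata.digit(ch) for ch in digits] == list(range(10))
    )


# Modern Tamil decimal digits, U+0BE6..U+0BEF.
TAMIL_DIGITS = "".join(chr(cp) for cp in range(0x0BE6, 0x0BF0))

# TAMIL NUMBER TEN (U+0BF0), a character used by the traditional,
# non-positional Tamil notation.
TAMIL_TEN = "\u0BF0"

print(is_decimal_digit_set(TAMIL_DIGITS))
print(unicodedata.category(TAMIL_TEN))
```

The Tamil digits pass the check, while a character such as TAMIL NUMBER TEN is not in Nd, which is why a system built on it cannot be a plain decimal digit substitution.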

Change History

comment:1 Changed 4 years ago by norbert

  • Cc cldrbugs@… added

comment:2 Changed 4 years ago by mark

  • Owner changed from somebody to mark
  • Priority changed from assess to major
  • Status changed from new to assigned
  • Milestone changed from UNSCH to 21

Four-letter types indicate the decimal numbering system using digits [:GeneralCategory=Nd:] for the script represented in Unicode.
=>
Four-letter types indicate the primary numbering system for the corresponding script represented in Unicode. Unless otherwise specified, it is a decimal numbering system using digits [:GeneralCategory=Nd:].

And document the other cases (hebr, etc.)

comment:3 Changed 4 years ago by norbert

The proposed new wording does not help other specifications that need to be able to determine whether a type represents a decimal or algorithmic numbering system. For the NumberFormat.prototype.format method in the ECMAScript Globalization API spec, we'd like to specify the behavior of a numbering system at least for decimal numbering systems. However, with the proposed new wording the ES-G spec has to list the types of decimal numbering systems explicitly, and cannot predict whether a new type would be decimal or algorithmic.
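
For illustration, the behavior that can be specified for a decimal numbering system is essentially a one-to-one digit substitution. The following is a rough sketch under that assumption, not the algorithm from the ES-G spec; the function name to_numbering_system is hypothetical.

```python
ASCII_DIGITS = "0123456789"


def to_numbering_system(formatted: str, digits: str) -> str:
    """Replace ASCII digits in an already formatted number with `digits`.

    `digits` must hold the ten digits of a decimal numbering system, in
    order from zero to nine. Other characters (separators, signs) are
    left untouched. Hypothetical helper for illustration only.
    """
    if len(digits) != 10:
        raise ValueError("a decimal numbering system needs exactly ten digits")
    return formatted.translate(str.maketrans(ASCII_DIGITS, digits))


# Modern Tamil decimal digits, U+0BE6..U+0BEF.
TAMIL_DIGITS = "".join(chr(cp) for cp in range(0x0BE6, 0x0BF0))

print(to_numbering_system("1,234.5", TAMIL_DIGITS))
```

An algorithmic system has no such table, so a consumer has to know in advance which kind of system a given type denotes.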

Consistent use of distinct naming patterns for decimal and non-decimal numbering systems would avoid the need for a list of decimal numbering systems in the ES-G spec. Reserving four-letter types for decimal systems would go a long way, although there also needs to be a pattern for additional decimal numbering systems within a script (e.g., mymr has both Myanmar and Shan digits).

The current version of the ECMAScript Globalization API spec can be found at
http://wiki.ecmascript.org/doku.php?id=globalization:specification_drafts

comment:4 Changed 4 years ago by norbert

The dependent bug against the ECMAScript Globalization API spec:
https://bugs.ecmascript.org/show_bug.cgi?id=227

comment:5 Changed 4 years ago by mark

  • Keywords google added

comment:6 Changed 4 years ago by mark

  • Weeks set to 0.05

comment:7 Changed 4 years ago by mark

  • Review set to yoshito

cldrbug 4177: modified the doc, but also added a test, and modified the number.xml file

comment:8 Changed 4 years ago by mark

  • Cc yoshito added
  • Review changed from yoshito to emmons

comment:9 Changed 3 years ago by emmons

  • Status changed from assigned to closed
  • Resolution set to fixed

comment:10 Changed 3 years ago by norbert

  • Status changed from closed to reopened
  • Resolution fixed deleted

This fix doesn't really address the needs of the ECMAScript Globalization API spec. That spec needs a clear specification on the LDML side that lets its implementors and users easily determine which numbering system codes represent systems that are decimal and use simple digit mappings, and which ones are algorithmic.

The "unless otherwise specified" basically leaves everything open. It doesn't say where and how things would be "otherwise specified".

Looking at the current lists of numbering systems in common/bcp47/number.xml and common/supplemental/numberingSystems.xml, it seems clear that nothing can be derived from the length of the numbering system name: there are several 4-letter codes for algorithmic systems as well as several longer codes for decimal systems, and both groups have grown in CLDR 21.

The file common/supplemental/numberingSystems.xml has all the information that the ECMAScript Globalization API spec would need: its type attribute clearly states whether a numbering system is decimal or algorithmic, and the digits attribute lists the digits for the decimal systems. Unfortunately, this file is just CLDR data; it's not part of the BCP 47 spec, and has no stability guarantee.
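
As a rough sketch of how a consumer could use that data, the snippet below splits numbering systems into decimal and algorithmic ones. The embedded XML is a hand-written fragment modeled on numberingSystems.xml, not a copy of the file; the type value "numeric", the rules attribute, and the function name load_numbering_systems are assumptions made for this example and should be checked against the actual data.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment only (assumed shape, see the note above).
# The Tamil digits U+0BE6..U+0BEF are written as character references.
SAMPLE = """\
<supplementalData>
  <numberingSystems>
    <numberingSystem id="latn" type="numeric" digits="0123456789"/>
    <numberingSystem id="tamldec" type="numeric"
        digits="&#x0BE6;&#x0BE7;&#x0BE8;&#x0BE9;&#x0BEA;&#x0BEB;&#x0BEC;&#x0BED;&#x0BEE;&#x0BEF;"/>
    <numberingSystem id="taml" type="algorithmic" rules="tamil"/>
    <numberingSystem id="hebr" type="algorithmic" rules="hebrew"/>
  </numberingSystems>
</supplementalData>
"""


def load_numbering_systems(xml_text: str):
    """Return (decimal, algorithmic): a dict of id -> digits and a set of ids."""
    decimal = {}
    algorithmic = set()
    for element in ET.fromstring(xml_text).iter("numberingSystem"):
        system_id = element.get("id")
        if element.get("type") == "numeric":
            decimal[system_id] = element.get("digits")
        else:
            algorithmic.add(system_id)
    return decimal, algorithmic


decimal, algorithmic = load_numbering_systems(SAMPLE)
print(sorted(decimal))
print(sorted(algorithmic))
```

In this fragment the four-letter code taml lands in the algorithmic set and the longer code tamldec in the decimal one, which is the situation this ticket describes.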

The file common/bcp47/number.xml is part of the BCP 47 specification because it's referenced by section 3 of the LDML spec, which in turn is referenced by RFC 6067. However, it doesn't have the type and digits attributes of numberingSystems.xml. Instead, it has comments in the description attributes of some elements; these comments aren't specified anywhere that I can see, and they're not on all elements, so they're not really helpful.

The right solution for the ECMAScript Globalization API spec would be to add the type and digits attributes of numberingSystems.xml to common/bcp47/number.xml and document them as part of the spec for "nu" in the Key/Type Definitions table, or in a section referenced from there.

comment:11 Changed 3 years ago by emmons

  • Status changed from reopened to closed
  • Resolution set to fixed

Norbert, please split off a new ticket for any additional considerations after release 21. This one has to be closed because it has commits against version 21.

comment:12 Changed 5 months ago by srl

  • Xref set to 4437