CLDR Ticket #5546 (accepted data)
follow DUCET with other numbers among symbols
Reported by: markus; Owned by: markus
I propose that we follow the DUCET in how we order "other numbers". That is, I propose we stop reordering them from the symbol group into the digit group.
Compared with the DUCET, "CLDR groups the numbers together after currency symbols, instead of splitting them with some before and some after." (see the LDML spec).
There are about 200 "other number" characters that CLDR modifies, for example U+0BF0 ௰ Tamil Number Ten and U+2180 ↀ Roman Numeral One Thousand C D. On the DUCET symbol chart they are the characters from U+09F4 to U+1D371.
CLDR sorts all of these in the "digit" reordering group, just before digit 0. They do not sort in the order of numeric values, they are not digits, and they do not decompose to digits.
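The property claims above can be spot-checked for the two example characters. This is a minimal sketch using Python's standard `unicodedata` module. The helper names `describe` and `is_other_number` are made up for illustration, and treating "other number" as General_Category No or Nl is my assumption, not a definition taken from the CLDR tooling. Results depend on the Unicode version bundled with the Python build.

```python
import unicodedata


def describe(ch):
    """Collect the Unicode properties relevant to the "other number" discussion."""
    return {
        "code_point": f"U+{ord(ch):04X}",
        "name": unicodedata.name(ch),
        "general_category": unicodedata.category(ch),
        # None when the character has no decimal-digit value (i.e. it is not a digit).
        "decimal": unicodedata.decimal(ch, None),
        # Numeric_Value as a float, or None when there is none.
        "numeric": unicodedata.numeric(ch, None),
        # Empty string when the character has no decomposition mapping.
        "decomposition": unicodedata.decomposition(ch),
    }


def is_other_number(ch):
    # Assumption: "other number" = General_Category Number but not Nd,
    # i.e. No (Other_Number) or Nl (Letter_Number).
    return unicodedata.category(ch) in ("No", "Nl")


if __name__ == "__main__":
    for ch in ("\u0BF0", "\u2180"):  # Tamil Number Ten, Roman Numeral One Thousand C D
        print(describe(ch), is_other_number(ch))
```

Both characters have a numeric value but no decimal-digit value and no decomposition mapping, which matches the statement that they are neither digits nor decomposable to digits.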
With numeric sorting on, the computed primary weights for numeric sorting are placed at the beginning of the digit group, as we defined in LDML 22. As a result, the "other number" characters sort between the digits-as-numbers and the compatibility digits.
The current reordering puts all characters with General_Category=Number together, but I do not see that this order is better, in any practical sense, than their DUCET order.
I think it is desirable to reduce the difference between the DUCET and the CLDR root, to reduce surprises for users and to reduce our tooling and documentation burden.
- Owner changed from anybody to markus
- Status changed from new to assigned
- Data Locale set to root
- Type changed from enhancement to data
- Component changed from uca to collation