Unicode Common Locale Data Repository: Bug Tracking

CLDR Ticket #9771 (accepted data)

Opened 19 months ago

Last modified 4 weeks ago

collation index: add beyond-last-reordered-CJK index labels

Reported by: markus Owned by: markus
Component: collation Data Locale: zh
Phase: rc Review:
Weeks: 0.2 Data Xpath:


Most CJK collation tailorings reorder only a subset of the Han ideographs, to limit the data size. Any not-reordered ideograph sorts after the last reordered one.

In a CJK collation index, a not-reordered ideograph shows up in the last CJK index bucket, which is misleading. For example, with the short Chinese stroke order, the two-stroke ideograph 㐅=U+3405 lands in the 48劃 bucket because the short-stroke Han tailoring ends with

               <'\uFDD0\u2830' # INDEX 48
               <*龘 # 48
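The effect can be reproduced with a toy model. The sketch below is not ICU's index implementation. It assumes a tiny hypothetical tailoring of four ideographs, that tailored characters sort in rule order, that untailored ideographs sort after all of them by code point, and that each bucket starts at its first character (the real rules use the '\uFDD0...' index characters as boundaries).

```python
from bisect import bisect_right

# Hypothetical, heavily truncated tailoring: (index label, ideographs in rule order).
# The real short-stroke tailoring has many more labels and characters.
TAILORING = [
    ("1劃", "一"),
    ("2劃", "二人"),
    ("48劃", "龘"),
]

# Rule order of every tailored character.
ORDER = {ch: i for i, ch in enumerate(c for _, chars in TAILORING for c in chars)}

def sort_key(ch):
    """Tailored characters sort first, in rule order; others follow by code point."""
    if ch in ORDER:
        return (0, ORDER[ch])
    return (1, ord(ch))

# In this model each bucket starts at the key of its first character.
LABELS = [label for label, _ in TAILORING]
BOUNDARIES = [sort_key(chars[0]) for _, chars in TAILORING]

def bucket(ch):
    """Return the label of the last bucket whose boundary is <= the character's key."""
    return LABELS[bisect_right(BOUNDARIES, sort_key(ch)) - 1]

print(bucket("人"))       # tailored two-stroke ideograph -> 2劃
print(bucket("\u3405"))   # 㐅, two strokes but not tailored -> 48劃
```

Because nothing follows the 48劃 boundary, every untailored ideograph ends up in that bucket regardless of its actual stroke count.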

I suggest that we add another index label at the end of each of the Han tailorings, maybe something like "?劃".
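A minimal sketch of how such a trailing label would change bucket assignment, under the same kind of toy model: the tailoring data and the helper `make_bucket_lookup` are hypothetical, and the sketch says nothing about whether ICU handles such a label out of the box.

```python
from bisect import bisect_right

def make_bucket_lookup(tailoring, overflow_label=None):
    """Build a char -> index label lookup for a toy stroke tailoring.

    tailoring: list of (label, ideographs in rule order); hypothetical data.
    overflow_label: optional label for ideographs the tailoring does not reorder.
    """
    order = {}
    labels, starts = [], []
    for label, chars in tailoring:
        labels.append(label)
        starts.append((0, len(order)))      # bucket starts at its first character
        for ch in chars:
            order[ch] = len(order)
    if overflow_label is not None:
        labels.append(overflow_label)
        starts.append((1, 0))               # sorts after every tailored character

    def lookup(ch):
        # Tailored characters sort in rule order; all others follow by code point.
        key = (0, order[ch]) if ch in order else (1, ord(ch))
        return labels[bisect_right(starts, key) - 1]

    return lookup

TOY = [("1劃", "一"), ("2劃", "二人"), ("48劃", "龘")]

without_label = make_bucket_lookup(TOY)
with_label = make_bucket_lookup(TOY, overflow_label="?劃")

print(without_label("\u3405"))  # 㐅 falls into the last stroke bucket, 48劃
print(with_label("\u3405"))     # 㐅 falls into the trailing bucket, ?劃
print(with_label("龘"))          # tailored characters are unaffected, 48劃
```

In this model the extra label only needs to sort after the last reordered ideograph; no per-character data is required for the untailored ideographs.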

I have not tested whether this would work out of the box with ICU, and I am open to other ideas.



Change History

comment:1 Changed 18 months ago by mark

  • Owner changed from anybody to markus
  • Priority changed from assess to major
  • Status changed from new to accepted
  • Milestone changed from UNSCH to 30

comment:2 Changed 18 months ago by pedberg

  • Milestone changed from 30 to 31

Markus says "Old problem, and don’t have data to put there, should look at this for CLDR 31". Moving to 31.

comment:3 Changed 13 months ago by markus

  • Milestone changed from 31 to 32

comment:4 Changed 6 months ago by markus

  • Keywords punt32 added

comment:5 Changed 6 months ago by markus

  • Milestone changed from 32 to 33

comment:6 Changed 4 weeks ago by markus

  • Keywords punt33 added
  • Milestone changed from 33 to 34
