Unicode Common Locale Data Repository: Bug Tracking

CLDR Ticket #7246 (closed enhancement: fixed)

Opened 3 years ago

Last modified 3 years ago

root collation: remove Cyrillic contractions

Reported by: markus
Owned by: markus
Component: uca
Data Locale:
Phase: rc
Review: mark
Weeks: 0.4
Data Xpath:
Xref:

Description

We suppress most of the Cyrillic contractions in most of the Cyrillic-locale collation tailorings. The contractions make the sorting of Cyrillic base letters slower.
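To illustrate why contractions cost time, here is a minimal toy sketch; it is not ICU's or CLDR's implementation. It assumes a single hypothetical contraction entry, "и" followed by U+0306 COMBINING BREVE (the canonical decomposition of "й"), and counts how often the loop has to peek at the next character. The names `CONTRACTIONS`, `STARTERS`, and `collation_elements` are invented for this example.

```python
# Toy contraction table (an assumption for illustration only):
# "и" + U+0306 COMBINING BREVE is treated as the single unit "й" (U+0439).
CONTRACTIONS = {("и", "\u0306"): "\u0439"}
STARTERS = {first for first, _ in CONTRACTIONS}


def collation_elements(text, suppressed=frozenset()):
    """Split text into the units a collator would weight.

    Returns (elements, lookaheads), where lookaheads counts how many times
    the loop had to inspect the following character because the current
    one could start a contraction.
    """
    elements = []
    lookaheads = 0
    i = 0
    while i < len(text):
        ch = text[i]
        if ch in STARTERS and ch not in suppressed:
            lookaheads += 1
            nxt = text[i + 1] if i + 1 < len(text) else ""
            if (ch, nxt) in CONTRACTIONS:
                elements.append(CONTRACTIONS[(ch, nxt)])
                i += 2
                continue
        elements.append(ch)
        i += 1
    return elements, lookaheads


plain_text = "мир и кино"  # contains the starter "и" three times
with_contractions = collation_elements(plain_text)
with_suppression = collation_elements(plain_text, suppressed={"и"})
print(with_contractions[1], with_suppression[1])
```

In this sketch every occurrence of a contraction starter triggers a lookahead, even when no combining mark follows. Suppressing the starter skips that check, at the price of no longer treating the decomposed sequence as one unit.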

I propose that we remove them from the root collation and add them to the tailorings for locales that need them. If the CLDR team agrees, I can also propose this for the DUCET. It would be much easier if the CLDR root collation did not have to diverge from the DUCET for this.

The following table lists all of the Cyrillic-script CLDR locales.

main locale   collation tailoring
az_Cyrl       missing (only Latn)
be            [АаӘәГгЕеЖжЗзІіОоӨөКкЧчЫыЭэѴѵ]
bg            [АаӘәГгЕеЖжЗзІіОоӨөКкУуЧчЫыЭэѴѵ]
bs_Cyrl       imports sr
kk            [АаӘәГгЕеЖжЗзІіОоӨөКкУуЧчЫыЭэѴѵ]
ky            empty/same as root
mk            [АаӘәЕеЖжЗзИиІіОоӨөУуЧчЫыЭэѴѵ]
mn            missing
os            missing
ru            [АаӘәГгЕеЖжЗзІіОоӨөКкУуЧчЫыЭэѴѵ]
sah           missing
sr            [АаӘәГгЕеЖжЗзИиІіОоӨөКкУуЧчЫыЭэѴѵ]
tg            missing
uk            [АаӘәГгЕеЖжЗзОоӨөКкУуЧчЫыЭэѴѵ]
uz_Cyrl       missing
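The bracketed sets in the table differ only slightly between locales. The following sketch compares each set with the ru set. It assumes the sets list the letters whose contractions each tailoring suppresses; the names `TAILORING_SETS` and `compare_to` are made up for this example.

```python
# Sets copied from the table above; locales without a set are omitted.
TAILORING_SETS = {
    "be": "АаӘәГгЕеЖжЗзІіОоӨөКкЧчЫыЭэѴѵ",
    "bg": "АаӘәГгЕеЖжЗзІіОоӨөКкУуЧчЫыЭэѴѵ",
    "kk": "АаӘәГгЕеЖжЗзІіОоӨөКкУуЧчЫыЭэѴѵ",
    "mk": "АаӘәЕеЖжЗзИиІіОоӨөУуЧчЫыЭэѴѵ",
    "ru": "АаӘәГгЕеЖжЗзІіОоӨөКкУуЧчЫыЭэѴѵ",
    "sr": "АаӘәГгЕеЖжЗзИиІіОоӨөКкУуЧчЫыЭэѴѵ",
    "uk": "АаӘәГгЕеЖжЗзОоӨөКкУуЧчЫыЭэѴѵ",
}


def compare_to(reference, sets=TAILORING_SETS):
    """For every other locale, return (absent, extra) relative to reference.

    absent: letters in the reference set but not in the locale's set.
    extra:  letters in the locale's set but not in the reference set.
    Both are sorted by code point.
    """
    ref = set(sets[reference])
    return {
        locale: (sorted(ref - set(chars)), sorted(set(chars) - ref))
        for locale, chars in sets.items()
        if locale != reference
    }


differences = compare_to("ru")
for locale, (absent, extra) in sorted(differences.items()):
    print(locale, "absent:", "".join(absent), "extra:", "".join(extra))
```

Relative to ru, be omits Уу, mk omits ГгКк and adds Ии, sr adds Ии, and uk omits Іі; bg and kk are identical to ru.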

Change History

comment:1 Changed 3 years ago by emmons

  • Owner changed from anybody to markus
  • Priority changed from assess to major
  • Status changed from new to assigned
  • Milestone changed from UNSCH to 26rc

comment:2 Changed 3 years ago by emmons

The CLDR Committee has approved the concept and is in favor of this. Markus is requested to make a proposal to the UTC.

comment:3 Changed 3 years ago by srl

Should there be a followup ticket to evaluate the missing/empty collations?

comment:4 Changed 3 years ago by markus

  • Milestone changed from 26rc to 27rc

comment:5 Changed 3 years ago by markus

  • Phase set to rc
  • Milestone changed from 27rc to 27

comment:6 Changed 3 years ago by markus

  • Status changed from assigned to reviewing

comment:7 Changed 3 years ago by markus

  • Status changed from reviewing to accepted

comment:8 Changed 3 years ago by markus

New root collation data based on initial UCA 8 DUCET which only removes most of the Cyrillic contractions.

Tailorings adjusted.

Kyrgyz, which had an empty file, now has a tailoring, based on Wikipedia and a discussion with a native speaker, Tilek Mamutov (Google). Sample Kyrgyz list of strings showing ё primary-after е:

арбуз
ермак
ёлка
живот
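The effect of sorting ё primary-after е can be sketched with a toy sort key. This is not the CLDR tailoring itself or a real collator: it assumes a simplified lowercase alphabet and models only two behaviours, ё as a secondary variant of е versus ё as its own letter directly after е. The names `BASE`, `TAILORED`, `root_like_key`, and `tailored_key` are hypothetical.

```python
import unicodedata

# Simplified lowercase alphabets (an assumption for illustration only).
BASE = "абвгдежзийклмнопрстуфхцчшщъыьэюя"       # no separate slot for ё
TAILORED = "абвгдеёжзийклмнопрстуфхцчшщъыьэюя"  # ё is a letter after е


def root_like_key(word):
    """ё has the same primary weight as е; the difference is only secondary."""
    word = unicodedata.normalize("NFC", word)
    primary = tuple(BASE.index("е" if ch == "ё" else ch) for ch in word)
    secondary = tuple(1 if ch == "ё" else 0 for ch in word)
    return (primary, secondary)


def tailored_key(word):
    """ё has its own primary weight, immediately after е."""
    word = unicodedata.normalize("NFC", word)
    return tuple(TAILORED.index(ch) for ch in word)


words = ["живот", "ёлка", "арбуз", "ермак"]
root_order = sorted(words, key=root_like_key)
tailored_order = sorted(words, key=tailored_key)
print(root_order)
print(tailored_order)
```

With the secondary-difference key, ёлка sorts before ермак because л precedes р once ё and е compare equal at the primary level. With ё as a separate primary letter, the order matches the sample list above.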

comment:9 Changed 3 years ago by markus

  • Status changed from accepted to reviewing
  • Review set to emmons

comment:11 Changed 3 years ago by markus

  • Review changed from emmons to mark

comment:12 Changed 3 years ago by markus

Integrated into trunk, and corresponding changes are in the ICU trunks as well: IcuBug:11375

comment:13 Changed 3 years ago by mark

  • Status changed from reviewing to closed
  • Resolution set to fixed