[Unicode] Common Locale Data Repository: Bug Tracking

CLDR Ticket #10801 (accepted data)

Opened 7 weeks ago

Last modified 7 weeks ago

Lithuanian: Inconsistency between collation rules and <exemplarCharacters type="index">

Reported by: maiku.fabian@…
Owned by: markus
Component: unknown
Data Locale:
Phase: dsub
Review:
Weeks:
Data Xpath:
Xref:

Description

https://unicode.org/cldr/trac/browser/trunk/common/collation/lt.xml

contains:

&̀=̇̀
&́=̇́
&̃=̇̃
&A<<ą<<<Ą
&C<č<<<Č
&E<<ę<<<Ę<<ė<<<Ė
&I<<į<<<Į<<y<<<Y
&S<š<<<Š
&U<<ų<<<Ų<<ū<<<Ū
&Z<ž<<<Ž

and

https://unicode.org/cldr/trac/browser/trunk/common/main/lt.xml

contains:

<exemplarCharacters type="index">[A Ą B C Č D E Ę Ė F G H I Į Y J K L M N O P R S Š T U Ų Ū V Z Ž]</exemplarCharacters>

I am surprised that the characters Ą, Ę, Ė, Į, Y, Ų, and Ū differ from the
characters A, E, I, and U only at the secondary (accent) level in the
collation rules, yet each has its own index bucket.

This seems inconsistent to me.

If Ą differs from A only at the secondary level, it should not have its own
index bucket. So either the index bucket should be removed, or the collation
rule should be changed to give Ą a primary difference from A:

&A<ą<<<Ą
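
To make the mismatch concrete, here is a minimal sketch that reads the tailoring lines quoted above and lists the index labels that are never separated from their reset character by a primary operator (`<`). It assumes the simplified one-rule-per-line syntax shown in this ticket and skips the three combining-mark rules; the function names are made up for this illustration, and it is not how CLDR or ICU actually process collation rules.

```python
import re
import unicodedata

# Tailoring lines copied from this ticket (combining-mark rules left out).
RULES = [
    "&A<<ą<<<Ą",
    "&C<č<<<Č",
    "&E<<ę<<<Ę<<ė<<<Ė",
    "&I<<į<<<Į<<y<<<Y",
    "&S<š<<<Š",
    "&U<<ų<<<Ų<<ū<<<Ū",
    "&Z<ž<<<Ž",
]

# Index exemplar characters copied from this ticket.
INDEX = "A Ą B C Č D E Ę Ė F G H I Į Y J K L M N O P R S Š T U Ų Ū V Z Ž".split()


def non_primary_variants(rules):
    """Map each tailored character to its reset character when no
    primary operator ('<') separates the two in the rule."""
    variants = {}
    for rule in rules:
        rule = unicodedata.normalize("NFC", rule)
        match = re.match(r"&([^<]+)(.*)", rule)
        if match is None:
            continue
        reset, relations = match.groups()
        primary_distinct = False
        for operator, char in re.findall(r"(<{1,3})([^<]+)", relations):
            if operator == "<":
                # Everything from here on sorts under a new primary weight.
                primary_distinct = True
            if not primary_distinct:
                variants[char] = reset
    return variants


def conflicting_index_labels(rules, index_labels):
    """Return (label, base) pairs for index labels that differ from
    their base letter only at the secondary or tertiary level."""
    variants = non_primary_variants(rules)
    labels = [unicodedata.normalize("NFC", label) for label in index_labels]
    return [(label, variants[label]) for label in labels if label in variants]


if __name__ == "__main__":
    for label, base in conflicting_index_labels(RULES, INDEX):
        print(f"{label}: index label, but no primary difference from {base}")
```

With the data above this flags Ą, Ę, Ė, Į, Y, Ų, and Ū. Č, Š, and Ž are not flagged because their rules use the primary operator. Passing the proposed rule `&A<ą<<<Ą` instead of `&A<<ą<<<Ą` makes Ą disappear from the result.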

Change History

comment:1 Changed 7 weeks ago by mark

  • Owner changed from anybody to markus
  • Status changed from new to accepted
  • Type changed from unknown to data
  • Milestone UNSCH deleted

Index buckets don't have to align with collation. So if there are specific problems in implementations, those should be raised.

comment:2 Changed 7 weeks ago by pedberg

Debbie Anderson has some contacts at the Lithuanian language institute who may be able to help with this.
