[Unicode] Common Locale Data Repository: Bug Tracking

CLDR Ticket #8259 (accepted data)

Opened 3 years ago

Last modified 3 years ago

Fix issues in exemplars data

Reported by: emmons
Owned by: mark
Component: other
Data Locale:
Phase: rc
Review:
Weeks:
Data Xpath:



Follow up to http://unicode.org/cldr/trac/ticket/8003#comment:4

When we did the normalization of the exemplars data, a few things fell out that shouldn't have. Need to investigate and fix these.


Change History

comment:1 Changed 3 years ago by mark

Copying issues here for easy reference:

Should also run this by Lorna.

In the languages that originally listed the quotation mark, that character has been removed, since it is a punctuation mark!
such as http://unicode.org/cldr/trac/changeset/11072/trunk/exemplars/main/aak_Latn.xml

Other languages are marked as using the U+02BC ( ʼ ) MODIFIER LETTER APOSTROPHE:
such as http://unicode.org/cldr/trac/changeset/11072/trunk/exemplars/main/apd_Latn.xml

Or the U+A78C ( ꞌ ) LATIN SMALL LETTER SALTILLO (~18 locales)
such as http://unicode.org/cldr/trac/changeset/11072/trunk/exemplars/main/aom_Latn.xml
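The difference between these characters shows up in their Unicode general categories. The following is a minimal sketch using Python's standard `unicodedata` module. The ticket does not say which quotation mark the original data used, so U+0027 and U+2019 are included here only as assumed candidates. The `is_letter` helper is hypothetical and is not the rule used by the CLDR normalization tooling, which the ticket does not describe.

```python
import unicodedata

# Characters discussed in this ticket. The two quotation-mark candidates are
# assumptions: the ticket only says "the quotation mark".
CANDIDATES = [
    "\u0027",  # assumed candidate: APOSTROPHE
    "\u2019",  # assumed candidate: RIGHT SINGLE QUOTATION MARK
    "\u02BC",  # MODIFIER LETTER APOSTROPHE
    "\uA78C",  # LATIN SMALL LETTER SALTILLO
    "\uA78B",  # LATIN CAPITAL LETTER SALTILLO
]


def describe(ch):
    """Return (code point, character name, general category) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))


def is_letter(ch):
    """Hypothetical filter: True when the general category is a letter (L*)."""
    return unicodedata.category(ch).startswith("L")


for ch in CANDIDATES:
    print(*describe(ch), "letter" if is_letter(ch) else "not a letter")
```

Under this assumed filter, both quotation-mark candidates are classed as punctuation, while the modifier letter apostrophe and both saltillo characters are classed as letters.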

  1. I strongly suspect that the languages marked as using the quotation mark are in error, and that they should be changed to use the MODIFIER LETTER APOSTROPHE. That will cause it to be preserved in the exemplar characters.
  2. I also suspect that at least some of the languages using the SALTILLO should use the MODIFIER LETTER APOSTROPHE instead.
  3. Also, in http://unicode.org/cldr/trac/changeset/11072/trunk/exemplars/main/kmu_Latn.xml we see:

<exemplarCharacters draft="unconfirmed">[a e f h i k l m {mp} n {nt} o p s t u v y ꞌ {ꞌk} {ꞌm} {ꞌn} {ꞌv} {ꞌy}]</exemplarCharacters>

     It is normally superfluous to list both ꞌ freestanding, and in combination with others. The only special case would be if the combination has very different behavior (just like we might have [c h {ch}] if the collation order is different). Thus normally:
       a. If it only occurs before k, m, n, v, y, then the freestanding version should be removed.
       b. If it occurs in arbitrary combination, then the specific versions {ꞌk} {ꞌm} {ꞌn} {ꞌv} {ꞌy} should be removed.
  4. The same happens with trunk/exemplars/main/avu_Latn.xml and others for MODIFIER LETTER APOSTROPHE.
  5. http://unicode.org/cldr/trac/changeset/11072/trunk/exemplars/main/lmp_Latn.xml has the modifier letter in the exemplars and the U+A78B ( Ꞌ ) LATIN CAPITAL LETTER SALTILLO in the index. One or the other is surely wrong.
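The redundancy described for kmu_Latn.xml (a character listed both freestanding and inside {…} sequences) could be flagged mechanically. The following is a rough sketch, not the CLDR tooling: `parse_exemplars` and `redundant_sequences` are hypothetical helpers, and the parser assumes a simplified exemplar syntax with whitespace-separated tokens and no ranges, escapes, or nested sets.

```python
SALTILLO = "\uA78C"  # LATIN SMALL LETTER SALTILLO

# The kmu_Latn exemplar set quoted in this ticket, written with escapes.
KMU = (
    "[a e f h i k l m {mp} n {nt} o p s t u v y "
    "\uA78C {\uA78Ck} {\uA78Cm} {\uA78Cn} {\uA78Cv} {\uA78Cy}]"
)


def parse_exemplars(text):
    """Split a simplified exemplar set into (single characters, sequences).

    Assumes whitespace-separated tokens; real UnicodeSet syntax is richer.
    """
    body = text.strip()
    if body.startswith("[") and body.endswith("]"):
        body = body[1:-1]
    singles, sequences = set(), set()
    for token in body.split():
        if token.startswith("{") and token.endswith("}"):
            sequences.add(token[1:-1])
        else:
            singles.add(token)
    return singles, sequences


def redundant_sequences(text, char):
    """Return the sequences containing `char` when `char` is also freestanding.

    An empty list means there is no overlap to review.
    """
    singles, sequences = parse_exemplars(text)
    if char not in singles:
        return []
    return sorted(seq for seq in sequences if char in seq)


print(redundant_sequences(KMU, SALTILLO))
```

A non-empty result only flags a locale for review. Whether to drop the freestanding character or the sequences still depends on how the character is actually used in the language, and a case like [c h {ch}] with distinct collation would be a legitimate exception.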

comment:2 Changed 3 years ago by emmons

  • Status changed from new to assigned
  • Component changed from unknown to data-other
  • Priority changed from assess to medium
  • Phase changed from dsub to rc
  • Milestone changed from UNSCH to 28
  • Owner changed from anybody to mark

comment:3 Changed 3 years ago by markus

  • Type set to data

comment:4 Changed 3 years ago by markus

  • Component changed from data-other to other

comment:5 Changed 3 years ago by srl

  • Status changed from assigned to accepted

comment:6 Changed 3 years ago by mark

  • Milestone changed from 28 to 29

comment:7 Changed 3 years ago by emmons

  • Milestone changed from 29 to upcoming
