Unicode Common Locale Data Repository: Bug Tracking

CLDR Ticket #10123 (closed data: fixed)

Opened 14 months ago

Last modified 7 months ago

Latin-ASCII should remove Mn marks on digits too

Reported by: pedberg
Owned by: pedberg
Component: translit
Data Locale:
Phase: rc
Review: pedberg
Weeks:
Data Xpath:


Many people use a compound transform like "Any-Latin; Latin-ASCII" to produce the best ASCII-range (or mostly ASCII-range) equivalent for arbitrary Unicode text.

Currently, to enable round-trip mapping, the Arabic-Latin transform maps Persian digits U+06F0-U+06F9 to a combination of the 0-9 ASCII equivalent plus COMBINING MACRON BELOW (to distinguish them from the mapping of the U+0660-U+0669 digits). However, Latin-ASCII does not remove COMBINING MACRON BELOW following digits, so Persian digits run through "Any-Latin; Latin-ASCII" end up as digits followed by COMBINING MACRON BELOW instead of as plain ASCII-range digits.

This is due to the following line in Latin-ASCII:

[:Latin:] { [:Mn:]+ → ; # maps to nothing; remove all Mn following Latin letter

That should be generalized at least to allow stripping from digits too:

[[:Latin:][0-9]] { 
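To illustrate the difference between the two contexts, here is a minimal Python sketch. It is not the ICU implementation: `strip_marks` and `is_latin_letter` are hypothetical helpers, the Latin check is approximated by looking for "LATIN" in the Unicode character name (the standard library has no script property lookup), and the intermediate string is built by hand on the assumption that Arabic-Latin emits an ASCII digit plus COMBINING MACRON BELOW, as described above.

```python
import unicodedata

MACRON_BELOW = "\u0331"  # COMBINING MACRON BELOW


def is_latin_letter(ch):
    # Rough stand-in for [:Latin:]; an assumption, not ICU's script test.
    return "LATIN" in unicodedata.name(ch, "")


def strip_marks(text, allow_digits):
    """Drop runs of Mn marks that directly follow a permitted base character.

    allow_digits=False mimics the context  [:Latin:] { [:Mn:]+
    allow_digits=True  mimics the context  [[:Latin:][0-9]] { [:Mn:]+
    """
    out = []
    stripping = False  # True while directly after a permitted base character
    for ch in text:
        if stripping and unicodedata.category(ch) == "Mn":
            continue  # remove the mark; stay in stripping mode for the rest of the run
        stripping = is_latin_letter(ch) or (allow_digits and ch in "0123456789")
        out.append(ch)
    return "".join(out)


# Hand-built stand-in for the Arabic-Latin output of Persian digits 1, 2, 3.
intermediate = "".join(d + MACRON_BELOW for d in "123")

before = strip_marks(intermediate, allow_digits=False)  # marks on digits survive
after = strip_marks(intermediate, allow_digits=True)    # marks on digits removed
```

With the Latin-only context the marks after the digits are left in place, while the widened context yields plain "123". Marks after Latin letters are removed in both cases.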


Change History

comment:1 Changed 13 months ago by emmons

  • Owner changed from anybody to pedberg
  • Phase changed from dsub to rc
  • Priority changed from assess to minor
  • Status changed from new to accepted
  • Milestone changed from UNSCH to 32

comment:2 Changed 7 months ago by pedberg

  • Priority changed from minor to medium
  • Status changed from accepted to reviewing
  • Review set to sascha

comment:3 Changed 7 months ago by sascha

  • Review changed from sascha to pedberg

Looks good. I’ve also added some test cases to prevent future regressions; can you review them?

comment:4 Changed 7 months ago by pedberg

  • Status changed from reviewing to closed
  • Resolution set to fixed

Test case looks good, thanks for adding it!

