[Unicode] Common Locale Data Repository: Bug Tracking

CLDR Ticket #10962 (closed: fixed)

Opened 9 months ago

Last modified 7 weeks ago

Transform InterIndic not all converted, need documentation

Reported by: pedberg
Owned by: mark
Component: transliteration
Data Locale:
Phase: spec-beta
Review: pedberg
Weeks:
Data Xpath:


From ICU http://bugs.icu-project.org/trac/ticket/13610:

When I try to transliterate from Gurmukhi to Arabic using the transform ID "Gurmukhi-Arabic", the result contains the character U+E07C, which belongs to the Unicode Private Use Area. How to reproduce the issue: transliterate "ਸੰਯੁਕਤ ਰਾਜ ਅਮਰੀਕਾ" from Gurmukhi to Arabic (this is "USA" in Punjabi, taken from OpenStreetMap; see name:pa at https://nominatim.openstreetmap.org/details.php?place_id=177579678). Environment: ICU 60.2, PHP 7.0.25-0ubuntu0.16.04.1, Intl 1.1.0.

It's the same for Urdu, not just for Arabic: try "Guru-ur".

The use of PUA for a common InterIndic intermediate encoding is intentional but should only be transient and internal. There are two issues here:

  • It seems that some transforms are incomplete and allow the InterIndic PUA codes to leak out.
  • The use of PUA for this purpose is not documented; it probably should be documented somewhere.


Change History

comment:1 Changed 3 months ago by mark

I agree. We should have a test that fails if any InterIndic code points are left in the output of a transform.
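
A minimal sketch of such a check, in Python, for illustration only. The helper names find_pua and assert_no_pua are hypothetical and not part of CLDR or ICU. The sketch assumes it is enough to flag any code point in the BMP Private Use Area (U+E000..U+F8FF), which is broader than the InterIndic codes themselves, and it assumes the caller supplies the transliterated string.

```python
# Hypothetical helpers: flag Private Use Area code points in transform output.
# Assumption: any BMP PUA code point (U+E000..U+F8FF) in the output is a leak.
PUA_START = 0xE000
PUA_END = 0xF8FF


def find_pua(text):
    """Return (index, 'U+XXXX') pairs for every BMP PUA code point in text."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if PUA_START <= ord(ch) <= PUA_END
    ]


def assert_no_pua(text):
    """Raise AssertionError if text contains any BMP PUA code point."""
    leaks = find_pua(text)
    if leaks:
        raise AssertionError(f"PUA code points leaked: {leaks}")


if __name__ == "__main__":
    # Illustrative string containing U+E07C, the code point named in the report.
    sample = "\u0633\ue07c"
    print(find_pua(sample))
```

In an actual test, the string passed to assert_no_pua would be the output of each transform built on InterIndic (for example "Gurmukhi-Arabic") over sample text in the source script.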

comment:2 Changed 3 months ago by mark

  • Owner changed from anybody to mark
  • Milestone changed from UNSCH to 34

comment:3 Changed 7 weeks ago by mark

  • Phase changed from rc to spec-beta
  • type changed from data to spec

Changing this to a spec bug; I will file a separate ticket to add the test.

comment:4 Changed 7 weeks ago by mark

  • Review set to pedberg

Split out everything except the documentation into ticket:11449.

comment:5 Changed 7 weeks ago by mark

  • Status changed from new to reviewing

comment:6 Changed 7 weeks ago by pedberg

  • Status changed from reviewing to closed
  • Resolution set to fixed
