[Unicode] Common Locale Data Repository: Bug Tracking

CLDR Ticket #11408 (closed: fixed)

Opened 3 months ago

Last modified 2 months ago

Add Qaag for Zawgyi

Reported by: mark
Owned by: mark
Component: unknown
Data Locale:
Phase: rc
Review: pedberg
Weeks:
Data Xpath:

Description (last modified by mark)

Change #1 below is being proposed for the upcoming release of CLDR (v34β on Sept 26).

Following the same model as used for XK (Kosovo), add Qaag to:

  1. the XML validity file as idStatus="special"
  2. the LDML documentation in unicode_script_subtag_validity and Private_Use.

Make it clear that this is a special-purpose code for tagging the migration to Unicode — with very restricted usage. (Wording to be agreed on in committee.)
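
As an illustration of item 1, here is a minimal sketch of how a consumer might look up the status of a script code in a validity file. The XML fragment is a hypothetical, heavily shortened stand-in modeled on the `idStatus` attribute named above; the code lists, the `script_status` helper, and the exact element layout are assumptions for this example, not the shipped CLDR data.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the real validity file lists many more codes,
# and its exact layout is assumed here rather than copied from CLDR.
VALIDITY_XML = """
<supplementalData>
  <idValidity>
    <id type="script" idStatus="regular">Latn Mymr</id>
    <id type="script" idStatus="special">Qaag Zxxx</id>
  </idValidity>
</supplementalData>
"""


def script_status(xml_text, code):
    """Return the idStatus listed for a script code, or None if absent."""
    root = ET.fromstring(xml_text)
    for elem in root.iter("id"):
        if elem.get("type") != "script":
            continue
        # Codes are stored as a whitespace-separated list in the element text.
        if code in (elem.text or "").split():
            return elem.get("idStatus")
    return None


if __name__ == "__main__":
    print(script_status(VALIDITY_XML, "Qaag"))
```

A caller could use such a lookup to treat "special" codes like Qaag as valid for tagging while still declining to offer locale data for them.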


Change History

comment:1 Changed 3 months ago by mark

  • Description modified

comment:2 Changed 3 months ago by mark

  • Description modified

comment:3 Changed 3 months ago by kristi

Instead of rushing this into v34, I'd like to suggest that this be discussed further at UTC first.

  1. One fundamental question that came up on the direction:

Since the goal is to “standardize” Zawgyi so that data and implementations can eventually migrate to Unicode, why not consider a separate, “standard” encoding?
A locale ID won’t help with making fonts that have alternate cmaps, so would it be more useful to register “zawgyi” in the IANA charset registry (process defined in RFC 2978 = BCP 19) and then get that supported in SMS and W3C?

  2. Several points in the proposal intermingle script, encoding, and Unicode; I think these should be agreed on at the UTC level.

“Note that the situation with Zawgji is very much unlike the situation with encodings like Shift-JIS or Latin-1. Unlike those, Zawgji is structured, looks, and behaves just like Unicode except for a small number of characters. So the techniques used for encoding conversion don’t suffice.”

This is making Zawgyi out to be like Unicode. However, it's not Unicode. The biggest difference between Zawgyi and Shift-JIS etc. isn’t the structure of the encodings; rather, the biggest difference is that Shift-JIS etc. are legacy encodings with conventional character-set identities, whereas Zawgyi is not.

comment:4 Changed 3 months ago by jefgen

Regarding point 2 in the ticket description:

  1. Have my-Qaag, my-Qaag-MM, en-Qaag-MM returned as supported locales by ICU (ULocale.getAvailableLocales()).

I thought that the decision was to not have any locale support for Zawgyi?

From the CLDR-TC/Agenda meeting notes:

Agreed with the Qaag approach with documentation explaining the limited support for the transition to Unicode: no collation, no input support, and no locale.

Or perhaps this would just be for private copies of ICU (with patches/modifications)?
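
For reference, a rough sketch of how the tags quoted above break down into language, script, and region subtags. The `split_tag` helper is hypothetical and handles only the simple language[-script][-region] shape; it is not ICU's parser and ignores variants, extensions, and private-use subtags.

```python
def split_tag(tag):
    """Split a simple language[-script][-region] tag into its parts."""
    parts = tag.split("-")
    language, script, region = parts[0].lower(), None, None
    for part in parts[1:]:
        if len(part) == 4 and part.isalpha():
            # Four letters: script subtag, conventionally title-cased.
            script = part.title()
        elif (len(part) == 2 and part.isalpha()) or (len(part) == 3 and part.isdigit()):
            # Two letters or three digits: region subtag.
            region = part.upper()
    return language, script, region


if __name__ == "__main__":
    for tag in ("my-Qaag", "my-Qaag-MM", "en-Qaag-MM"):
        print(tag, split_tag(tag))
```

In all three tags, Qaag occupies the script position, which is why the question of locale support arises at all.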

comment:5 Changed 3 months ago by mark

Sorry, the statement in the bug severely lagged the email and then the in-person discussions. Will fix now.

comment:6 Changed 3 months ago by mark

  • Description modified
  • Summary changed from Add Qaag to Add Qaag for Zawgyi

comment:7 Changed 3 months ago by mark

  • Status changed from new to accepted
  • Priority changed from assess to major
  • Phase changed from dsub to rc
  • Milestone changed from UNSCH to 34
  • Owner changed from anybody to mark
  • type changed from unknown to data

comment:8 Changed 3 months ago by mark

  • Status changed from accepted to reviewing
  • Review set to pedberg

comment:9 Changed 2 months ago by pedberg

  • Status changed from reviewing to closed
  • Resolution set to fixed
