Unicode Common Locale Data Repository: Bug Tracking

CLDR Ticket #10076 (accepted data)

Opened 10 months ago

Last modified 3 months ago

Regularize ExtendedPictographic

Reported by: mark
Owned by: mark
Component: unknown
Phase: rc

Description

During CLDR development, a question came up about Extended_Pictographic. We originally formulated that property to get around a significant problem in segmentation (character, word, and line breaking), and put it into CLDR as a vehicle. It is too late to make any changes right now, but I don't think we want the situation to remain as it is.

I think the right approach at this point would be to propose something like the following to the UTC in May:

  1. Move Extended_Pictographic into the emoji data files, for the next version after Emoji 5.0 (Emoji 6.0, or perhaps an earlier small update, Emoji 5.1, whatever timing is needed). The contents should be the current Extended_Pictographic + Emoji X - Emoji_Component + MALE SIGN + FEMALE SIGN.
  2. After Unicode 10.0, propose modifying the segmentation rules in UAX#14 and UAX#29 based on LDML (updated somewhat):
    • GB11′ [:Extended_Pictographic:] ZWJ × [:Extended_Pictographic:]
    • WB3c′ ZWJ × [:Extended_Pictographic:]
    • LB8a′ ZWJ × (ID | [:Extended_Pictographic:])
  3. Along with #2, add text to both UAX#14 and UAX#29 stating that:
    • The rules for segmentation may use properties outside of the main property associated with the algorithm. In such a case, such properties are indicated with the UnicodeSet notation, such as [:General_Category=Letter:].
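As a rough illustration of how the proposed GB11′ and WB3c′ rules read, here is a minimal Python sketch. It follows the rules exactly as written in this ticket, not the final UAX#29 text. The set SAMPLE_EXTENDED_PICTOGRAPHIC and the helper names are hypothetical; a real implementation would load [:Extended_Pictographic:] from the published data files.

```python
ZWJ = "\u200D"  # ZERO WIDTH JOINER

# Hypothetical stand-in for [:Extended_Pictographic:], for illustration only.
# The actual contents would come from the emoji data files.
SAMPLE_EXTENDED_PICTOGRAPHIC = {
    "\U0001F468",  # MAN
    "\U0001F469",  # WOMAN
    "\U0001F467",  # GIRL
    "\u2764",      # HEAVY BLACK HEART
}


def is_ext_pict(ch):
    """Return True if ch is in the sample Extended_Pictographic set."""
    return ch in SAMPLE_EXTENDED_PICTOGRAPHIC


def gb11_no_break(text, i):
    """GB11': [:Extended_Pictographic:] ZWJ x [:Extended_Pictographic:].

    Return True if the rule forbids a break between text[i-1] and text[i].
    """
    return (
        2 <= i < len(text)
        and is_ext_pict(text[i - 2])
        and text[i - 1] == ZWJ
        and is_ext_pict(text[i])
    )


def wb3c_no_break(text, i):
    """WB3c': ZWJ x [:Extended_Pictographic:].

    Return True if the rule forbids a break between text[i-1] and text[i].
    """
    return 1 <= i < len(text) and text[i - 1] == ZWJ and is_ext_pict(text[i])
```

For example, in the sequence MAN, ZWJ, WOMAN both functions report no break before WOMAN, while in "a", ZWJ, WOMAN only the WB3c′ check applies, because GB11′ also requires a pictographic character before the ZWJ.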

Change History

comment:1 Changed 10 months ago by mark

  • Owner changed from anybody to mark
  • Priority changed from assess to critical
  • Type changed from unknown to data
  • Status changed from new to accepted
  • Milestone changed from UNSCH to 32

comment:2 Changed 7 months ago by mark

  • Phase changed from dsub to rc

comment:3 Changed 3 months ago by mark

  • Milestone changed from 32 to 33

in progress in UTC
