
CLDR Ticket #5749 (accepted data)

Opened 4 years ago

Last modified 19 months ago

Separate Han-Latin for traditional

Reported by: pedberg
Owned by: pedberg
Component: translit
Data Locale:
Phase: rc
Review:
Weeks:
Data Xpath:
Xref:

Description

Currently we have a single Han-Latin transform, based on the first (or only) reading in the kMandarin field of the Unihan database. In some cases, a character may have a different reading for traditional Chinese than for simplified Chinese; in that case, a second entry can be added to the kMandarin field with the distinct traditional reading.
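
As a rough illustration of the field layout described above, the sketch below parses a tab-separated kMandarin line and picks a reading per script variant. The sample lines are hypothetical (the second shows what a proposed two-reading entry might look like, not current Unihan data), and `parse_kmandarin` and `pick_reading` are names invented for this example.

```python
def parse_kmandarin(line):
    """Split a 'U+XXXX<TAB>kMandarin<TAB>readings' line into (char, readings)."""
    code, field, value = line.rstrip("\n").split("\t")
    if field != "kMandarin":
        raise ValueError("not a kMandarin line: " + line)
    char = chr(int(code[2:], 16))  # drop the leading "U+"
    return char, value.split()


def pick_reading(readings, traditional=False):
    """First reading by default; second reading, if present, for traditional."""
    if traditional and len(readings) > 1:
        return readings[1]
    return readings[0]


# Hypothetical sample data: the second line illustrates a proposed entry.
SAMPLE = [
    "U+4E00\tkMandarin\tyī",
    "U+5730\tkMandarin\tde dì",
]

for line in SAMPLE:
    char, readings = parse_kmandarin(line)
    print(char, pick_reading(readings), pick_reading(readings, traditional=True))
```

With a single reading, both variants get the same value, which matches the current state of the data.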

We need to be able to support transforms for both. This ticket lists several characters that have different readings in traditional Chinese. I will propose these for addition to the Unihan kMandarin field. In the meantime we should add a transform that supports them.

An issue is naming. I propose the following.

  1. Add zh-Latin (or zh_Hans-Latin?) as an alias for the current Han-Latin
  2. Add a zh_Hant-Latin transform that uses the special mappings noted here, then calls Han-Latin for the rest
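
The second item amounts to an override table consulted before the general transform. A minimal sketch of that layering follows, using three rows from the table in this ticket. `han_latin` is a stand-in for the existing Han-Latin transform (here just a tiny lookup holding "curr" column values), not a real API, and all names are assumptions made for illustration.

```python
# Current readings for a few characters, standing in for the existing
# Han-Latin transform. Purely illustrative sample data.
HAN_LATIN_SAMPLE = {"地": "de", "都": "dōu", "薄": "báo", "中": "zhōng"}

# Traditional-specific overrides (the "add" column in this ticket).
HANT_OVERRIDES = {"地": "dì", "都": "dū", "薄": "bó"}


def han_latin(text):
    """Stand-in for Han-Latin: map known characters, pass others through."""
    return " ".join(HAN_LATIN_SAMPLE.get(ch, ch) for ch in text)


def zh_hant_latin(text):
    """Apply traditional overrides first, then fall back to han_latin."""
    return " ".join(
        HANT_OVERRIDES[ch] if ch in HANT_OVERRIDES else han_latin(ch)
        for ch in text
    )


# zh-Latin is proposed as a plain alias for the existing transform.
zh_latin = han_latin

print(zh_latin("中都"))       # zhōng dōu
print(zh_hant_latin("中都"))  # zhōng dū
```

Keeping the overrides in a separate, small table means the general transform stays untouched and the traditional variant only has to list the differences.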

We will likely need to do something similar for Han-Latin/Names, i.e.

  1. Add zh-Latin/Names as an alias for Han-Latin/Names
  2. Add a separate zh_Hant-Latin/Names

Here are the separate mappings for zh_Hant (in some cases, as noted, these may be appropriate for zh as well, and Unicode might consider just changing the primary kMandarin mapping):

             kMandarin
code    char curr   add    notes
u4FFE   俾   bǐ     bì   
u5085   傅   fu     fù   
u5256   剖   pōu    pǒ   
u527D   剽   piāo   piào   
u535C   卜   bo     bǔ   
u5575   啵   bo     bō   
u55F2   嗲   diǎ    diē   
u5638   嘸   fǔ     wǔ   
u5660   噠   dā     dá   
u5730   地   de     dì   
u5824   堤   dī     tí   
u5C76   屶   dao    huì    Both zh and zh_Hant
u5E06   帆   fān    fán   
u5E49   幉   die    dié    Both zh and zh_Hant
u63FC   揼   beng   bèng   Both zh and zh_Hant: [http://www.zdic.net/zd/zi/ZdicE6Zdic8FZdicBC.htm]
u64D8   擘   bāi    bò   
u6597   斗   dòu    dǒu   
u6753   杓   biāo   sháo   
u67CF   柏   bǎi    bó   
u6921   椡   dao    dào    Both zh and zh_Hant
u6928   椨   fu     fǔ     Both zh and zh_Hant: [http://www.zdic.net/zd/zi/ZdicE6ZdicA4ZdicA8.htm]
u69DD   槝   dao    dǎo    Both zh and zh_Hant
u6A00   樀   dī     dí   
u6C93   沓   dá     tà   
u7538   甸   diān   diàn   
u7582   疂   die    dié    Both zh and zh_Hant
u7730   眰   diè    dié    Both zh and zh_Hant
u79A3   禣   fu     fù     Both zh and zh_Hant: [http://www.zdic.net/zd/zi/ZdicE7ZdicA6ZdicA3.htm]
u7E43   繃   běng   bēng   
u800A   耊   diè    dié    Both zh and zh_Hant
u8019   耙   bà     pá   
u8260   艠   deng   dēng   Both zh and zh_Hant
u8345   荅   dā     dá     Both zh and zh_Hant: [http://www.zdic.net/zd/zi/ZdicE8Zdic8DZdic85.htm]
u8584   薄   báo    bó   
u8984   覄   fu     fù     Both zh and zh_Hant: [http://www.zdic.net/zd/zi/ZdicE8ZdicA6Zdic84.htm]
u8ADE   諞   piǎn   pián   
u8AF7   諷   fěng   fèng   
u8DCC   跌   diē    dié   
u8E63   蹣   pán    mán   
u8E6C   蹬   dēng   dèng   
u90FD   都   dōu    dū   
u915C   酜   fu     fū     Both zh and zh_Hant: [http://www.zdic.net/zd/zi/ZdicE9Zdic85Zdic9C.htm]
u91B1   醱   fā     pò   
u9642   陂   bēi    pí   
u9666   陦   dao    dǎo    Both zh and zh_Hant
u9684   隄   dī     tí   
u9817   頗   pō     pǒ   
u9AEA   髪   fà     fǎ   
u9AEE   髮   fà     fǎ   
u9BB2   鮲   fu     fú     Both zh and zh_Hant: [http://www.zdic.net/zd/zi/ZdicE9ZdicAEZdicB2.htm]
u9E83   麃   páo    biāo   
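
For anyone turning the table above into data, here is a sketch of one way to parse rows and separate the traditional-only readings from those marked as applying to both. The row format is assumed from the table as pasted in this ticket (whitespace-separated code, character, current reading, proposed reading, optional note); `parse_row` and the dictionary names are made up for this example.

```python
def parse_row(row):
    """Parse 'uXXXX char curr add [notes]' into a dict."""
    parts = row.split(None, 4)
    code, char, curr, add = parts[:4]
    note = parts[4].strip() if len(parts) > 4 else ""
    # Sanity check: the code point column should match the character column.
    if chr(int(code[1:], 16)) != char:
        raise ValueError("code point and character disagree: " + row)
    # Compare case-insensitively so "zh_hant" and "zh_Hant" both match.
    both = note.lower().startswith("both zh and zh_hant")
    return {"char": char, "curr": curr, "add": add, "both": both}


# Three rows copied from the table for illustration.
ROWS = [
    "u5730   地   de     dì   ",
    "u5C76   屶   dao    huì    Both zh and zh_Hant",
    "u90FD   都   dōu    dū   ",
]

entries = [parse_row(r) for r in ROWS]
hant_only = {e["char"]: e["add"] for e in entries if not e["both"]}
both = {e["char"]: e["add"] for e in entries if e["both"]}
print(hant_only)
print(both)
```

The `hant_only` entries are candidates for the zh_Hant-specific mappings, while the `both` entries are the ones where changing the primary kMandarin value might be considered instead.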

Change History

comment:1 Changed 4 years ago by emmons

  • Owner changed from anybody to pedberg
  • Priority changed from assess to major
  • Status changed from new to assigned
  • Milestone changed from UNSCH to 24

comment:2 Changed 4 years ago by pedberg

  • Milestone changed from 24final to 24rc

comment:3 Changed 4 years ago by pedberg

  • Milestone changed from 24rc to 25rc

comment:4 Changed 3 years ago by pedberg

  • Milestone changed from 25rc to 26rc

As of Unicode 6.3 (and even 7.0), the Unihan database does not have any second (traditional-specific) readings in the kMandarin field. We need to get that addressed before we can fix this bug.

comment:5 Changed 3 years ago by pedberg

  • Cc ake.persson@… added

Note that most of the changes to kMandarin proposed above for "Both zh and zh_Hant" are also included (along with other proposed changes) in Åke Persson's 2013-05-27 document to UTC, "Proposed changes in Unihan kMandarin field for 571 Han characters". This applies to the following:

5C76
5E49
6921
6928
69DD
7582
79A3
800A
8260
8984
9666
9BB2

comment:6 Changed 3 years ago by pedberg

  • Keywords Apple13277421 added

comment:7 Changed 3 years ago by pedberg

Also add separate traditional readings as shown below for the following?
略 7565 lüè
矩 77E9 jù
识 8BC6 shì
匙 5319 chí

Last edited 3 years ago by pedberg

comment:8 Changed 3 years ago by claireho

Peter got the list from me in comment #7. Please ignore the proposal for 矩 77E9 jù.
The data in CLDR is correct. Sorry for the confusion.

comment:9 Changed 3 years ago by pedberg

  • Milestone changed from 26rc to 27rc

comment:10 Changed 3 years ago by markus

  • Phase set to rc
  • Milestone changed from 27rc to 27

comment:11 Changed 2 years ago by pedberg

  • Milestone changed from 27 to 28

See latest version of proposal at http://www.unicode.org/cgi-bin/GetMatchingDocs.pl?L2/15-036. Awaiting UTC resolution.

comment:12 Changed 2 years ago by markus

  • Type changed from enhancement to data

comment:13 Changed 2 years ago by srl

  • Status changed from assigned to accepted

comment:14 Changed 21 months ago by pedberg

  • Milestone changed from 28 to 29

comment:15 Changed 19 months ago by emmons

  • Milestone changed from 29 to upcoming

Automatic move of all 29 -> upcoming
