Unicode Common Locale Data Repository: Bug Tracking

CLDR Ticket #10037 (reviewing data)

Opened 4 months ago

Last modified 4 weeks ago

Additional Japanese exemplar set issues from #9807

Reported by: pedberg
Owned by: pedberg
Component: main
Data Locale: ja
Phase: dsub
Review: yoshito
Weeks:
Data Xpath:
Xref: ticket:9807, ticket:10113

Description

This ticket is a follow-on from cldrbug 9807, to consider some additional issues raised there:

  • There are 30 characters in ja main exemplars not in Joyo kanji. If these are only in CLDR exemplars due to their use in old era names, perhaps they should move to the aux exemplars.
  • It is possible that the use of U+9CEF 鳯 in an old era name (and its consequent inclusion in the aux exemplars) is an error, and it should instead be U+9CF3 鳳.

Change History

comment:1 Changed 4 months ago by yoshito

It is possible that the use of U+9CEF 鳯 in an old era name (and its consequent inclusion in the aux exemplars) is an error

It looks like this is the case. We see this character only in

<era type="2">白鳯</era>

But I think this is a mistake, and should be

<era type="2">白鳳</era>

https://ja.wikipedia.org/wiki/%E7%99%BD%E9%B3%B3

鳯 (U+9CEF) and 鳳 (U+9CF3) were originally the same character. I suspect U+9CEF came from a Chinese national encoding, while U+9CF3 came from JIS. U+9CF3 appears to be much more common than U+9CEF on Japanese web pages.

So I agree with changing the era name to 白鳳 (U+9CF3) and removing U+9CEF from the Japanese exemplar set.
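To illustrate why this needs an explicit data change, here is a minimal Python sketch. It assumes the era name is handled as a plain string; the helper name `fix_era_name` is hypothetical and not part of any CLDR tooling. Both characters are ordinary CJK unified ideographs with no decomposition mapping, so Unicode normalization will not fold one into the other.

```python
import unicodedata

OLD = "\u9cef"  # 鳯, the character currently used in era type 2
NEW = "\u9cf3"  # 鳳, the proposed replacement


def fix_era_name(name: str) -> str:
    """Replace U+9CEF with U+9CF3 in an era name (hypothetical helper)."""
    return name.replace(OLD, NEW)


# Normalization leaves U+9CEF untouched, so the data itself must change.
assert unicodedata.normalize("NFKC", OLD) == OLD
assert OLD != NEW

print(fix_era_name("白鳯"))  # prints 白鳳
```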

comment:2 Changed 3 months ago by mark

  • Owner changed from anybody to pedberg
  • Status changed from new to accepted
  • Milestone changed from UNSCH to 32

comment:3 Changed 4 weeks ago by pedberg

  • Status changed from accepted to reviewing
  • Xref changed from 9807 to 9807 10113
  • Review set to yoshito

Here is what I did for this:

  1. In the names for Japanese calendar era 2 (and in the aux exemplars), changed U+9CEF to U+9CF3 (the latter was not already in the exemplar sets)
  2. Then to address the issue of 30 non-Joyo kanji in the main exemplars (http://unicode.org/cldr/trac/ticket/9807#comment:1), I did the following:
    # The following are only used in lunar calendar cyclic names or month patterns;
    # moved them from main to aux exemplars
    U+4E11 丑
    U+4EA5 亥
    U+514E 兎
    U+536F 卯
    U+58EC 壬
    U+5BC5 寅 - was in middle of a main exemplar range
    U+5DF3 巳
    U+5E9A 庚
    U+620A 戊
    U+620C 戌
    U+732A 猪
    U+7678 癸
    U+8FB0 辰
    U+9149 酉
    U+958F 閏
    U+9F20 鼠
    
    # The following are only used in old (pre-Meiji) Japanese era names;
    # moved them from main to aux exemplars
    U+4EA8 亨
    U+5609 嘉
    U+5F18 弘
    U+660C 昌
    U+795A 祚
    U+7984 禄
    U+798E 禎 - was in middle of a main exemplar range
    U+96C9 雉
    
    # The following are not used in the CLDR data at all;
    # just removed from main exemplars.
    U+4F0A 伊
    U+52FA 勺
    U+5301 匁
    U+8139 脹
    U+9291 銑
    U+9318 錘
    
  3. Note that this still leaves the following characters, which are used in the CLDR data (lunar calendar cyclic names, or locale display names for languages, scripts, and keyword values) but are not in any exemplar set (no change from the previous status). We might consider adding them to the aux exemplars: [梵 湘 罫 芒 蟄 贛 閩]
  4. Final sorting of the exemplar sets and grouping into ranges will still be handled under cldrbug 10113.
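The set changes in step 2 and the check behind step 3 can be sketched in Python as below. This is an illustration only, assuming the exemplar sets are modeled as plain Python sets of single characters. The function names `rebalance` and `uncovered` are hypothetical, and the real change was made in the CLDR locale data, not with this code.

```python
from __future__ import annotations

# Characters listed in step 2, grouped by how the CLDR data uses them.
CYCLIC_ONLY = "丑亥兎卯壬寅巳庚戊戌猪癸辰酉閏鼠"  # lunar cyclic names, month patterns
ERA_ONLY = "亨嘉弘昌祚禄禎雉"  # old (pre-Meiji) era names
UNUSED = "伊勺匁脹銑錘"  # not used in the CLDR data at all


def rebalance(main: set[str], aux: set[str]) -> tuple[set[str], set[str]]:
    """Move cyclic-only and era-only characters from main to aux; drop unused ones."""
    move = set(CYCLIC_ONLY + ERA_ONLY)
    drop = set(UNUSED)
    new_main = main - move - drop
    new_aux = aux | (main & move)  # only move what was actually in main
    return new_main, new_aux


def uncovered(text: str, main: set[str], aux: set[str]) -> set[str]:
    """Return ideographs in U+4E00..U+9FFF found in text but in neither set."""
    ideographs = {ch for ch in text if "\u4e00" <= ch <= "\u9fff"}
    return ideographs - main - aux
```

The three groups together contain 16 + 8 + 6 = 30 characters, matching the 30 non-Joyo kanji mentioned in the description. Running `uncovered` over the locale's strings is one way a list like the one in step 3 could be produced.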