
CLDR Ticket #11181 (new unknown)

Opened 6 days ago

Last modified 5 hours ago

Historic scripts almost all missing in Locale Display Names / Scripts

Reported by: Marcel Schneider <charupdate@…>
Owned by: anybody
Component: main
Data Locale:
Phase: dsub
Review:
Weeks:
Data Xpath:
Xref: ticket:827

Description

With the exception of Mongolian, historic scripts are not a part of CLDR.

Is an extension to all script names scheduled for future releases, even though they may not be a priority?

When asking on the Public ML to move script names from ISO 15924 to CLDR I was told that this makes no sense, or that they are already part of CLDR.

Most of them are missing, however. That makes localizing character pickers hazardous. Reaching consensus is apparently not straightforward: despite the reported English-French consensus, many discrepancies exist between ISO 15924 and the French Code Charts.

Moving this content to CLDR would open it to democratic and transparent survey.

One expert conceded that UIs may display names other than the official French names. UIs are defined in CLDR. I see no point in maintaining divergent lists for use among experts as pet projects, rather than pooling efforts to maintain one single useful list per locale, regardless of self-imposed stability policies (except for the Unicode alphanumeric identifiers, aka character names).

Users prove not to be interested in stability at the expense of accuracy. CLDR is the best place to maintain user-centered UI elements. So I’ve repeatedly invited all experts via Unicode Public ML, but nobody among the missing people showed up in ST to tackle the job.

I’m ready to help do our best, just want to make sure nobody will blame hidden effort.

Change History

comment:1 follow-up: ↓ 6 Changed 6 days ago by srl

  • Xref set to 827

With the exception of Mongolian, historic scripts are not a part of CLDR.

Please give an example of a missing script. Also please note that your coverage level may be set incorrectly; see http://cldr.unicode.org/index/survey-tool/guide#TOC-Advanced-Features . Unless you set the coverage to Comprehensive, some scripts may be hidden if they are not in common use.

Or do you mean that they are untranslated in French? If so, that may be due to ticket:827.

Moving this content to CLDR would open it to democratic and transparent survey.

Nothing needs to be removed from the international standard ISO 15924 (https://unicode.org/iso15924/index.html); French content is dealt with in ticket:827.


  • In summary, unless you see a script missing from CLDR as a whole (as opposed to simply being not filled in in French), this ticket should be closed as a duplicate of ticket:827

comment:2 follow-up: ↓ 3 Changed 6 days ago by Marcel Schneider <charupdate@…>

Indeed, it was my coverage level not set to Comprehensive. Sorry.

I mistakenly wrote "move" where I meant "copy": the idea is that the data would be maintained in CLDR and then mirrored back to ISO 15924 as English and French subsets, with the English names frozen and the French ones not.

I like your project of placing the source file of the ISO standard in CLDR to streamline maintenance.

So the effort is to be done in CLDR, while ISO 15924 is corrected in an automated way.

Yes please close this ticket.

Thanks.

comment:3 in reply to: ↑ 2 ; follow-up: ↓ 4 Changed 6 days ago by srl

Replying to Marcel Schneider <charupdate@…>:

Indeed, it was my coverage level not set to Comprehensive. Sorry.

Great, glad it was solved.

I like your project of placing the source file of the ISO standard in CLDR to streamline maintenance.

To be crystal clear, my proposal was to occasionally copy the ISO 15924 registry's text file into CLDR (just as a copy of the IANA subtag registry is), so that CLDR tests can make sure no scripts are missing. I propose no change to ISO 15924's registry or process.
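
A minimal sketch of such a check, in Python. It assumes the registry copy is a semicolon-delimited text file whose first and third fields are the script code and the English name, and that CLDR script names sit under localeDisplayNames/scripts in an LDML file. The function names and the abbreviated sample data are hypothetical; this is not CLDR's actual test code.

```python
import xml.etree.ElementTree as ET

# Hypothetical, abbreviated sample of a registry copy (assumed layout:
# code;number;English name;...). Trailing fields are left empty here.
REGISTRY = """\
# Code;Number;English Name;...
Latn;215;Latin;;;;
Shui;530;Shuishu;;;;
Wcho;283;Wancho;;;;
Qaaa;900;Reserved for private use (start);;;;
"""

# Hypothetical, abbreviated LDML fragment with a single script name.
LDML = """<ldml><localeDisplayNames><scripts>
<script type="Latn">Latin</script>
</scripts></localeDisplayNames></ldml>"""


def iso_scripts(registry_text):
    """Return {code: English name} from semicolon-delimited registry lines."""
    scripts = {}
    for line in registry_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split(";")
        if len(fields) < 3:
            continue  # not a data row in the assumed layout
        scripts[fields[0]] = fields[2]
    return scripts


def cldr_script_codes(ldml_text):
    """Return the set of script codes that have a display name in the LDML text."""
    root = ET.fromstring(ldml_text)
    elements = root.findall("./localeDisplayNames/scripts/script")
    return {el.get("type") for el in elements if el.get("type")}


def is_private_use(code):
    """ISO 15924 reserves Qaaa..Qabx for private use."""
    return "Qaaa" <= code <= "Qabx"


def missing_scripts(registry_text, ldml_text):
    """List (code, name) pairs present in the registry but absent from CLDR."""
    known = cldr_script_codes(ldml_text)
    return sorted(
        (code, name)
        for code, name in iso_scripts(registry_text).items()
        if code not in known and not is_private_use(code)
    )


print(missing_scripts(REGISTRY, LDML))
# prints [('Shui', 'Shuishu'), ('Wcho', 'Wancho')]
```

A real test would read both files from disk and fail when the returned list is non-empty.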

comment:4 in reply to: ↑ 3 Changed 6 days ago by Marcel Schneider <charupdate@…>

Replying to srl:

To be crystal clear, my proposal was to occasionally copy the ISO 15924 registry's text file into CLDR (just as a copy of the IANA subtag registry is), so that CLDR tests can make sure no scripts are missing. I propose no change to ISO 15924's registry or process.

Then the foreseeable discrepancies between CLDR and ISO 15924 won’t be resolved unless I take action by soliciting the experts cited on the ML. But since three of them seem to hate Unicode, any sync with CLDR seems unlikely. Now that I know vendors do use CLDR for UIs, I wonder who is using ISO 15924. If it’s just a pet project, then there is no need to bother updating it.

comment:5 Changed 6 days ago by Marcel Schneider <charupdate@…>

Addendum:
srl wrote (on Unicode Public):

The English is taken from ISO 15924.

That in turn seems to refer back to Blocks.txt, so the script codes are the only value added, and the French script names only add an extra level of complexity and maintenance issues, as they overlap with CLDR data. That is what I was referring to.

comment:6 in reply to: ↑ 1 Changed 5 hours ago by Marcel Schneider <charupdate@…>

Missing scripts in CLDR as per ISO 15924

Replying to srl:

Please give an example of a missing script.[…]

  • In summary, unless you see a script missing from CLDR as a whole (as opposed to simply being not filled in in French), this ticket should be closed as a duplicate of ticket:827

The following scripts or variants are not yet in CLDR while being listed in ISO 15924:

Aran	161	Arabic (Nastaliq variant)
Cpmn	402	Cypro-Minoan
Dogr	328	Dogra
Gong	312	Gunjala Gondi
Hmnp	451	Nyiakeng Puachue Hmong
Kitl	505	Khitan large script
Kits	288	Khitan small script
Leke	364	Leke
Maka	366	Makasar
Medf	265	Medefaidrin (Oberi Okaime, Oberi Ɔkaimɛ)
Nkdb	085	Naxi Dongba (na²¹ɕi³³ to³³ba²¹, Nakhi Tomba)
Piqd	293	Klingon (KLI pIqaD)
Rohg	167	Hanifi Rohingya
Shui	530	Shuishu
Sogd	141	Sogdian
Sogo	142	Old Sogdian
Wcho	283	Wancho

This list doesn’t include the private-use codes, which are not to be listed in CLDR, as per ticket 11194, comment 7.

comment:7 Changed 5 hours ago by Marcel Schneider <charupdate@…>

Edit: This list was generated from CLDR v33, as I couldn’t access the v34 data for processing.
However, several scripts in that list turn out to be already under survey.

comment:8 Changed 5 hours ago by Marcel Schneider <charupdate@…>

The actual list is somewhat shorter (sorry):

Aran	161	Arabic (Nastaliq variant)
Cpmn	402	Cypro-Minoan
Hmnp	451	Nyiakeng Puachue Hmong
Kitl	505	Khitan large script
Kits	288	Khitan small script
Leke	364	Leke
Nkdb	085	Naxi Dongba (na²¹ɕi³³ to³³ba²¹, Nakhi Tomba)
Piqd	293	Klingon (KLI pIqaD)
Shui	530	Shuishu
Wcho	283	Wancho

This list results from manual back-checking (in ST) and is therefore error-prone.
