[Unicode] Common Locale Data Repository: Bug Tracking

CLDR Ticket #7598 (closed defect: fixed)

Opened 3 years ago

Last modified 3 years ago

Problems with 'v' in Latin-Bopomofo (and Latin-NumericPinyin)

Reported by: pedberg
Owned by: pedberg
Component: translit
Data Locale:
Phase: rc
Review: mark
Weeks:
Data Xpath:
Xref:

Description

In pinyin, the letter 'v' is officially unused, but is often used as an alternate for 'ü'. Either can have tones; there are composed characters for 'ü' plus all tone marks, but not for 'v' plus all tone marks:

  • 1st, "ǖ" U+01D6, or "v̄" v plus U+0304
  • 2nd, "ǘ" U+01D8, or "v́" v plus U+0301
  • 3rd, "ǚ" U+01DA, or "v̌" v plus U+030C
  • 4th, "ǜ" U+01DC, or "v̀" v plus U+0300
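
The difference between the two spellings can be checked with a short Python sketch using the standard `unicodedata` module. The helper name `compose_with_tone` and the `TONE_MARKS` table are illustrative only and are not part of CLDR or ICU. Under NFC, 'ü' plus a tone mark collapses to a single composed character, while 'v' plus the same mark stays as two code points.

```python
import unicodedata

# Combining tone marks for the four pinyin tones, as listed above.
TONE_MARKS = {1: "\u0304", 2: "\u0301", 3: "\u030C", 4: "\u0300"}


def compose_with_tone(base: str, tone: int) -> str:
    """Append the combining mark for `tone` to `base` and normalize to NFC."""
    return unicodedata.normalize("NFC", base + TONE_MARKS[tone])


for tone in TONE_MARKS:
    u_form = compose_with_tone("\u00FC", tone)  # ü plus tone mark
    v_form = compose_with_tone("v", tone)       # v plus tone mark
    # Show the code points of each result so the difference is visible.
    print(
        tone,
        [f"U+{ord(c):04X}" for c in u_form],
        [f"U+{ord(c):04X}" for c in v_form],
    )
```

Any rule set that should accept the 'v' spelling therefore has to see the combining mark as a separate character.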

However, the Latin-Bopomofo transform does not handle 'v' plus tone marks correctly, because its forward filter currently excludes [:Mn:] (it also does not handle input that uses a fully or partly decomposed 'ü' with tone marks). We should do the following:

  1. In Latin-Bopomofo, change the forward filter from ":: [ [:Latin:][1-5] ];" to ":: [ [:Latin:][:Mn:][1-5] ];".
  2. As an efficiency step, we can also add the conversion rule "[ln] { v → ü;" before the transform rule ":: Latin-NumericPinyin (NumericPinyin-Latin) ;". We can then eliminate all of the specific one-way conversions for 'v':
    lvan }$pTone    → ㄌㄩㄢ;			# (not in han-latin) one-way, handle v alternate for ü
     lvan           →  ㄌㄩㄢ˙;
    lve }$pTone     → ㄌㄩㄝ;			# one-way, handle v alternate for ü
     lve            →  ㄌㄩㄝ˙;
    lv }$pTone      → ㄌㄩ;				# one-way, handle v alternate for ü
     lv             →  ㄌㄩ˙;
    nve }$pTone     → ㄋㄩㄝ;			# one-way, handle alternate spelling
     nve            →  ㄋㄩㄝ˙;
    nv }$pTone      → ㄋㄩ;				# one-way, handle alternate spelling
     nv             →  ㄋㄩ˙;
    
  3. Finally, in Latin-NumericPinyin’s definition of $vowel, we should make the following change: add 'v', and drop the composed 'ü', since at this point we are working with NFD. Adding 'v' does not seem necessary for correct functioning in at least some cases, but it is better for documentation in any case:
    [aAeEiIoOuUüÜ {u\u0308} {U\u0308} ]
    to
    [aAeEiIoOuU {u\u0308} {U\u0308} vV ]
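
The intent of the three changes can be approximated outside ICU with a rough Python sketch. Everything here is an assumption made for illustration: the function names are hypothetical, the Latin test uses character names as a stand-in for the real Script property, and the regular expression only mimics the "[ln] { v → ü;" rule rather than running the actual transliterator.

```python
import re
import unicodedata


def in_old_filter(ch: str) -> bool:
    """Rough stand-in for [ [:Latin:][1-5] ]: Latin letters (by name) or tone digits."""
    return ch in "12345" or unicodedata.name(ch, "").startswith("LATIN ")


def in_new_filter(ch: str) -> bool:
    """Rough stand-in for [ [:Latin:][:Mn:][1-5] ]: also admits nonspacing marks."""
    return in_old_filter(ch) or unicodedata.category(ch) == "Mn"


def v_to_u_diaeresis(text: str) -> str:
    """Mimic "[ln] { v → ü;": rewrite v only when it directly follows l or n."""
    return re.sub(r"(?<=[ln])v", "\u00FC", text)


# "lv" with a combining caron (3rd tone), already in NFD.
sample = unicodedata.normalize("NFD", "lv\u030C")

# Step 1: the old filter drops the combining mark, the new one keeps it.
print([in_old_filter(c) for c in sample])
print([in_new_filter(c) for c in sample])

# Step 2: after the rewrite, the 'v' spelling matches the 'ü' spelling in NFD.
rewritten = unicodedata.normalize("NFD", v_to_u_diaeresis(sample))
print(rewritten == unicodedata.normalize("NFD", "l\u01DA"))

# Step 3: in NFD the composed 'ü' never appears; only u + U+0308 does.
print("\u00FC" in rewritten, "u\u0308" in rewritten)
```

Because the rewrite maps 'v' onto 'ü' before the pinyin rules run, the per-syllable 'v' rules listed under item 2 become redundant in this sketch, which matches the motivation given there.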
    

Change History

comment:1 Changed 3 years ago by emmons

  • Owner changed from anybody to pedberg
  • Priority changed from assess to major
  • Status changed from new to assigned
  • Milestone changed from UNSCH to 26rc

comment:2 Changed 3 years ago by pedberg

  • Status changed from assigned to reviewing
  • Review set to mark

comment:3 Changed 3 years ago by mark

  • Status changed from reviewing to closed
  • Resolution set to fixed

comment:4 Changed 3 years ago by markus

  • Phase set to rc
  • Milestone changed from 26rc to 26

comment:5 Changed 3 years ago by pedberg

  • Component changed from data-main to data-translit