The sections below contain links to permanent feedback documents for the open Public Review Issues as well as other public feedback as of January 08, 2024, since the previous cumulative document was issued prior to UTC #177 (October 2023).
The links below go directly to open PRIs and to feedback documents for them, as of January 02, 2024.
The links below go to locations in this document for feedback.
Feedback routed to CJK & Unihan Group for evaluation [CJK]
Feedback routed to Script ad hoc for evaluation [SAH]
Feedback routed to Properties & Algorithms Group for evaluation [PAG]
Feedback routed to Emoji SC for evaluation [ESC]
Feedback routed to Editorial Committee for evaluation [EDC]
Other Reports
Date/Time: Mon Nov 06 09:11:54 CST 2023
ReportID: ID20231106091154
Name: Junliang Huang
Report Type: Public Review Issue
Opt Subject: 468
Sorry for commenting on a closed PRI.
UTC-03258 ⿱⿰户己金 is unifiable to U+289D2 𨧒 by UCV #121. 𨧒 is a GHZ character used for the personal name 朱在𨧒, who is the same person as in 南明史. I suggest changing its status from FutureWS to Variant.
UTC-03259 ⿱山⿰氵乃 is potentially unifiable to U+23CB8 𣲸.
UTC-03262 ⿳士艹工 is likely an error for 㙓. The 南明史 evidence mentions that 朱朝⿳士艹工 is 崇禎十三年進士. However, 崇禎十三年庚辰科進士三代履歷 22a and 順治河南通志17:87 give 㙓 https://iiif.lib.harvard.edu/manifests/view/drs:48757770$714i. Given that there are only two stroke differences between ⿳士艹工 and 㙓, and that ⿳士艹工 is not attested in historical evidence, I suggest changing its status from FutureWS to Variant.
UTC-03264 ⿱⿰火攵日 is an error for U+24274 𤉴. The original 南明史 evidence mentions that 朱睿⿱⿰火攵日, 字翰之, is a painter. His artwork 疏林遠岫 is available on https://digitalarchive.npm.gov.tw/Painting/Content?pid=5186&Dept=P, which gives his name as ⿱炇𣅀. ⿱炇𣅀 is the preferred form of the currently encoded 𤉴 (⿱⿰火夂𣄼) shape, as ⿱炇𣅀 is also attested in 四聲篇海(成化刊本)12:18a and in his friend 周亮工's work 書畫錄 1:11b. The currently encoded 𤉴 is a G4K character used in 文淵閣四庫全書·御定佩文齋書畫譜, also for his name. Anyway, given that the only differences between ⿱⿰火攵日 and 𤉴 are one stroke and 攵/夂, I suggest changing its status from FutureWS to Variant.
UTC-03265 ⿰氵⿱宀隹 (⿰氵寉) is unifiable to U+3D36 㴶 by UCV #304. I suggest changing its status from FutureWS to Variant.
UTC-03267 ⿲金⿱火犬頁 is a variant of U+28C25 𨰥. ⿲金⿱火犬頁 is featured in the table of contents; however, 南明史 p. 1464 gives 𨰥 instead. So ⿲金⿱火犬頁 is a one-off misprint in the TOC. I suggest changing its status from FutureWS to Variant.
UTC-03276 ⿱⿰氵[H6-03]水 is a variant of U+3D57 㵗. The 南明史 evidence mentions that 朱常⿱⿰氵[H6-03]水 is 始興縣令. 乾隆始興縣志8:4 and 道光廣東通志29:23 both give U+3D57 㵗.
UTC-03286 ⿱禾氺 is also registered in U+25578 𥝸 + VS18. I suggest changing its status from FutureWS to Variant.
That's all.
Date/Time: Thu Nov 09 19:14:06 CST 2023
ReportID: ID20231109191406
Name: Eiso Chan
Report Type: Public Review Issue
Opt Subject: kCantonese for U+3150D
U+3150D 𱔍 is identified as UTC-00420. UAX #45 shows the following information:
UTC-00420;NoAction;;30.11;;⿰口兜;kCowles 4114*kMeyerWempe 3029*kCheungBauerIndex 375.08;;;
I checked kMeyerWempe 3029 and kCheungBauerIndex 375.08. Both show that the Cantonese reading should be dau3, and that it means “cave, den, nest”. We could add the kCantonese property value as below:
U+3150D kCantonese dau3
Currently the only source reference is UK-10751, so the UAX #45 record could be updated as below if possible:
UTC-00420;ExtH;UTC-00420;30.11;;⿰口兜;kCowles 4114*kMeyerWempe 3029*kCheungBauerIndex 375.08;;;
BTW, the submitted evidence from UK shows the Chinese Hakka-dialect usage.
Date/Time: Fri Nov 10 20:10:43 CST 2023
ReportID: ID20231110201043
Name: Ken Lunde
Report Type: Error Report
Opt Subject: UAX #45 USourceData.txt
The following two UAX #45 ideographs are encoded in Extension H:
UTC-00635;Rejected;;64.3;;⿰扌小;kCheungBauerIndex 406.01;;;3
UTC-00911;NoAction;;32.3;;⿰土干;Adobe-CNS1 C+17331;;6;
The first record should be changed to the following, which also includes a total stroke count (6):
UTC-00635;ExtH;U+317EA;64.3;;⿰扌小;kCheungBauerIndex 406.01;;6;3
The second one is a duplicate of UTC-02997, which is encoded at U+31587, so its record should be changed to the following, which also includes a first residual stroke (1):
UTC-00911;UTC-02997;;32.3;;⿰土干;Adobe-CNS1 C+17331;;6;1
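The change proposed for UTC-00635 can be checked field by field. The following is a minimal sketch, assuming only that the records are semicolon-delimited as quoted in this report; the helper name `changed_fields` is hypothetical, and fields are identified by zero-based position rather than by any official field name.

```python
def changed_fields(old: str, new: str) -> dict:
    """Return {field_index: (old_value, new_value)} for every field that differs."""
    old_parts = old.split(";")
    new_parts = new.split(";")
    if len(old_parts) != len(new_parts):
        raise ValueError("records have different field counts")
    return {
        index: (before, after)
        for index, (before, after) in enumerate(zip(old_parts, new_parts))
        if before != after
    }


# Current and proposed records for UTC-00635, copied from the report.
old = "UTC-00635;Rejected;;64.3;;⿰扌小;kCheungBauerIndex 406.01;;;3"
new = "UTC-00635;ExtH;U+317EA;64.3;;⿰扌小;kCheungBauerIndex 406.01;;6;3"

for index, (before, after) in changed_fields(old, new).items():
    print(index, repr(before), "->", repr(after))
```

For this pair, the differing positions are the status, the code point, and the total stroke count, which matches the description in the report.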
Date/Time: Sun Nov 12 19:40:53 CST 2023
ReportID: ID20231112194053
Name: Paul Masson
Report Type: Error Report
Opt Subject: kSemanticVariant for U+9452 鑒
This character already has one semantic variant listed in the database. Another appears to be U+9373 鍳 with one component missing, but the identical pronunciation in both Cantonese and Mandarin. Please update the database accordingly.
Date/Time: Sat Nov 18 10:15:13 CST 2023
ReportID: ID20231118101513
Name: Andrew West
Report Type: Error Report
Opt Subject: CJK Unified Ideographs Extension G code chart
It has recently been reported to me that the glyphs for U+30D91 (UK-02133) and U+30D94 (UK-02134) are not ideal as the 3rd and 4th strokes of the 谷 component on the left side should be joined (as is shown for the G-source characters U+30D93 and U+30D95). I have already supplied Michel with an updated font for these two characters, and request that the glyphs are updated for the Unicode 16.0 code chart.
Date/Time: Fri Nov 24 22:26:09 CST 2023
ReportID: ID20231124222609
Name: K T Shek
Report Type: Error Report
Opt Subject: Unihan_Variants.txt
Unihan_Variants.txt lists U+26552 (𦕒) as a variant of U+8998 (覘) and U+4993 (䦓):
---
U+4993 kSemanticVariant U+8998<kMeyerWempe U+26552<kMeyerWempe
U+8998 kSemanticVariant U+4993<kMeyerWempe U+26552<kMeyerWempe
U+26552 kSemanticVariant U+4993<kMeyerWempe U+8998<kMeyerWempe
---
I believe this is not correct. The correct variant of U+8998 (覘) should be U+4021 (䀡) instead of U+26552 (𦕒). The error is probably caused by the very similar printing of the 目 radical and the 耳 radical in MeyerWempe (e.g. in P.384-385 the radical 目 in 瞄 and 渺 is very close to a 耳), with the result that the publisher even misinterpreted 䀡 as 𦕒 (P.40, RSIndex P.91).
𦕒 shares no similar meaning with 䦓/覘/䀡. And according to Taiwan’s variant dictionary, 䦓, 覘, 䀡 are variants: https://dict.variants.moe.edu.tw/variants/rbt/word_attribute.rbt?quote_code=QjA0NTg5
But 𦕒 has no variant: https://dict.variants.moe.edu.tw/variants/rbt/word_attribute.rbt?quote_code=QzEwNzUz
I understand that it may not be possible to “fix” the reference in kMeyerWempe to U+4021 because it clearly states that the radical is 耳. But as the information provided by the book is in doubt, I still suggest considering the removal of the kSemanticVariant of U+26552 to avoid confusion or the spread of incorrect information. Thank you!
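To see how many entries the suggested removal would touch, the quoted lines can be scanned for U+26552. This is a minimal sketch, assuming whitespace-separated fields as they appear in this report (the layout of the actual data file may differ); `parse_variant_line` and `entries_involving` are hypothetical helper names.

```python
def parse_variant_line(line: str):
    """Split an entry into (subject, property, [target code points])."""
    subject, prop, *values = line.split()
    # Each value looks like "U+8998<kMeyerWempe"; keep only the code point.
    targets = [value.split("<", 1)[0] for value in values]
    return subject, prop, targets


def entries_involving(lines, code_point: str):
    """Return parsed entries whose subject or targets include code_point."""
    hits = []
    for line in lines:
        subject, prop, targets = parse_variant_line(line)
        if subject == code_point or code_point in targets:
            hits.append((subject, prop, targets))
    return hits


# The three entries quoted in the report.
LINES = [
    "U+4993 kSemanticVariant U+8998<kMeyerWempe U+26552<kMeyerWempe",
    "U+8998 kSemanticVariant U+4993<kMeyerWempe U+26552<kMeyerWempe",
    "U+26552 kSemanticVariant U+4993<kMeyerWempe U+8998<kMeyerWempe",
]

for subject, prop, targets in entries_involving(LINES, "U+26552"):
    print(subject, prop, targets)
```

All three quoted entries involve U+26552, either as the subject or as a target, so a removal would need to consider the back-references on U+4993 and U+8998 as well.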
Date/Time: Mon Dec 18 13:45:40 CST 2023
ReportID: ID20231218134540
Name: Lee Collins
Report Type: Error Report
Opt Subject: Unihan_Readings.txt
Two entries have the wrong transliteration from the kana くりや:
U+5E96 kJapaneseKun KURYA
U+5EDA kJapaneseKun KURYA
KURYA should be KURIYA.
Date/Time: Sun Jan 07 18:37:44 CST 2024
ReportID: ID20240107183744
Name: Paul Masson
Report Type: Error Report
Opt Subject: kMandarin for U+9244 鉄
This character is a semantic variant of U+9435 鐵. As such it should share the same pronunciation for this use, which is tiě, in addition to the existing pronunciation.
(None at this time.)
Date/Time: Tue Nov 07 14:09:48 CST 2023
ReportID: ID20231107140948
Name: Joe Hildebrand
Report Type: Error Report
Opt Subject: UAX #14
Summary: LB9 is unclear that the CM|ZWJ character is treated as if it does not exist for the purpose of matching subsequent rules.
LB9 currently states:
```
LB9 Do not break a combining character sequence; treat it as if it has the line breaking class of the base character in all of the following rules. Treat ZWJ as if it were CM.
Treat X (CM | ZWJ)* as if it were X.
where X is any line break class except BK, CR, LF, NL, SP, or ZW.
At any possible break opportunity between CM and a following character, CM behaves as if it had the type of its base character. Note that despite the summary title, this rule is not limited to standard combining character sequences. For the purposes of line breaking, sequences containing most of the control codes or layout control characters are treated like combining sequences.
```
When combined with the new rule LB28a:
```
LB28a Do not break inside the orthographic syllables of Brahmic scripts.
AP × (AK | ◌ | AS)
(AK | ◌ | AS) × (VF | VI)
(AK | ◌ | AS) VI × (AK | ◌)
(AK | ◌ | AS) × (AK | ◌ | AS) VF
```
and the following test from line 10287 of https://www.unicode.org/Public/15.1.0/ucd/auxiliary/LineBreakTest.txt:
```
× 1B18 ÷ 1B27 × 1B44 × 200C × 1B2B × 1B38 ÷ 1B31 × 1B44 × 1B1D × 1B36 ÷ # × [0.3] BALINESE LETTER CA (AK) ÷ [999.0] BALINESE LETTER PA (AK) × [28.12] BALINESE ADEG ADEG (VI) × [9.0] ZERO WIDTH NON-JOINER (CM1_CM) × [28.13] BALINESE LETTER MA (AK) × [9.0] BALINESE VOWEL SIGN SUKU (CM1_CM) ÷ [999.0] BALINESE LETTER SA SAPA (AK) × [28.12] BALINESE ADEG ADEG (VI) × [28.13] BALINESE LETTER TA LATIK (AK) × [9.0] BALINESE VOWEL SIGN ULU (CM1_CM) ÷ [0.3]
```
it becomes clear that the 200C in the input (linebreak class CM, affected by LB9) should not just be treated as if it had the linebreak class VI, but should not be included at ALL when trying to match LB28a.
When the 200C is treated as VI, the sequence would read AK VI VI AK, and would NOT match the third line of LB28a.
When the 200C is ignored entirely, the sequence would read AK VI AK, and WOULD match the third line of LB28a, as the test states.
Both of these are potentially valid readings of the current text in LB9. Before the addition of LB28a, there were no cases I can think of where the difference mattered. In a future version of the spec, the language in LB9 could be clarified to make interoperable implementation easier.
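The two readings can be compared mechanically. The following is a minimal sketch, not an implementation of UAX #14: it models only the four classes of 1B27 1B44 200C 1B2B from the test line, ignores ZWJ and the excluded base classes (BK, CR, LF, NL, SP, ZW), and uses a hypothetical single-letter encoding so that the third line of LB28a can be written as a regular expression.

```python
import re

# Hypothetical single-letter codes for line break classes (an assumption of
# this sketch). "D" stands for the dotted circle alternative in LB28a.
CODES = {"AK": "A", "AS": "S", "VI": "V", "VF": "F", "CM": "C", "DOTTED_CIRCLE": "D"}

# Third line of LB28a: (AK | dotted circle | AS) VI x (AK | dotted circle)
LB28A_LINE3 = re.compile(r"[ADS]V[AD]")


def inherit_base_class(classes):
    """Reading 1: each CM takes on the class of the character before it."""
    out = []
    for cls in classes:
        out.append(out[-1] if cls == "CM" and out else cls)
    return out


def drop_cm(classes):
    """Reading 2: X (CM)* is treated as X, so CM disappears from the sequence."""
    return [cls for cls in classes if cls != "CM"]


def matches_line3(classes):
    """True if the class sequence contains a match for the third LB28a line."""
    encoded = "".join(CODES[cls] for cls in classes)
    return LB28A_LINE3.search(encoded) is not None


# Classes of 1B27 1B44 200C 1B2B from the quoted test line.
sequence = ["AK", "VI", "CM", "AK"]

print(inherit_base_class(sequence), matches_line3(inherit_base_class(sequence)))
# ['AK', 'VI', 'VI', 'AK'] False
print(drop_cm(sequence), matches_line3(drop_cm(sequence)))
# ['AK', 'VI', 'AK'] True
```

Only the second reading reproduces the no-break result recorded in the test file for this position.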
Date/Time: Sat Nov 18 10:43:42 CST 2023
ReportID: ID20231118104342
Name: Mikhail Morozov
Report Type: Error Report
Opt Subject: The Unicode Standard, Version 15.1, Bamum Supplement Range: 16800–16A3F
There is a misspelling in the name of the character 𖠋 (U+1680B) BAMUM LETTER PHASE-A MAEMBGBIEE in https://www.unicode.org/charts/PDF/U16800.pdf. The proposal for encoding Old Bamum script (https://www.loc.gov/rr/amed/pdf/proposal-for-encoding-bamum-script.pdf#page=20) has IPA, English and French transcriptions for the letters, and it seems that the English transcription should be spelled with one B instead of two, MAEMGBIEE. The source for the proposal, L'Écriture des Bamum: sa naissance, son évolution, sa valeur phonétique, son utilisation, by I. Dugast and M.D.W. Jeffreys (https://www.calameo.com/read/000061616e47e713325db) also supports this opinion.
(None at this time.)
Date/Time: Mon Nov 13 07:42:24 CST 2023
ReportID: ID20231113074224
Name: Jakub Jelinek
Report Type: Error Report
Opt Subject: Unicode15.1.0/ch04.pdf
Note: This has already been taken care of by Editorial Committee, and is in the draft for Unicode 16.0.
I believe that Table 4-8, Name Derivation Rule Prefix Strings, should have a 2EBF0..2EE5D NR2 “CJK UNIFIED IDEOGRAPH-” line added, corresponding to the CJK Ideograph Extension I, First .. CJK Ideograph Extension I, Last addition in 15.1.
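For illustration, the requested table line would let a name be derived for any code point in that range by appending the code point in uppercase hexadecimal to the prefix. This is a minimal sketch covering only the range named in the report; the function name `derived_name_ext_i` is hypothetical.

```python
from typing import Optional

# Range and prefix taken from the line proposed in the report.
EXT_I_FIRST = 0x2EBF0
EXT_I_LAST = 0x2EE5D
PREFIX = "CJK UNIFIED IDEOGRAPH-"


def derived_name_ext_i(code_point: int) -> Optional[str]:
    """Return the prefix plus the hex code point for Extension I, else None."""
    if EXT_I_FIRST <= code_point <= EXT_I_LAST:
        return f"{PREFIX}{code_point:04X}"
    return None


print(derived_name_ext_i(0x2EBF0))  # CJK UNIFIED IDEOGRAPH-2EBF0
print(derived_name_ext_i(0x2EE5E))  # None (just past the end of the range)
```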
Date/Time: Sun Jan 07 09:40:18 CST 2024
ReportID: ID20240107094018
Name: Charlotte Buff
Report Type: Other Document Submission
Opt Subject: L2/23-193r3: Name of one chemical arrow
U+1F8D6 was accepted for a future update under the name LONG RIGHTWARDS ARROW WITH THROUGH X (cf. 177-C33). The “WITH” is erroneous; the character should be called LONG RIGHTWARDS ARROW THROUGH X. This would also synchronise the name with the existing U+2947 RIGHTWARDS ARROW THROUGH X.
(None at this time.)