The sections below contain links to permanent feedback documents for the open Public Review Issues as well as other public feedback as of October 26, 2023, since the previous cumulative document was issued prior to UTC #176 (July 2023).
The links below go directly to open PRIs and to feedback documents for them, as of October 26, 2023.
Issue | Name | Feedback Link
483 | Proposed Update UAX #38, Unicode Han Database (Unihan) | (feedback)
482 | Proposed Draft UTR #56, Unicode Cuneiform Sign Lists | (feedback) No feedback at this time
479 | Proposed Update UTS #53, Unicode Arabic Mark Rendering | (feedback) No feedback at this time
The links below go to locations in this document for feedback.
Feedback routed to CJK & Unihan Group for evaluation [CJK]
Feedback routed to Script ad hoc for evaluation [SAH]
Feedback routed to Properties & Algorithms Group for evaluation [PAG]
Feedback routed to Emoji SC for evaluation [ESC]
Feedback routed to Editorial Committee for evaluation [EDC]
Other Reports
Date/Time: Thu Aug 10 17:21:54 CDT 2023
ReportID: ID20230810172154
Name: Ken Lunde
Report Type: Error Report
Opt Subject: Proposed kRSUnicode property value changes and additions
The following two ideographs are structurally the same as U+9F52 齒, but are missing two strokes:
U+2398A 𣦊
U+2EBBD 𮮽
Their kRSUnicode property values are currently as follows:
U+2398A kRSUnicode 77.9
U+2EBBD kRSUnicode 211.0
I recommend that they be changed to the following, which involves adding a second property value and changing the number of residual strokes for Radical #211 from 0 to -2:
U+2398A kRSUnicode 77.9 211.-2
U+2EBBD kRSUnicode 211.-2 77.9
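To make the proposed change concrete, here is a minimal Python sketch that splits kRSUnicode values into radical number and residual stroke count, including the negative count suggested above. The regular expression and the helper name `parse_rs` are assumptions made for illustration, loosely modeled on the value shapes quoted in the report; the normative syntax is the one given in UAX #38.

```python
import re

# Assumed shape of one kRSUnicode value: radical number, optional apostrophes
# (used for simplified radical forms), a dot, and a residual stroke count that
# may be negative. This pattern is illustrative, not the normative syntax.
RS_VALUE = re.compile(r"^([1-9][0-9]{0,2})('*)\.(-?[0-9]{1,2})$")

def parse_rs(field):
    """Parse a space-separated kRSUnicode field.

    Returns a list of (radical, apostrophe_count, residual_strokes) tuples.
    """
    parsed = []
    for value in field.split():
        m = RS_VALUE.match(value)
        if m is None:
            raise ValueError(f"unrecognized kRSUnicode value: {value!r}")
        parsed.append((int(m.group(1)), len(m.group(2)), int(m.group(3))))
    return parsed

# Current and proposed values, copied from the report.
current = {"U+2398A": "77.9", "U+2EBBD": "211.0"}
proposed = {"U+2398A": "77.9 211.-2", "U+2EBBD": "211.-2 77.9"}

for cp in current:
    print(cp, parse_rs(current[cp]), "->", parse_rs(proposed[cp]))
```

A consumer that assumes residual stroke counts are non-negative would need to accept the leading minus sign for values such as 211.-2.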
Date/Time: Sun Jul 30 17:08:21 CDT 2023
ReportID: ID20230730170821
Name: Paul Masson
Report Type: Error Report
Opt Subject: kPhonetic for U+5807
I have been in direct contact with Ken Lunde and we both agree this value should be 574 as appears on p.85 of Casey. I am submitting this feedback to provide tracking for the change.
Date/Time: Wed Oct 11 15:27:13 CDT 2023
ReportID: ID20231011152713
Name: Eduardo Marín Silva
Report Type: Other Document Submission
Opt Subject: Suggestion to change the name of certain provisionally assigned characters
These characters are provisionally assigned: LEFTWARDS ARROW FROM DOWNWARDS ARROW and RIGHTWARDS ARROW FROM DOWNWARDS ARROW. I would agree with this naming if the downwards arrow portion were smaller than the leftwards/rightwards portion, but the opposite is true. Consider 21A6 ↦ RIGHTWARDS ARROW FROM BAR, where the "bar" part is smaller; these names make me think of that character, but with a downwards arrowhead on the bar. I think more intuitive names would be DOWNWARDS ARROW WITH BRANCHING LEFTWARDS/RIGHTWARDS ARROW. I also disagree with the name BENGALI LETTER ALTERNATE BA: considering that it is exclusively for Pali/Sanskrit orthographies, a less confusing name could be BENGALI LETTER PALI-SANSKRIT BA. The regular ba should also have a note stating "only represents va in Pali or Sanskrit". I also suggest that the Sidetic block characters be named based on their sound if they have been deciphered, and with the NXX otherwise. This would declutter the chart by not having a bunch of informative aliases. Formal aliases can be added if other letters are deciphered after encoding.
Date/Time: Tue Aug 15 12:10:59 CDT 2023
ReportID: ID20230815121059
Name: David Carlisle
Report Type: Error Report
Opt Subject: TR25 UNICODE SUPPORT FOR MATHEMATICS
https://unicode.org/reports/tr25/
Some of the Math Classifications in the MathClass-15 data file associated with TR25 seem incorrect.
https://www.unicode.org/Public/math/revision-15/MathClassEx-15.txt
23B0;R;⎰;lmoust;ISOAMSC;;UPPER LEFT OR LOWER RIGHT CURLY BRACKET SECTION
23B1;R;⎱;rmoust;ISOAMSC;;UPPER RIGHT OR LOWER LEFT CURLY BRACKET SECTION
27C5;R;⟅;;;;LEFT S-SHAPED BAG DELIMITER
27C6;R;⟆;;;;RIGHT S-SHAPED BAG DELIMITER
These are classified as R (infix relation, TeX \mathrel) when it would seem more appropriate to use O and C (\mathopen, \mathclose), which are the assignments currently made by LaTeX. I'm doing a systematic comparison with LaTeX unicode-math; there are other differences, as detailed in this GitHub issue: https://github.com/wspr/unicode-math/issues/619#issuecomment-1678594025 However, in some of these cases we may choose to change the TeX settings or simply document the differences. For example, as listed in that issue, TeX traditionally makes the daggers U+2020 and U+2021 binary operators (B), not relations (R); the relation class would give them more space.
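The comparison described above can be sketched in Python as follows. This is only an illustration: it parses the four semicolon-separated lines quoted in the report and flags entries whose class differs from an expected value. The `EXPECTED` table is an assumption (left-hand forms taken as opening, right-hand forms as closing), the function names are hypothetical, and the sketch handles only single-code-point lines like those quoted.

```python
# The four lines quoted in the report, copied from MathClassEx-15.txt.
QUOTED = """\
23B0;R;⎰;lmoust;ISOAMSC;;UPPER LEFT OR LOWER RIGHT CURLY BRACKET SECTION
23B1;R;⎱;rmoust;ISOAMSC;;UPPER RIGHT OR LOWER LEFT CURLY BRACKET SECTION
27C5;R;⟅;;;;LEFT S-SHAPED BAG DELIMITER
27C6;R;⟆;;;;RIGHT S-SHAPED BAG DELIMITER
"""

# Assumed expected classes (O = opening, C = closing). This mapping is an
# illustration of the report's suggestion, not data from any Unicode file.
EXPECTED = {0x23B0: "O", 0x23B1: "C", 0x27C5: "O", 0x27C6: "C"}

def parse_line(line):
    """Return (code point, math class, character name) for one data line."""
    fields = line.split(";")
    return int(fields[0], 16), fields[1], fields[-1]

def mismatches(text, expected):
    """List entries whose class in the data differs from the expected class."""
    found = []
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        cp, math_class, name = parse_line(line)
        want = expected.get(cp)
        if want is not None and want != math_class:
            found.append((f"U+{cp:04X}", math_class, want, name))
    return found

for row in mismatches(QUOTED, EXPECTED):
    print(row)
```

The same loop could be pointed at a table of TeX-side assignments to produce the kind of systematic difference list mentioned in the report.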
Date/Time: Tue Aug 22 05:57:09 CDT 2023
ReportID: ID20230822055709
Name: Andrew West
Report Type: Error Report
Opt Subject: DerivedNumericValues.txt
For Unicode 15.1, there is a discrepancy between the numeric values for U+5146 and U+79ED as given in DerivedNumericValues.txt (single value only) and Unihan and ucd.xml (two values each):
https://www.unicode.org/Public/draft/UCD/ucd/extracted/DerivedNumericValues.txt:
5146 ; 1000000.0 ; ; 1000000 # Lo CJK UNIFIED IDEOGRAPH-5146
79ED ; 1000000000.0 ; ; 1000000000 # Lo CJK UNIFIED IDEOGRAPH-79ED
Unihan_NumericValues.txt:
U+5146 kPrimaryNumeric 1000000 1000000000000
U+79ED kPrimaryNumeric 1000000000 1000000000000
ucd.nounihan.flat.xml:
<char cp="5146" age="1.1" na="CJK UNIFIED IDEOGRAPH-#" JSN="" gc="Lo" ccc="0" dt="none" dm="#" nt="Nu" nv="1000000 1000000000000" .../>
<char cp="79ED" age="1.1" na="CJK UNIFIED IDEOGRAPH-#" JSN="" gc="Lo" ccc="0" dt="none" dm="#" nt="Nu" nv="1000000000 1000000000000" .../>
The derived numeric value should be based on kPrimaryNumeric:
# Derived Property: Numeric_Value
# Field 1:
# The values are based on field 8 of UnicodeData.txt, plus the fields
# kAccountingNumeric, kOtherNumeric, kPrimaryNumeric in the Unicode Han Database (Unihan).
# The derivations for these values are as follows.
# Numeric_Value = the value of kAccountingNumeric, kOtherNumeric, or kPrimaryNumeric, if they exist; otherwise
# Numeric_Value = the value of field 8, if it exists; otherwise
# Numeric_Value = NaN
However, the format of the file only allows for a single value. My personal opinion is that Numeric_Value should always be a single value, even in cases such as U+5146 and U+79ED where there are alternative interpretations of the numeric value, otherwise implementations which rely on UCD data to apply numeric value (e.g. for numeric sorting) will not know which of the space-separated list of numeric values to apply. My preferred solution would be:
1. Allow multiple alternative numeric values in the Unihan database only (i.e. no change to kPrimaryNumeric for U+5146 and U+79ED);
2. Allow only a single numeric value for Numeric_Value in DerivedNumericValues.txt, selecting the most widely-used modern interpretation for U+5146 and U+79ED, and modifying accordingly the stated derivation for the value given in Field 1;
3. Derive the "nv" value in ucd.xml from Numeric_Value in DerivedNumericValues.txt.
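The single-value derivation proposed above could be sketched as follows in Python. This is a hedged illustration, not the tooling actually used to generate the UCD: the function name `numeric_value` and the `preferred` parameter are hypothetical, and the fallback of taking the first listed Unihan value is an assumption standing in for "the most widely-used modern interpretation", which the report does not specify.

```python
import math
from fractions import Fraction

UNIHAN_KEYS = ("kAccountingNumeric", "kOtherNumeric", "kPrimaryNumeric")

def numeric_value(unihan_fields, field8=None, preferred=None):
    """Derive one Numeric_Value following the order in the quoted file header.

    unihan_fields: dict mapping Unihan field names to their raw string values,
                   which may hold several space-separated alternatives.
    field8:        field 8 of UnicodeData.txt as a string (e.g. "1/2"), or None.
    preferred:     hypothetical override naming which alternative to select.
    """
    for key in UNIHAN_KEYS:
        raw = unihan_fields.get(key)
        if raw:
            candidates = [int(v) for v in raw.split()]
            if preferred is not None:
                if preferred not in candidates:
                    raise ValueError(f"{preferred} is not listed in {key}")
                return float(preferred)
            # Assumption: without an override, take the first listed value.
            return float(candidates[0])
    if field8:
        return float(Fraction(field8))
    return math.nan

# Values copied from the Unihan_NumericValues.txt lines quoted in the report.
print(numeric_value({"kPrimaryNumeric": "1000000 1000000000000"}))
print(numeric_value({"kPrimaryNumeric": "1000000000 1000000000000"}))
```

Keeping the choice in one place means that a file such as ucd.xml could take its "nv" value from the derived result rather than from the multi-valued Unihan field.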
Date/Time: Thu Jul 20 19:15:14 CDT 2023
ReportID: ID20230720191514
Name: Leroy D. Geisse V.
Report Type: Website Problem
[UCD, Index.txt, 16.0]
Opt Subject: Missing character name variant
I think that this is a minor issue. Regards. By searching for "cursor" in the Character Name Index (https://www.unicode.org/charts/charindex.html), I found that the variant "down, fast cursor" is missing. The entries found are:
cursor down, fast
cursor left, fast
cursor right, fast
cursor up, fast
fast cursor down
fast cursor left
fast cursor right
fast cursor up
left, fast cursor
right, fast cursor
up, fast cursor
Date/Time: Tue Sep 26 05:26:46 CDT 2023
ReportID: ID20230926052646
Name: Henri Sivonen
Report Type: Error Report
Opt Subject: UTS #10
Hi, https://www.unicode.org/reports/tr10/tr10-49.html#Other_Applications_of_Collation has this sentence: “For example, if v and w are treated as identical base letters in Swedish sorting, then they should also be treated the same for searching.” This example has become obsolete. See https://unicode-org.atlassian.net/browse/CLDR-17050 and links backwards from there to issues and CLDR changesets concerning both Swedish and Finnish search collations. (Perhaps it could be mentioned instead that ä and å are primary-distinct from a in Swedish.) Henri Sivonen
Emoji SC [ESC]: None at this time.
Date/Time: Sun Sep 17 02:08:23 CDT 2023
ReportID: ID20230917020823
Name: Lim Hian-tong
Report Type: Error Report
Opt Subject: Chapter 18 of the Unicode Standard, Version 15.0.0
Chapter 18 of the Unicode Standard contains information about dialects of Chinese that does not reflect the actual situation, as described below. Please consider rewriting the segments that are apparently incorrect or inaccurate. On page 747, it is claimed that speakers of Chinese languages other than Mandarin learn to read and write Mandarin pronouncing it with the rules of their own language, which, although still practiced in Hong Kong and Macao, does not apply to most parts of the Chinese-speaking world. The majority of Chinese-medium schools in the 21st century teach written Mandarin text with Mandarin pronunciation exclusively, regardless of students’ dialectal backgrounds. The situation likened to Spanish children pronouncing French text as if it were Spanish hardly happens anymore outside Hong Kong and Macao. Another paragraph on page 747 states that modern Chinese languages are almost never seen in printed form except for Cantonese, which is not the case. A significant example of various modern Chinese languages in printed form is Taiwan, where lessons in the Taiwanese, Hakka and Matsu languages, conducted with standardized writing systems of these languages, have been made available to every single student for more than two decades. Similar approaches to teaching local languages with orthographies based on their respective vernacular forms exist in certain schools in the PRC as well. Page 765 describes the use of Bopomofo letters for the phonetic representation of southern Chinese dialects as “never fully standardized,” despite the fact that a set of Bopomofo symbols designed for the Matsu dialect is officially in use on the Matsu islands.
Date/Time: Mon Sep 25 10:25:54 CDT 2023
ReportID: ID20230925102554
Name: Philippe Verdy
Report Type: Error Report
Opt Subject: /charts/PDF/U10600.pdf
Note: This has already been corrected in the 16.0 annotations draft for the names list.
The annotations for two Linear A characters seem to be obviously wrong:
U+10703 𐜃 LINEAR A SIGN A600 • 10762 𐝢 a802, 10741 𐝁 a702 b
U+10704 𐜄 LINEAR A SIGN A601 • 10762 𐝢 a802, 10748 𐝈 a709 l
The "base" character described should be U+10764 𐝤 linear A sign A804, instead of U+10762 𐝢 linear A sign A802:
U+10703 𐜃 LINEAR A SIGN A600 • 10764 𐝤 a804, 10741 𐝁 a702 b
U+10704 𐜄 LINEAR A SIGN A601 • 10764 𐝤 a804, 10748 𐝈 a709 l
This bug occurs in the two charts:
* /charts/PDF/U10600.pdf
* /charts/fr/PDF/U10600.pdf
and is also present in the NamesList.txt file in the UCD containing annotations, used to automatically generate these charts:
* /Public/UCD/latest/ucd/NamesList.txt
10703 LINEAR A SIGN A600 * 10762 a802, 10741 a702 b
10704 LINEAR A SIGN A601 * 10762 a802, 10748 a709 l
which should be:
10703 LINEAR A SIGN A600 * 10764 a804, 10741 a702 b
10704 LINEAR A SIGN A601 * 10764 a804, 10748 a709 l
Date/Time: Mon Oct 16 12:01:45 CDT 2023
ReportID: ID20231016120145
Name: Denny Vrandečić
Report Type: Error Report
Opt Subject: Unicode Standard 15.0
Page 10, Section 2.1 (Architectural Context), Subsection "Text Elements, Characters, and Text Processes", contains the following sentence: "For example, in traditional German orthography, the letter combination “ck” is a text element for the process of hyphenation (where it appears as “k-k”), but not for the process of sorting." The phrase "traditional German orthography" seems confusing here, given that this has not been the case since the 1996 orthographic reform, i.e. more than a quarter century ago. I would suggest changing the term to either "pre-1996 German orthography" (to be explicit) or "(old / former / previous / dated) German orthography" (to avoid the potentially loaded term "traditional").
Other Reports: None at this time.