This page is a compilation of formal public feedback received so far. See Feedback for further information on this issue, how to discuss it, and how to provide feedback.
Date/Time: Fri Feb 09 14:30:55 CST 2024
ReportID: ID20240209143055
Name: Denis Moyogo Jacquerye
Report Type: Public Review Issue
Opt Subject: 497 [EDC]
The glyph of ੴ U+0A74 GURMUKHI EK ONKAR was updated in Unicode 11.0. See https://www.unicode.org/charts/PDF/Unicode-11.0/U110-0A00.pdf and the error report "Error with rendering ੴ (U+0A74)" by Harkeerat Toor in http://www.unicode.org/L2/L2016/16123-pubrev.html.
The Unicode Standard 10.0, 11.0 and later versions still have the same text in chapter 12.3 Gurmukhi:
> Other Symbols. The religious symbol khanda sometimes used in Gurmukhi texts is encoded
> at U+262C ADI SHAKTI in the Miscellaneous Symbols block. U+0A74 GURMUKHI
> EK ONKAR, which is also a religious symbol, can have different presentation forms, which
> do not change its meaning. The font used in the code charts shows a highly stylized form;
> simpler forms look like the digit one, followed by a sign based on ura, along with a long
> upper tail.
The statement "The font used in the code charts shows a highly stylized form" has not been true since 11.0. The last sentence could be changed to: "The font used in the code charts shows a simpler form that looks like the digit one, followed by a sign based on ura, along with a long upper tail; other forms may be highly stylized."
Date/Time: Tue Feb 13 10:18:01 CST 2024
ReportID: ID20240213101801
Name: Max Blechman
Report Type: Public Review Issue
Opt Subject: 497 [SAH]
I am writing today to express my support for certain characters that have been proposed for addition in the Latin Extended-G block, namely those from the Initial Teaching Alphabet. There have been multiple proposals to add the ITA to Unicode, and since it was historically used to publish many children’s books and is still today used for students who have issues with traditional English spelling, its addition to Unicode would be extremely useful for its users, of which there are still many. Thank you for your time and consideration.
Date/Time: Tue Feb 13 19:17:11 CST 2024
ReportID: ID20240213191711
Name: Eiso Chan
Report Type: Public Review Issue
Opt Subject: 497 [EDC]
This year, Chinese media use the term “the year of loong” (龙年/龍年), not “the year of dragon”. See
https://english.news.cn/20240210/ce190d57cd8a405db28e034ade839063/c.html
https://news.cgtn.com/news/2024-01-22/Where-did-China-s-mythic-loong-come-from--1qzMho0EXxm/p.html
The term “loong” is increasingly common for the Chinese word 龙/龍, whose meaning differs from the original meaning of “dragon” in English. It would be better to add the following annotation for both U+1F409 🐉 and U+1F432 🐲:
* also used for loong in Chinese
Date/Time: Tue Feb 13 20:19:17 CST 2024
ReportID: ID20240213201917
Name: Bryndan Meyerholt
Report Type: Public Review Issue
Opt Subject: 497 [EDC]
The character OL ONAL SIGN HODDOND should probably go under the Various signs section instead of the digits section as it appears to be used as a sign/diacritic mark instead of a digit in Ol Onal. Also check the Wikipedia article of Ol Onal, and scroll down until you see an image with the caption Ol Onal Script.
Date/Time: Wed Feb 14 15:25:32 CST 2024
ReportID: ID20240214152532
Name: Norbert Lindenberg
Report Type: Public Review Issue
Opt Subject: 497 [PAG]
The UCD in Unicode 16.0 alpha defines Indic syllabic and positional categories for the Kirat Rai script. The final proposal for Kirat Rai, L2/22-043R, does not provide such data. I don’t think the omission of Indic data in the proposal was an oversight. The proposal states: “The script does not have the rendering complexity of traditional Brahmic scripts (no reordering, no combining marks, and no conjuncts).” This means, a simple visual encoding model, where spacing characters are encoded in left-to-right order, is sufficient for the script and is intended. Indic data, which implies a phonetic encoding order, should not be added. The phonetic encoding model used for most Brahmic scripts and the visual encoding model used for most non-Brahmic scripts are fundamentally incompatible and should never be combined. Even for a very simple script like Kirat Rai, there’s a slight potential for conflicts between the visual and phonetic encoding order based on the Brahmic cluster model of the OpenType Universal Shaping Engine: Any sequence of characters with gc=Lm (of which Kirat Rai has five) would become part of a single cluster and would have to be encoded in primarily phonetic order. Any data for Kirat Rai should be removed from IndicSyllabicCategory.txt and IndicPositionalCategory.txt.
Date/Time: Wed Feb 14 17:11:11 CST 2024
ReportID: ID20240214171111
Name: Karl Pentzlin
Report Type: Error Report
Opt Subject: UnicodeStandard-15.0.pdf [EDC]
Table 22-4 "Compatibility Digits" (p. 862): in the line "Circled digits", the column "Code Range(s)" should read "24EA, 2460..2468" instead of "24EA, 2080..2089".
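The ranges named in this report can be checked against character names. The following is a minimal sketch, assuming Python's standard `unicodedata` module, whose data reflects whichever Unicode version the interpreter bundles; the helper name `names_in_range` is illustrative and not part of the report.

```python
import unicodedata


def names_in_range(first: int, last: int) -> list[str]:
    """Return the character names for the inclusive code point range."""
    return [unicodedata.name(chr(cp)) for cp in range(first, last + 1)]


# Range proposed in the report: U+24EA plus U+2460..U+2468.
proposed = [unicodedata.name("\u24EA")] + names_in_range(0x2460, 0x2468)
# Range currently printed in the table: U+2080..U+2089.
printed = names_in_range(0x2080, 0x2089)

# Show whether each range really consists of circled digits.
print(all(name.startswith("CIRCLED DIGIT") for name in proposed))
print(all(name.startswith("CIRCLED DIGIT") for name in printed))
```

If the report is right, the first check should succeed and the second should fail, because U+2080..U+2089 are the subscript digits.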
Date/Time: Fri Feb 16 13:51:49 CST 2024
ReportID: ID20240216135149
Name: Charlotte Buff
Report Type: Other Document Submission
Opt Subject: Bidi class of Nabla variants [PAG]
I propose changing the Bidi_Class value of the following characters to Other_Neutral from their current value Left_to_Right:
U+1D6C1 MATHEMATICAL BOLD NABLA
U+1D6FB MATHEMATICAL ITALIC NABLA
U+1D735 MATHEMATICAL BOLD ITALIC NABLA
U+1D76F MATHEMATICAL SANS-SERIF BOLD NABLA
U+1D7A9 MATHEMATICAL SANS-SERIF BOLD ITALIC NABLA
U+2207 NABLA has Bidi_Class=Other_Neutral, so its font variants should share the same property value. This is how it already works for U+2202 PARTIAL DIFFERENTIAL and its respective font variants, all of which are Other_Neutral.
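The comparison described in this report can be reproduced roughly as follows. This is a sketch, assuming Python's standard `unicodedata` module; the values it prints depend on the Unicode version bundled with the interpreter, so they may or may not match the state the report describes. The names `NABLA_VARIANTS` and `compare_with_base` are illustrative.

```python
import unicodedata

# The five font variants of NABLA listed in the report.
NABLA_VARIANTS = [0x1D6C1, 0x1D6FB, 0x1D735, 0x1D76F, 0x1D7A9]


def compare_with_base(cp: int) -> tuple[str, str, str]:
    """Return (variant bidi class, base character, base bidi class).

    The base character is taken from the <font> compatibility
    decomposition of the variant.
    """
    ch = chr(cp)
    # decomposition() returns e.g. "<font> 2207"; the last field is the base.
    base = chr(int(unicodedata.decomposition(ch).split()[-1], 16))
    return unicodedata.bidirectional(ch), base, unicodedata.bidirectional(base)


for cp in NABLA_VARIANTS:
    variant_class, base, base_class = compare_with_base(cp)
    print(f"U+{cp:04X}: {variant_class} (base U+{ord(base):04X}: {base_class})")
```

Any line where the two classes differ is a character affected by the proposal.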
Date/Time: Fri Feb 16 14:01:04 CST 2024
ReportID: ID20240216140104
Name: Charlotte Buff
Report Type: Public Review Issue
Opt Subject: 497: Ideographic property value of U+18CFF [PAG]
U+18CFF KHITAN SMALL SCRIPT CHARACTER-18CFF currently has the property Ideographic=No, but the value should be Yes like with all other Khitan Small Script characters.
Date/Time: Sat Feb 17 10:54:36 CST 2024
ReportID: ID20240217105436
Name: Judith Chen
Report Type: Public Review Issue
Opt Subject: 497 [CJK]
Considering that the KP-source glyph KP1-3413 of U+4E17 丗, which was added in Unicode 15.1, is identical to the G-source glyph GKX-0077.13 of U+2000D 𠀍 rather than other representative glyphs of U+4E17 丗, it might be a good idea if Unicode moved KP1-3413 from U+4E17 丗 to U+2000D 𠀍.
Date/Time: Sun Feb 18 00:40:09 CST 2024
ReportID: ID20240218004009
Name: Judith Chen
Report Type: Public Review Issue
Opt Subject: 497 [EDC, RMG]
Note: This issue has been fixed in draft as of 2024-02-27.
As the page *Proposed New Characters: The Pipeline* shows, 8 Standardized Variation Sequences of 4 characters in the block *General Punctuation* have been accepted for Unicode and appeared in the Unicode 16.0 Alpha Code Charts. However, this was not reflected in the Unicode 16.0 Delta Code Charts. As a comparison, there were several SVSes in the block *CJK Symbols and Punctuation* and *Halfwidth and Fullwidth Forms* introduced in Unicode 12.0, and the codepoints affected were all listed under the part *Glyph and Variation Sequence Changes* in the Unicode 12.0 Delta Code Charts. Therefore, I recommend that Unicode explicitly list all the codepoints related to newly added SVSes in the Unicode 16.0 Delta Code Charts.
Date/Time: Sun Feb 18 07:23:51 CST 2024
ReportID: ID20240218072351
Name: Judith Chen
Report Type: Public Review Issue
Opt Subject: 497 [CJK]
According to IRG N2276 by Jaemin Chung, there used to be some "pseudo-G8 characters" in URO. The problem was partially solved in Unicode 13.0 by changing their kIRG_GSource values into self-referring GU sources. However, there are still two problems related to these "pseudo-G8 characters":
1. Characters like U+8980 覀, U+7CA6 粦, U+4E85 亅, U+5570 啰 should not be changed to GU sources. Unlike other "pseudo-G8 characters", these characters do exist in GB 8565.2-88 and are, hence, expected to have a G8 source. The reason why N2276 also mentioned these characters is that their kIRG_GSource and kGB8 values are wrong. As Tianheng Shen wrote in IRG N2542, "It seems that these characters do not have a normal G-source, as if they are not used in the mainland of China." However, the fact is that U+5570 啰 is listed as a level 1 character in China's 通用规范汉字表 (Table of General Standard Chinese Characters), which means that it is a common character in China. Therefore, I suggest that the kIRG_GSource values of these four characters be changed as follows, based on N2276:
- U+8980 覀: GU-08980 -> G8-2F7A
- U+7CA6 粦: GU-07CA6 -> G8-2F7B
- U+4E85 亅: GU-04E85 -> G8-2F7C
- U+5570 啰: GU-05570 -> G8-2F7D
2. N2276 also requested to modify the kGB8 values of some characters. Unfortunately, this was not realised in Unicode 13.0. I recommend removing the kGB8 values in the range 0883-0894, 1201-1294, and 1351-1394 from the Unihan database and correcting the kGB8 values of the following five characters, based on N2276:
- U+9B25 鬥: 0893 -> 1589
- U+8980 覀: 1589 -> 1590
- U+7CA6 粦: 1590 -> 1591
- U+4E85 亅: 1591 -> 1592
- U+5570 啰: 1592 -> 1593
Date/Time: Wed Feb 21 20:00:45 CST 2024
ReportID: ID20240221200045
Name: fantasai
Report Type: Error Report
Opt Subject: Unicode 15.1 U+1F??? [EDC, ESC]
When reviewing some tests, I was told that Unicode and the ESC intend that the family sequences constructed from gendered people symbols should be deprecated and rendered equivalently to the new gender-neutral sequences, _with the intent that users no longer perceive any differences among these encoding sequences_. If that is the expectation, then the UTC should
a) document this intent and their equivalence in Chapter 22 (Symbols), not just in dated memos from ESC to UTC
b) capture this canonicalization in mapping tables as appropriate
If some implementations treat the gendered forms as distinct and others don't, this can create interop problems. And if users are intended to not perceive any differences among these sequences, then they shouldn't encounter any during search, collation, etc. either. ~fantasai
Date/Time: Fri Feb 23 14:41:37 CST 2024
ReportID: ID20240223144137
Name: Diggory Hardy
Report Type: Error Report
Opt Subject: TR9 [PAG]
In TR9, version 15.1.0, section 3.3.3 - http://www.unicode.org/reports/tr9/#Preparations_for_Implicit_Processing It is implied that rules X1-X8 assign embedding levels to characters based only on the paragraph level and explicit formatting tokens, but that these levels will soon be adjusted based on characters' "implicit bidirectional types". X9 does not mention adjusting characters' levels. X10, point 1 does not either. It does however imply that level runs should already have been calculated, and thus that character embedding levels should already have been adjusted. Furthermore, I do not see any explanation of the calculation of embedding levels, only examples. Is it possible that this part of the specification got lost in a re-organisation? By the way, I do not find the mixture of prose, algorithms and examples used in this article the easiest to follow, but do not have strong suggestions (only that specifications usually do not bother discussing optimisations which may be applied to implementations).
Date/Time: Wed Feb 28 05:27:30 CST 2024
ReportID: ID20240228052730
Name: Aditya Bayu Perdana
Report Type: Error Report
Opt Subject: Unicode Standard version 15.0.0 chapter 17 [EDC,SAH]
Referring to UTN #51, the Balinese script section of Unicode Standard version 15.0.0 chapter 17 https://www.unicode.org/versions/Unicode15.0.0/ch17.pdf needs to be updated in some aspects:
[editorial change]
page 716-717. The so-called Sasak characters are relatively recent creations that have not gained common currency. This should be explicitly mentioned.
page 719-720. The section of musical symbols should refer to UTN#51 for more information
[technical change]
page 717, table 17.3. There’s no reference outside the Unicode Standard and proposal L2/05-008 for the conjunct forms of the Sasak characters, so it’s totally unclear where table 17.3 comes from and whether these conjunct forms were ever used anywhere. The proposal itself says “[The Sasak characters] conjunct forms remain to be verified”. As far as we know, they have not been verified in the 19 years since then. The table should be removed.
Date/Time: Thu Feb 29 10:20:19 CST 2024
ReportID: ID20240229102019
Name: Elliott Hughes
Report Type: Error Report
Opt Subject: Unicode15.0.0/ch18.pdf [EDC]
Table 18-3's Korean column says "ci" rather than the usual "ji" for earth, and "swu" rather than the usual "su" for water. It seems weird to use Yale romanization here but then the modern Revised Romanization in the algorithm that converts precomposed characters to their names.
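For reference, the name-generation algorithm mentioned in this report can be sketched as below, following the arithmetic decomposition of precomposed Hangul syllables and the Jamo short names. Two assumptions are not from the report: that the syllables in question are 지 (earth) and 수 (water), and the function name `hangul_syllable_name`.

```python
import unicodedata

# Jamo short names, in the order used by the Hangul syllable
# name generation algorithm (leading consonants, vowels, trailing consonants).
LEADS = ["G", "GG", "N", "D", "DD", "R", "M", "B", "BB", "S", "SS", "",
         "J", "JJ", "C", "K", "T", "P", "H"]
VOWELS = ["A", "AE", "YA", "YAE", "EO", "E", "YEO", "YE", "O", "WA", "WAE",
          "OE", "YO", "U", "WEO", "WE", "WI", "YU", "EU", "YI", "I"]
TRAILS = ["", "G", "GG", "GS", "N", "NJ", "NH", "D", "L", "LG", "LM", "LB",
          "LS", "LT", "LP", "LH", "M", "B", "BS", "S", "SS", "NG", "J", "C",
          "K", "T", "P", "H"]

S_BASE = 0xAC00                       # first precomposed syllable
N_COUNT = len(VOWELS) * len(TRAILS)   # syllables per leading consonant (588)


def hangul_syllable_name(ch: str) -> str:
    """Derive the character name of a precomposed Hangul syllable."""
    s_index = ord(ch) - S_BASE
    if not 0 <= s_index < len(LEADS) * N_COUNT:
        raise ValueError("not a precomposed Hangul syllable")
    lead = LEADS[s_index // N_COUNT]
    vowel = VOWELS[(s_index % N_COUNT) // len(TRAILS)]
    trail = TRAILS[s_index % len(TRAILS)]
    return "HANGUL SYLLABLE " + lead + vowel + trail


# Assumed syllables for "earth" and "water" (not stated in the report).
print(hangul_syllable_name("지"))  # HANGUL SYLLABLE JI
print(hangul_syllable_name("수"))  # HANGUL SYLLABLE SU
```

The derived names use "JI" and "SU", which is the contrast with "ci" and "swu" that the report points out.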
Date/Time: Sun Mar 03 22:12:23 CST 2024
ReportID: ID20240303221223
Contact: eisoch@126.com
Name: Eiso Chan
Report Type: Public Review Issue
Opt Subject: 497 [CJK]
The J-Source glyphs for U+2011E 𠄞 (JMJ-030462), U+2011F 𠄟 (JMJ-030463) and U+20120 𠄠 (JMJ-030464) are designed for Sung/Ming style. The Moji Joho Kiban database pages show they are really Han characters/Kanji, so the J glyphs should be updated.
Date/Time: Sun Mar 10 04:57:25 CDT 2024
ReportID: ID20240310045725
Name: Michel Mariani
Report Type: Public Review Issue
Opt Subject: 497 [CJK, EDC]
In Version 16.0 ALPHA REVIEW of the Code Charts: https://www.unicode.org/Public/draft/UCD/charts/CodeCharts.pdf The V-Source glyph (V1-6C40) for U+99D5 駕 appears to be defective (incomplete horse component 馬); it is probably based on the glyph defined in the Vietnamese font Nom Na Tong v4.6, which has been corrected since v4.8 (currently v5.09).
Date/Time: Sun Mar 10 10:42:50 CDT 2024
ReportID: ID20240310104250
Name: Judith Chen
Report Type: Public Review Issue
Opt Subject: 497 [EDC, SAH]
The glyphs of U+1E899 MENDE KIKAKUI SYLLABLE M172 MBOO 𞢙 and U+1E89A MENDE KIKAKUI SYLLABLE M174 MBO 𞢚 seem to be erroneous. The block Mende Kikakui was encoded based on the proposal WG2 N4167 (L2/12-023) replacing N4133R (L2/11-301R), N3863 (L2/10-252) and N3757 (L2/10-006). In N3757 and N3863, U+1E899's current glyph 𞢙 was named MENDE SYLLABLE MBO-2, while U+1E89A's current glyph 𞢚 had the name MENDE SYLLABLE MBOO-2 — both were consistent with the evidence provided. However, the glyphs of U+1E899 and U+1E89A have been incorrect since N4133, which could be a mistake caused by a change in naming principles (N4133 renamed these characters). Therefore, I recommend Unicode swapping the glyphs of U+1E899 and U+1E89A to conform with the original evidence. That is all. (Thanks to my friend 黑之圣雷 for pointing this issue out to me)
Date/Time: Wed Mar 27 09:17:53 CDT 2024
ReportID: ID20240327091753
Name: Charles Lawrence Riley
Report Type: Public Review Issue
Opt Subject: 497 [EDC]
I have reviewed the information on Garay as presented in PRI #497, and it looks clear and accurate to me. Thank you for all the work that you have done on this.
Date/Time: Wed Mar 27 15:58:04 CDT 2024
ReportID: ID20240327155804
Name: Andrew West
Report Type: Public Review Issue
Opt Subject: 497 [CJK]
KP1-5653 is mapped to U+720B 爋 (⿰火勳), but its glyph is actually the same as U+24455 𤑕 (⿰火⿱⿰熏灬力灬). Therefore KP1-5653 should be moved to U+24455.
Date/Time: Thu Mar 28 01:45:49 CDT 2024
ReportID: ID20240328014549
Name: Judith Chen
Report Type: Public Review Issue
Opt Subject: 497 [CJK]
The KP-source glyph KP1-83F7 of U+96DF 雟, which was added in Unicode 15.1, exactly matches all the representative glyphs of U+5DC2 巂, rather than other representative glyphs of U+96DF 雟. Therefore, I suggest moving KP1-83F7 from U+96DF 雟 to U+5DC2 巂.
Date/Time: Tue Apr 02 04:43:38 CDT 2024
ReportID: ID20240402044338
Name: Marc Lodewijck
Report Type: Public Review Issue
Opt Subject: 497 [EDC]
Within the "Egyptian Hieroglyphs" (13000–1342F) and "Egyptian Hieroglyphs Extended-A" (13460–143FF) blocks, the colon sign is consistently preceded by one or sometimes two spaces in comments (starting with an asterisk). In English, there should be no space before a colon.
Here are a few EXAMPLES, out of a total of 4,212 occurrences:
* classifier sitting : ḥmsꞽ
* logogram (to hide) : ꞽmn
* phonemogram : ḫnms
Two spaces before the colon (all instances):
* classifier rage, fury : ḳnd
* phonemogram : ꜥꜣb
* phonemogram : ḫsf
* phonemogram : wsr
* phonemogram : ꜥḥꜥ
* phonemogram : psḏ
* phonemogram : rs-wḏꜣ
* phono-repeater : sḫt
* phonemogram : mnḫ
* phonemogram : tꜣ
* classifier astronomical instrument : mrḫ.t
* phonemogram : ḫnm
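The cleanup requested here could be done mechanically. The following is a minimal sketch, assuming the comment lines follow the pattern quoted in the report (an asterisk, then text, then a colon); the function name `fix_comment` is illustrative, and this is not a description of how the names list is actually generated.

```python
import re

# One or more spaces or tabs directly before a colon.
SPACE_BEFORE_COLON = re.compile(r"[ \t]+:")


def fix_comment(line: str) -> str:
    """Remove whitespace before colons in a comment line starting with '*'."""
    # Leave every other kind of line untouched.
    if not line.lstrip().startswith("*"):
        return line
    return SPACE_BEFORE_COLON.sub(":", line)


print(fix_comment("* classifier sitting : ḥmsꞽ"))
print(fix_comment("* classifier rage, fury  : ḳnd"))
```

Restricting the substitution to comment lines is a deliberate choice in this sketch, so that character names and other fields are never altered.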
Date/Time: Tue Apr 02 06:06:00 CDT 2024
ReportID: ID20240402060600
Name: Marc Lodewijck
Report Type: Public Review Issue
Opt Subject: 497 [EDC]
Below are my findings regarding the presence of surplus spaces within the Unikemet.txt file; some of these have implications for the NamesList.txt file.
1/ The value (third field) of the following line begins with a space and contains two consecutive spaces:
U+13CA1 kEH_FVal p & nst
(i.e., U+13CA1[tab]kEH_FVal[tab][space]p[space][space]&[space]nst)
Consequently, in the NamesList.txt file:
13CA1 EGYPTIAN HIEROGLYPH-13CA1 * phonogram : p & nst
2/ The values in the following lines each contain two consecutive spaces:
U+13055 kEH_Func Logogram weaver or nurse
U+13489 kEH_Func Classifier to totter
U+138D0 kEH_Func Logogram/phonemogram (whom truth/Maat loves)
U+13B91 kEH_Func Logogram (to distinguish) and (beginning, front)
U+13D04 kEH_Func Classifier divinity (Nekhbet)
These double spaces are reflected in the NamesList.txt file:
13055 EGYPTIAN HIEROGLYPH B005A * logogram weaver or nurse : ? | mnḫ.t
13489 EGYPTIAN HIEROGLYPH-13489 * classifier to totter : mss
138D0 EGYPTIAN HIEROGLYPH-138D0 * logogram/phonemogram (whom truth/Maat loves) : mr(.y)-mꜣꜥ.t
13B91 EGYPTIAN HIEROGLYPH-13B91 * logogram (to distinguish) and (beginning, front) : ṯnꞽ ḥꜣ.t
13D04 EGYPTIAN HIEROGLYPH-13D04 * classifier divinity (Nekhbet) : nḫb.t
3/ In several dozen lines, the values in the third field contain one or two consecutive spaces, yet with no impact on the NamesList.txt file. Here are a few EXAMPLES:
U+13047 kEH_Desc Foreign man, with a bushy beard, standing, wearing a long dress, with the arms hanging at either side of the body.
U+133F8 kEH_Desc A geometrical circle
U+136CA kEH_Desc The king, seated on heel, both knees down, with a long straight beard, uraeus and coif/long wig, back bend forward, arm forward, hand at the hight of the waist, holding a cup or vessel (W10).
4/ 199 lines conclude with one or more consecutive trailing space characters. Enumerating all of them is impractical; however, here are some EXAMPLES:
U+1300F kEH_Func Classifier rebel/enemy
U+1316D kEH_FVal sꜣ
U+131CE kEH_Func Phonemogram
U+13229 kEH_FVal ꜥnḏ.ty
U+1331F kEH_Desc A harpoon-head with two horizontal strokes on top and an angled stroke below a curl as point.
Consequently, these surplus spaces appear in the NamesList.txt file:
Line 38075: * classifier rebel/enemy
Line 38813: * logogram (son) : sꜣ
Line 39247: * logogram (9th nome of UE) : ꜥnḏ.ty
Line 40532: * classifier human being (poor man) : šwꜣ.w
Line 40536: * logogram (vocative interjection) : ꞽ
Line 40600: * logogram (to fraternize) : snsn
Line 40664: * logogram (bowing down) : ḫꜣb/ksw
Line 41163: * classifier rebel/enemy
Line 41175: * logogram (chiefs) : wr
Line 41179: * classifier enemy/rebel (Xerxes) : ḫšryš
Line 41237: * logogram (foreigner) : ḫꜣs.ty
Line 41304: * logogram (Harsomtus) : ḥr-smꜣ-tꜣ.wy
Line 41306: * logogram (Harsomtus) : ḥr-smꜣ-tꜣ.wy
Line 41308: * logogram (to sing) : ḥsꞽ
Line 42011: * logogram (Maat and Amon) : mꜣꜥ.t & ꞽmn
Line 42044: * logogram (to drive away) : sḥrꞽ
Line 42128: * logogram (Re) : rꜥ
Line 42149: * logogram (eye of Horus) : ꞽr.t-ḥr
Line 42382: * logogram (the Nile/the flood) : hꜥpy
Line 42472: * phonemogram (first person for sbk-šmꜥ-nfr) : ꞽ
Line 42572: * logogram/phonemogram (lady) : nb.t
Line 42587: * logogram (rejoicing) : nhm
Line 42746: * logogram (together with \C98 and \C43A, representing the triad of Dendera) : ꞽwn.t
Line 42802: * logogram (hour) : wnw.t
Line 43639: * logogram (given life like Re) : dꞽ-ꜥnḫ-mꞽ-rꜥ
Line 43693: * logogram/Phonemogram (great one (female)) : wr.t
Line 43708: * logogram (Hermopolis magna, 15th nome of UE) : wnw.t
Line 46015: * logogram temple) : gs.w-pr.w
5/ In a large number of lines (42 instances, excluding the one already mentioned in point 1), the value (third field) starts with a space character, which does not affect the NamesList.txt file. Here is one EXAMPLE:
U+131FA kEH_Desc A crescent moon with a part of the moon disc.
Please note that after the tabulation character, there is a space included as part of the line's value (third field):
U+131FA kEH_Desc[tab][space]A crescent moon with a part of the moon disc.
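The three kinds of surplus space described in this report (leading, trailing, and consecutive) can be detected per line along these lines. This is a sketch, assuming the three-field, tab-separated layout the report describes (code point, property tag, value); the function name `surplus_space_issues` is illustrative.

```python
def surplus_space_issues(line: str) -> list[str]:
    """Report surplus-space problems in one tab-separated data line.

    Expects three fields: code point, property tag, value.
    """
    fields = line.rstrip("\n").split("\t", 2)
    if line.startswith("#") or len(fields) < 3:
        return []  # comment, blank, or malformed line: nothing to check
    value = fields[2]
    issues = []
    if value.startswith(" "):
        issues.append("leading space")
    if value.endswith(" "):
        issues.append("trailing space")
    # Ignore the outer spaces here so they are not counted twice.
    if "  " in value.strip(" "):
        issues.append("consecutive spaces")
    return issues


# The line from point 1, rebuilt from the [tab]/[space] notation in the report.
print(surplus_space_issues("U+13CA1\tkEH_FVal\t p  & nst"))
```

Running the function over every line of the file and counting the non-empty results would give totals comparable to those quoted in the report.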
Date/Time: Wed Apr 03 09:02:27 CDT 2024
ReportID: ID20240403090227
Name: Andrew West
Report Type: Public Review Issue
Opt Subject: 497 [CJK]
Another incorrect KP mapping: KP1-4D4C (⿰木⿰糸䏍) currently maps to U+3BDE (⿰木絹), but should map to U+23693 (⿰木⿰糹䏍).