Accumulated Feedback on PRI #508

This page is a compilation of formal public feedback received so far. See Feedback for further information on this issue, how to discuss it, and how to provide feedback.

Date/Time: Sun Sep 22 05:21:48 CDT 2024
ReportID: ID20240922052148
Name: KWAN Ching Kit
Report Type: Public Review Issue
Opt Subject: 508

The kCantonese property value of 貎 is currently maau6, which is likely due
to confusion with 貌. It should be corrected to ngai4, aligning with the
pronunciation of its variant 猊. The kMandarin property value ní and the
kFanqie 五稽 also justify this.

Date/Time: Mon Sep 23 03:52:56 CDT 2024
ReportID: ID20240923035256
Name: Shieru Asakoto
Report Type: Public Review Issue
Opt Subject: 508

Propose to add a secondary kRSUnicode value "61.6" to U+2BB7B 𫭻

Currently U+2BB7B 𫭻 only has a single kRSUnicode value "32.7".

Originally the V-source of the character, V4-445A, was submitted as ⿰土忌,
hence the 土 radical and the current kRSUnicode value "32.7"; however in
WS2017 two similar characters, WS2017-01282 (V-F1DCA) & WS2017-01283
(GDM-00068) ⿱圮心 with proposed kRSUnicode value "61.6" were submitted, and
IRG decided that the original source V4-445A did not match the evidence and
the WS2017 glyphs should be the current ones, hence these submissions were
unified to U+2BB7B and the V-source font of V4-445A was modified to ⿱圮心.

The structure of the character U+2BB7B has been changed to have ⿱圮心 where 土
is no longer as apparent as 心 for radical, and although not yet
horizontally extended by China, the WS2017 submissions show the 心 radical for
⿱圮心. Furthermore in the WS2017 submissions, although the evidence shows
ji4, implying ⿰土忌, as its Mandarin reading, an alternative reading pi3,
implying ⿱圮心, is also given. As a result, I suggest adding a secondary
kRSUnicode value "61.6" to U+2BB7B 𫭻 to reflect the above unification and
modification.

Thanks.

Date/Time: Mon Sep 23 05:13:18 CDT 2024
ReportID: ID20240923051318
Name: Michel Mariani
Report Type: Public Review Issue
Opt Subject: 508

UAX #38 mentions the CJKRadicals.txt data
file <https://www.unicode.org/Public/UCD/latest/ucd/CJKRadicals.txt>,
which should be updated to be consistent with the use of apostrophes after
the radical number described in the kRSUnicode property.

# CJK radical numbers match the regular expression [1-9][0-9]{0,2}\'{0,2}
# and in particular they can end with one or two U+0027 ' APOSTROPHE characters.

should be:

# CJK radical numbers match the regular expression [1-9][0-9]{0,2}\'{0,3}
# and in particular they can end with one, two, or three U+0027 ' APOSTROPHE characters.
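
A minimal sketch of the difference between the two patterns, assuming Python's `re` module and whole-string matching. The helper name `is_radical_number` and the sample strings are illustrative only and are not taken from CJKRadicals.txt; the backslash before the apostrophe is dropped because it is not needed in a Python raw string.

```python
import re

# Proposed pattern: up to three trailing U+0027 APOSTROPHE characters.
RADICAL_NUMBER = re.compile(r"[1-9][0-9]{0,2}'{0,3}")
# Pattern as currently documented: at most two apostrophes.
RADICAL_NUMBER_OLD = re.compile(r"[1-9][0-9]{0,2}'{0,2}")


def is_radical_number(text, pattern=RADICAL_NUMBER):
    """Return True if the whole string matches the given radical-number pattern."""
    return pattern.fullmatch(text) is not None


if __name__ == "__main__":
    # Illustrative strings only (hypothetical, not real data-file entries).
    for sample in ["140", "196'", "213''", "214'''", "0", "1234"]:
        print(sample,
              is_radical_number(sample),
              is_radical_number(sample, RADICAL_NUMBER_OLD))
```

Under the old pattern a string with three apostrophes is rejected, which is the inconsistency the comment change is meant to remove.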

Date/Time: Mon Sep 23 22:50:59 CDT 2024
ReportID: ID20240923225059
Name: KWAN Ching Kit
Report Type: Public Review Issue
Opt Subject: 508

The kCantonese property value of 拼 should be modified from ping1 to ping3 to reflect 
modern usage, e.g. 粵拼 jyut6 ping3.

Date/Time: Sat Sep 28 13:54:36 CDT 2024
ReportID: ID20240928135436
Name: Ken Lunde
Report Type: Public Review Issue
Opt Subject: 508

For consistency, when the 𫇦 component appears in an ideograph, Radical #140 
should be its primary or secondary radical, in terms of its kRSUnicode property 
value. The following adjustments should be made to accommodate this policy:

1) The following ideographs already have two kRSUnicode property values, but 
because the IRG counts 𫇦 as six strokes, the 5 in the second property value 
should be changed to 6:

U+8314 茔	kRSUnicode	140.5 32.5 → 140.5 32.6
U+8365 荥	kRSUnicode	140.6 85.5 → 140.6 85.6
U+8366 荦	kRSUnicode	140.6 93.5 → 140.6 93.6
U+8367 荧	kRSUnicode	140.6 86.5 → 140.6 86.6
U+83B9 莹	kRSUnicode	140.7 96.5 → 140.7 96.6
U+8426 萦	kRSUnicode	140.8 120.5 → 140.8 120.6

Their kTotalStrokes property values should also be adjusted as follows:

U+8314 kTotalStrokes 8 → 9
U+8365 kTotalStrokes 9 → 10
U+8366 kTotalStrokes 9 → 10
U+8367 kTotalStrokes 9 → 10
U+83B9 kTotalStrokes 10 → 11
U+8426 kTotalStrokes 11 → 12

2) The following ideographs should have a second kRSUnicode property value 
(corrections to existing property values are shown in parentheses):

U+52B3 劳	kRSUnicode	19.5 + 140.4 (change 19.5 to 19.6)
U+83BA 莺	kRSUnicode	140.7 + 196'.6
U+8424 萤	kRSUnicode	140.8 + 142.6
U+848F 蒏	kRSUnicode	140.9 + 164.6
U+84E5 蓥	kRSUnicode	140.10 + 167.6
U+44BF 䒿	kRSUnicode	140.6 + 130.6
U+44E8 䓨	kRSUnicode	140.8 + 121.6
U+26B2E 𦬮	kRSUnicode	140.4 + 10.6
U+26B6C 𦭬	kRSUnicode	140.5 + 50.6
U+26B9C 𦮜	kRSUnicode	140.6 + 68.6
U+26F89 𦾉	kRSUnicode	140.13 + 196.6
U+30300 𰌀	kRSUnicode	38.6 + 140.5
U+3095E 𰥞	kRSUnicode	109.6 + 140.7
U+309B7 𰦷	kRSUnicode	112.6 + 140.7
U+30C21 𰰡	kRSUnicode	140.3 + 1.6
U+30D2F 𰴯	kRSUnicode	149.6 + 140.9
U+32056 𲁖	kRSUnicode	147'.6 + 140.6

3) The following are missing kTraditionalVariant/kSimplifiedVariant 
relationships in the Unihan database:

U+848F 蒏 & U+919F 醟
U+44BF 䒿 & U+818B 膋
U+26B6C 𦭬 & U+2210B 𢄋
U+26B9C 𦮜 & U+23088 𣂈
U+26F16 𦼖 & U+258FB 𥣻
U+30C21 𰰡 & U+8499 蒙
U+32056 𲁖 & U+89AE 覮

That is all.

Date/Time: Tue Oct 15 22:32:21 CDT 2024
ReportID: ID20241015223221
Name: Judith Chen
Report Type: Public Review Issue
Opt Subject: 508

Currently, U+34A8 㒨 and U+20457 𠑗 have the same G-source representative
glyph. What is more, there seems to be a mismatch between the kHanYu and
kIRGHanyuDaZidian properties of these two characters:

- U+34A8 㒨
  - kHanYu: 10239.060
  - kIRGHanyuDaZidian: 10239.050
- U+20457 𠑗
  - kHanYu: 10239.050
  - kIRGHanyuDaZidian: 10239.060

I have checked p. 239 of 《汉语大字典》 (1st ed.), and the 6th character exactly
matches the G-source glyph. However, the 5th character is more similar to
the other glyphs of U+34A8 㒨, with its upper right part identical to the
J-source glyph and its lower right part identical to the T-source glyph.

Besides, since the kIRG_GSource property of U+34A8 㒨 is G5-3329, I have also
checked GB 7590—87, the simplified counterpart of GB5, and the character
placed at 19-09 (0x3329) is not the same as the G-source glyph. The current
G-source glyph has a lower right part of 巳, while the GB4 glyph has a lower
right part of 㔾.

Therefore, the solution seems to be modifying the G-source glyph of U+34A8 㒨
to match the 5th character on p. 239 of 《汉语大字典》 (1st ed.), changing the
kIRG_GSource property of U+34A8 㒨 into GHZ-10239.05, and altering the
kHanYu properties of U+34A8 㒨 and U+20457 𠑗.

To summarise, this feedback suggests a glyph change to the G-source of
U+34A8 㒨, plus the following changes to the Unihan database:

- U+34A8 㒨
  - kIRG_GSource: G5-3329 -> GHZ-10239.05
  - kHanYu: 10239.060 -> 10239.050
- U+20457 𠑗
  - kHanYu: 10239.050 -> 10239.060

That is all.

Date/Time: Tue Oct 15 22:45:36 CDT 2024
ReportID: ID20241015224536
Name: Judith Chen
Report Type: Public Review Issue
Opt Subject: 508

The G-source representative glyph of U+20057 𠁗 seems to be erroneous.

The kIRG_GSource property of this character is GHZ-10027.03. However, the
3rd character on p. 27 of 《汉语大字典》 (1st ed.) has a glyph which is identical
to the T- and J-source glyphs of U+20057 𠁗.

Therefore, I request a change to the G-source glyph of U+20057 𠁗 to match
its other representative glyphs.

That is all.

Date/Time: Wed Oct 16 03:12:10 CDT 2024
ReportID: ID20241016031210
Name: Judith Chen
Report Type: Public Review Issue
Opt Subject: 508

The current T-source glyph of U+277B0 𧞰 is different from all its other
representative glyphs, and is identical with all the representative glyphs
of U+2D7CD 𭟍.

Considering that the radical of U+277B0 𧞰 is 145 ⾐, I suggest either
changing the T-source glyph of U+277B0 𧞰 to take the radical ⻂, or moving the
T-source glyph to U+2D7CD 𭟍.

That is all.

Date/Time: Wed Oct 16 03:40:11 CDT 2024
ReportID: ID20241016034011
Name: Judith Chen
Report Type: Public Review Issue
Opt Subject: 508

The kIRG_GSource property of U+5829 堩 is GE-346D. However, in GB/T
16500—1998, the character placed at 20-77 (0x346D) has the same glyph as
U+21377 𡍷.

Therefore, I suggest that the kIRG_GSource property of U+5829 堩 be changed
to GKX-0233.26.

That is all.

Feedback above this line has already been reviewed during UTC #181 in November, 2024.

Date/Time: Fri Nov 15 01:08:10 CST 2024
ReportID: ID20241115010810
Name: Judith Chen
Report Type: Public Review Issue
Opt Subject: 508

The kRSUnicode property value of U+24822 𤠢 should be 94.9 instead of 94.10 
to match the kTotalStrokes property value of 13.

Date/Time: Thu Nov 21 07:05:01 CST 2024
ReportID: ID20241121070501
Name: Michel Mariani
Report Type: Public Review Issue
Opt Subject: 508

- The kTotalStrokes property of the following CJK ideographs made of two repeated 
components is obviously incorrect; each numeric value should be even, not odd:

U+56CD	kTotalStrokes	21
U+22035	kTotalStrokes	5
U+225F0	kTotalStrokes	7
U+22A92	kTotalStrokes	7
U+23D0F	kTotalStrokes	9
U+24934	kTotalStrokes	9
U+25749	kTotalStrokes	17
U+2699A	kTotalStrokes	17
U+28C8D	kTotalStrokes	15
U+28E19	kTotalStrokes	5
U+29A86	kTotalStrokes	19

should be:

U+56CD	kTotalStrokes	24
U+22035	kTotalStrokes	6
U+225F0	kTotalStrokes	8
U+22A92	kTotalStrokes	8
U+23D0F	kTotalStrokes	10
U+24934	kTotalStrokes	10
U+25749	kTotalStrokes	18
U+2699A	kTotalStrokes	16
U+28C8D	kTotalStrokes	16
U+28E19	kTotalStrokes	4
U+29A86	kTotalStrokes	18
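
The parity argument above can be checked mechanically. The following is a small sketch, assuming only the current and proposed values listed in this report; the names `ENTRIES` and `odd_values` are hypothetical.

```python
# (code point, current kTotalStrokes, proposed kTotalStrokes),
# copied from the two lists in this report.
ENTRIES = [
    ("U+56CD", 21, 24),
    ("U+22035", 5, 6),
    ("U+225F0", 7, 8),
    ("U+22A92", 7, 8),
    ("U+23D0F", 9, 10),
    ("U+24934", 9, 10),
    ("U+25749", 17, 18),
    ("U+2699A", 17, 16),
    ("U+28C8D", 15, 16),
    ("U+28E19", 5, 4),
    ("U+29A86", 19, 18),
]


def odd_values(pairs):
    """Return the code points whose stroke count is odd.

    An ideograph made of two identical components has twice the
    component's stroke count, so an odd total is suspect.
    """
    return [cp for cp, strokes in pairs if strokes % 2 != 0]


current = [(cp, cur) for cp, cur, _ in ENTRIES]
proposed = [(cp, new) for cp, _, new in ENTRIES]

if __name__ == "__main__":
    print("odd (current): ", odd_values(current))
    print("odd (proposed):", odd_values(proposed))
```

The check only tests parity; it does not confirm that a proposed even value is the right one.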

- Likewise, for these CJK ideographs made of multiple repeated components:

U+264C8	kTotalStrokes	17
U+264CB	kTotalStrokes	23

should be:

U+264C8	kTotalStrokes	18
U+264CB	kTotalStrokes	24

- Accordingly, these kRSUnicode properties should also be corrected:

U+22035	kRSUnicode	49.2
U+28E19	kRSUnicode	170.3
U+29A86	kRSUnicode	188.10

should be:

U+22035	kRSUnicode	49.3
U+28E19	kRSUnicode	170.2
U+29A86	kRSUnicode	188.9 188.10

Date/Time: Tue Dec 10 11:32:52 CST 2024
ReportID: ID20241210113252
Name: Judith Chen
Report Type: Public Review Issue
Opt Subject: 508


Currently, U+51F5 凵 and U+2F81D 凵 share the same T-source representative glyph. 
However, the glyph of U+2F81D 凵 does not match its source reference value of T5-2129, 
which has a glyph identical to U+20674 𠙴, as seen on the 全字庫 CNS11643 website. 
Additionally, the glyph of U+2F81D 凵 also matches U+20674 𠙴 in the 異體字字典 released 
by Taiwan's Ministry of Education.

In WG2 N2270, the glyph of U+2F81D 凵 was the same as it is today, while the glyph of 
U+51F5 凵 was similar to U+20674 𠙴. However, the T-source glyph of U+51F5 凵 has 
remained unchanged since the very first version of ISO/IEC 10646. Furthermore, the 
Kangxi index for U+2F81D 凵 was 0134.400, which also matches U+20674 𠙴. Clearly, the 
glyphs of U+51F5 凵 and U+2F81D 凵 were confused in WG2 N2270, leading to the mistaken 
unification of T5-2129 to U+51F5 凵 instead of U+20674 𠙴, and the encoding of U+2F81D 凵.

Therefore, I suggest moving T5-2129 with a corrected glyph to U+20674 𠙴, and revising 
the kIRG_TSource property of U+2F81D 凵 to TU-2F81D.

That is all.

Reference:
- 字形資訊 - [凵] 2-2123 - 全字庫 CNS11643 (2024). https://www.cns11643.gov.tw/wordView.jsp?ID=139555
- 字形資訊 - [凵] 5-2129 - 全字庫 CNS11643 (2024). https://www.cns11643.gov.tw/wordView.jsp?ID=336169
- [N00506] 凵 - 教育部《異體字字典》 臺灣學術網路十四版(正式七版)2024. https://dict.variants.moe.edu.tw/dictView.jsp?ID=115808
- [C00513] 凵 - 教育部《異體字字典》 臺灣學術網路十四版(正式七版)2024. https://dict.variants.moe.edu.tw/dictView.jsp?ID=79366
- ISO/IEC JTC1/SC2/WG2 N2270 - Updated CJK Compatibility Ideographs sets from TCA. https://www.unicode.org/wg2/docs/n2270.pdf

Date/Time: Thu Dec 12 21:03:17 CST 2024
ReportID: ID20241212210317
Name: Eiso Chan
Report Type: Error Report
Opt Subject: KP0-FBFD


Current KP0-FBFD and KP1-8833 glyphs are the same. KP0-FBFD doesn’t match other sources.

Date/Time: Wed Dec 25 20:52:50 CST 2024
ReportID: ID20241225205250
Name: Judith Chen
Report Type: Public Review Issue
Opt Subject: 508


The kRSUnicode property value of U+2C0D1 𬃑 should be 75.9 instead of 75.8 to match the kTotalStrokes property value of 13.

References:
- IRG N1279-A - Viet Nam CJK_C Remainders Submitting to IRG. https://drive.google.com/uc?id=1lBt-u0F-7D4KbAFBR7XusFwskMvpFL0k&export=download
- 03067 | ⿱𦍌𫶧 | WS2024v1.0. https://hc.jsecs.org/irg/ws2024/app/?id=03067

Feedback above this line has already been reviewed during UTC #182 in January, 2025.

Date/Time: Tue Jan 21 12:24:59 CST 2025
ReportID: ID20250121122459
Name: Harriet Riddle
Report Type: Public Review Issue
Opt Subject: 508


The Unihan database for Unicode 16 contains the following line:

U+8280	kGB3	7215

However, character 72-15 in GB 7589 and 13131 is actually U+827B 艻, while U+8280 芀 appears in GB 7590 as character 71-30.

Thus, the correct data would be as follows:

U+827B	kGB3	7215
# No kGB3 for U+8280
U+8280	kGB5	7130

This also implies that the IRG source G3-682F for U+8280 is incorrect and should be G5-673E (or perhaps GKX-1017.13 or GHZ-53173.10), but changing it would presumably need national body approval.
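
The correspondence between the row-cell values (72-15, 71-30) and the hexadecimal source references (682F, 673E) used above can be sketched as follows, assuming the usual 94×94 convention in which each byte is the row or cell number plus 0x20. The function names are hypothetical.

```python
def hex_to_row_cell(code):
    """Convert a two-byte 94x94 code such as '682F' to row-cell form '7215'."""
    value = int(code, 16)
    row = (value >> 8) - 0x20
    cell = (value & 0xFF) - 0x20
    if not (1 <= row <= 94 and 1 <= cell <= 94):
        raise ValueError(f"{code!r} is outside the 94x94 plane")
    return f"{row:02d}{cell:02d}"


def row_cell_to_hex(row_cell):
    """Convert a four-digit row-cell value such as '7215' to hex form '682F'."""
    row, cell = int(row_cell[:2]), int(row_cell[2:])
    if not (1 <= row <= 94 and 1 <= cell <= 94):
        raise ValueError(f"{row_cell!r} is outside the 94x94 plane")
    return f"{row + 0x20:02X}{cell + 0x20:02X}"


if __name__ == "__main__":
    print(hex_to_row_cell("682F"))  # G3-682F in row-cell form
    print(row_cell_to_hex("7130"))  # row-cell 71-30 in hex form
```

The same conversion applies to the other row-cell and hex pairs quoted in this compilation, such as 19-09 (0x3329) and 20-77 (0x346D).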

Date/Time: Wed Jan 29 19:02:11 CST 2025
ReportID: ID20250129190211
Name: Paul Masson
Report Type: Error Report
Opt Subject: kMandarin for U+9826 頦


The pronunciation in the database does not match that in my dictionaries, nor does it match the pronunciation of its simplified 
form U+988F 颏. This entry should be corrected to kē.

Date/Time: Thu Jan 30 15:19:42 CST 2025
ReportID: ID20250130151942
Name: Harriet Riddle
Report Type: Public Review Issue
Opt Subject: 508


Despite otherwise giving a comprehensive list of the dictionaries for which indices are provided, Section 3.3 ("Dictionary Indices") 
does not mention `kMorohashi` (although it also does not mention `kHanYu`, `kKangXi` or `kDaeJaweon`, it does mention `kIRGHanyuDaZidian`, 
`kIRGKangXi` and `kIRGDaeJaweon` respectively).

U+8FD6 appears with a `kIRG_KSource` value of `2-6557` in `Unihan-3.txt`[1] (1999) and `Unihan-3.1.txt`[2] (March 2001), but has no `kIRG_KSource` 
value in any subsequent version (not even a `KU-` source reference), starting with `Unihan-3.1.1.txt`[3] (June 2001). It does not appear in the 
final published version of KS X 1027-1 (2011, some ten years later), but the `0x6557` position is conspicuously skipped in what is otherwise a 
continuous allocated block within a 94×94 plane. If this has not already been done, it might be worth asking ROK to clarify whether this character 
was withdrawn from KS X 1027-1 intentionally or inadvertently.

A number of characters in CJK Extension A have sources prefixed with `G3` or `G5`, but no `kGB3` or `kGB5` property. This is unexpected considering 
that, for example, `kGB1` covers a strict superset of the `G1` source prefix (converted to row-cell format). `kGB3` and `kGB5` would similarly be 
expected to cover a superset of the `G3` and `G5` source prefixes converted to row-cell format, respectively. (In contrast to the former `kKSC1` property, 
`kGB3` and `kGB5` are not redundant in scope to the IRG source prefixes, since the sources in question (especially GB 7590 / 13132) include characters 
which have other G-source prefixes, providing considerable room for expansion of their coverage[4].)

The property description for `kJa` indicates that its scope comprises the characters that previously had a `JA-` source prefix. A total of 85 characters 
had their source reference changed from a `JA-` source to a JIS X 0213 source.[5] However, only seven characters have a `kJa` property.

[1] https://www.unicode.org/Public/3.0-Update/Unihan-3.txt

[2] https://www.unicode.org/Public/3.1-Update/Unihan-3.1.txt

[3] https://www.unicode.org/Public/3.1-Update1/Unihan-3.1.1.txt

[4] One example amongst many is https://hc.jsecs.org/irg/ws2024/app/index.php?id=00374#c1995 . I have compiled the data for the currently-encoded portion 
of the remainder of GB 13132 at https://github.com/harjitmoe/ecma35lib/blob/master/ecma35/data/multibyte/mbmaps/Custom/GB13132_additional.txt

[5] https://www.unicode.org/wg2/iso10646/edition5/data/JIS-X-0213-FromPrevious.txt


Date/Time: Sun Feb 02 07:39:09 CST 2025
ReportID: ID20250202073909
Name: Ken Lunde
Report Type: Public Review Issue
Opt Subject: 508


Per Consensus 181-C26 from UTC Meeting #181, based on document IRG N2747 and Section 25 of document L2/24-227R, the representative glyphs of several G-source ideographs 
were updated or corrected for Unicode Version 17.0. What was overlooked in the handling of this consensus and associated action items was that the representative glyph 
change for U+2CD98 𬶘 in the Extension E block resulted in one fewer strokes in the non-radical component, 昂 with eight strokes, so its kRSUnicode property value should 
be changed from 195'.9 to 195'.8. Its kTotalStrokes property value, 16, does not need to change (the radical component is eight strokes).
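
The arithmetic in this report (radical strokes plus residual strokes equals total strokes) can be sketched as below. The regular expression is an assumption modeled on the radical-number pattern quoted earlier in this compilation, and the radical stroke count is supplied by the caller because it depends on the radical form actually used in the glyph. The names `parse_rs` and `is_consistent` are hypothetical.

```python
import re

# Assumed shape of a single kRSUnicode value: radical number, optional
# apostrophes, a dot, then the residual stroke count.
RS_VALUE = re.compile(r"([1-9][0-9]{0,2})('{0,3})\.(-?[0-9]{1,2})")


def parse_rs(value):
    """Split a value such as "195'.8" into (radical, apostrophes, residual)."""
    match = RS_VALUE.fullmatch(value)
    if match is None:
        raise ValueError(f"not a radical-stroke value: {value!r}")
    return int(match.group(1)), len(match.group(2)), int(match.group(3))


def is_consistent(rs_value, radical_strokes, total_strokes):
    """Check that radical strokes + residual strokes equals the total."""
    _, _, residual = parse_rs(rs_value)
    return radical_strokes + residual == total_strokes


if __name__ == "__main__":
    # U+2CD98: radical component of eight strokes, kTotalStrokes 16.
    print(is_consistent("195'.9", 8, 16))  # current value
    print(is_consistent("195'.8", 8, 16))  # proposed value
```

A mismatch is only a hint that one of the two properties needs review; it does not say which one is wrong.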

Date/Time: Mon Feb 03 22:57:56 CST 2025
ReportID: ID20250203225756
Name: Judith Chen
Report Type: Public Review Issue
Opt Subject: 508


U+7A40 穀 currently has a kGB8 property of 9258. However, according to GB 8565.2—88 and ISO-IR-165:1992, the character placed at 92-58 is U+6996 榖.

Therefore, I propose the following changes to the Unihan database:

- U+7A40 穀
  - kGB8: 9258 -> (null)
- U+6996 榖
  - kGB8: (null) -> 9258

That is all.

Date/Time: Thu Feb 13 19:40:50 CST 2025
ReportID: ID20250213194050
Name: Paul Masson
Report Type: Error Report
Opt Subject: U+9F02 鼂 and U+9F0C 鼌


These two characters are related as traditional and simplified versions. The database currently does not indicate this relationship in either of their 
records. Please update the relevant variant fields.

Date/Time: Sun Feb 16 09:16:52 CST 2025
ReportID: ID20250216091652
Name: Harriet Riddle
Report Type: Public Review Issue
Opt Subject: 508


How are Jaemin Chung's observations in IRGN2479[1] impacted by the
reconstructed KPS 10721?[2] Or in other words, what is the relationship
between:

• KP0-E5A9 ⿰木巿 "시" in KPS 9566,[3]
• KP1-4ABB ⿰木巿 "시" in KPS 10721,[2]
• KP1-4B0C ⿰木市 "시" in KPS 10721,[2]
• U+676E 杮 "폐" in Unicode, and
• U+67FF 柿 "시" in Unicode?

In particular, does the fact that KP1-4ABB and KP1-4B0C seem to be
collated by the same pronunciation[4] (but different residual stroke
counts) make the observation in IRGN2479 that KP0-E5A9 is collated by
U+67FF's pronunciation moot? If so, does this imply that U+676E can be
either an independent character (폐, per its `kKorean` property value)
or a variant (`kSpecializedSemanticVariant`?) of U+67FF (시)? Also,
would any of this be worth documenting in UTN #50?

(Sidenote: it's interesting to note that U+676E seems to have been
included in IICore solely on account of having been equated with
KP0-E5A9.)

See also UCV #197, under which KP0-F2A5, KP1-50BD and KP1-510B, all
pronounced 패, all correspond to at least one reference glyph of
U+6C9B 沛.

[1] https://www.unicode.org/irg/docs/n2479-KP0-E5A9.pdf

[2] http://cheonhyeong.com/PDF/KP1-reconstitution.pdf

[3] https://www.unicode.org/irg/docs/n2783-ISO-IR-202.pdf

[4] KP1-4ABA is U+233DA 𣏚, i.e. 시. KP1-4ABC is U+6794 枔, i.e. 심.
KP1-4B0B is U+67F9 柹, i.e. 시. KP1-4B0D is U+67B2 枲, i.e. 시. Thus,
KP1-4ABB and KP1-4B0C are plausibly both 시, and neither is 폐 or 불.


Date/Time: Fri Feb 14 16:39:09 CST 2025
ReportID: ID20250214163909
Name: Nick C.
Report Type: Error Report
Opt Subject:


One of the strokes in U+3153A is faulty, and has been that way since its introduction in the charts in Unicode 15.0.

Date/Time: Sun Feb 23 23:39:35 CST 2025
ReportID: ID20250223233935
Name: Edward Ng
Report Type: Error Report
Opt Subject: 2F936


Unihan Radical-Stroke Index --> Radical #102 (field) ⽥
Strokes  Character
   3        甾      code point 753e
   3        甾      code point 2f936 
   4        𱰭       code point 31c2d 
 
https://www.unicode.org/cgi-bin/UnihanRSIndex.pl?minstrokes=0&maxstrokes=100&submit=Submit&radical=102

https://www.unicode.org/cgi-bin/GetUnihanData.pl?codepoint=2F936
https://en.glyphwiki.org/wiki/u2f936
Your browser: 甾

https://en.glyphwiki.org/wiki/u31c2d
Your browser: 𱰭

Question: 2F936 

Date/Time: Fri Feb 28 10:49:51 CST 2025
ReportID: ID20250228104951
Name: Judith Chen
Report Type: Public Review Issue
Opt Subject: 508


Currently, the following line appears in the USourceData.txt file:

- UTC-00369;Variant;U+69EA;75.11;;⿱既木;kMeyerWempe 1212;;;

However, the glyph for UTC-00369 is identical to U+3BA3 㮣. Therefore, I suggest horizontally extending U+3BA3 to add UTC-00369 
as a new source reference and updating the line in USourceData.txt as follows:

- UTC-00369;ExtA;U+3BA3;75.9;;⿱既木;kMeyerWempe 1212;;;
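
To make the proposed change easier to see, the following sketch splits both lines on semicolons and reports the fields that differ by position. It deliberately does not name the fields, since the field layout of USourceData.txt is not described in this report; `changed_fields` is a hypothetical helper.

```python
OLD_LINE = "UTC-00369;Variant;U+69EA;75.11;;⿱既木;kMeyerWempe 1212;;;"
NEW_LINE = "UTC-00369;ExtA;U+3BA3;75.9;;⿱既木;kMeyerWempe 1212;;;"


def changed_fields(old_line, new_line):
    """Return (index, old, new) for each semicolon-separated field that differs."""
    old = old_line.split(";")
    new = new_line.split(";")
    if len(old) != len(new):
        raise ValueError("lines have different numbers of fields")
    return [(i, a, b) for i, (a, b) in enumerate(zip(old, new)) if a != b]


if __name__ == "__main__":
    for index, before, after in changed_fields(OLD_LINE, NEW_LINE):
        print(f"field {index}: {before} -> {after}")
```

Only three fields change: the status, the code point, and the radical-stroke value.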

That is all.