Accumulated Feedback on PRI #323

This page is a compilation of formal public feedback received so far. See Feedback for further information on this issue, how to discuss it, and how to provide feedback.

Date/Time: Wed Feb 24 15:21:38 CST 2016
Name: Roozbeh Pournader
Report Type: Error Report
Opt Subject: General category of U+11C38..11C3B is wrong

The general category of four Bhaiksuki vowel signs, E, AI, O, and AU, seems
wrong. It should be changed from Mc to Mn. From the examples on pp. 11-16 of
http://www.unicode.org/L2/L2014/14091-bhaiksuki.pdf, it's clear to me that
these are just potentially wide top-side vowels, and not left-and-top, right-
and-top, or left-right-and-top ones.

Date/Time: Wed Feb 24 15:32:27 CST 2016
Name: Roozbeh Pournader
Report Type: Error Report
Opt Subject: U+11CA9 MARCHEN SUBJOINED LETTER YA should have general category Mc

U+11CA9 MARCHEN SUBJOINED LETTER YA is a right-side mark, and should have a general 
category of Mc.

This is according to the proposal at http://www.unicode.org/L2/L2013/13197-marchen.pdf 
(see page 11 for the glyph and page 13 for the explanation).
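
For readers who want to check which general category a given implementation currently assigns to the code points in the two reports above, here is a minimal Python sketch. It assumes the standard-library `unicodedata` module; the helper name `category_report` is hypothetical, and the values returned depend on the Unicode version bundled with the Python build (code points unknown to that version are reported as Cn).

```python
import unicodedata


def category_report(code_points):
    """Map each code point to the general category known to this Python build."""
    return {f"U+{cp:04X}": unicodedata.category(chr(cp)) for cp in code_points}


if __name__ == "__main__":
    # Bhaiksuki vowel signs E, AI, O, AU and MARCHEN SUBJOINED LETTER YA.
    reported = [0x11C38, 0x11C39, 0x11C3A, 0x11C3B, 0x11CA9]
    for label, gc in category_report(reported).items():
        print(label, gc)
```

No expected output is given for the reported code points, since the answer is exactly what these reports ask to change.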

Date/Time: Tue Mar 15 05:11:23 CDT 2016
Name: Åke Persson
Report Type: Public Review Issue
Opt Subject: PRI #323

The kHanyuPinyin mappings (release date: 2015-04-30) and the related kMandarin
corrections have not been applied to "Unihan_Readings.txt".

Date/Time: Sun Mar 20 22:07:17 CDT 2016
Name: Eiso Chan
Report Type: Error Report
Opt Subject: The Radical Of U+2C080

U+2C080 (𬂀) is the simplified form of U+81B6 (膶), but the radical of U+81B6
(膶) is #130 (肉) and the radical of U+2C080 (𬂀) is #74 (月). Unicode uses the
Kangxi Radicals, so the radical of U+2C080 (𬂀) should be #130 (肉). Listing
U+2C080 (𬂀) under radical #130 (肉) would also make it easier to find in
radical-based searches. Thanks.

Date/Time: Sat Apr 2 18:25:42 CDT 2016
Name: Ken Lunde
Report Type: Error Report
Opt Subject: 21 missing kIRG_GSource entries

I discovered that the kIRG_GSource source is missing the following 21 entries,
all of which correspond to GB 16500-95 (aka GBK) and should thus use the "GE"
source prefix:

CJK Compatibility Ideographs: 9
U+F92C  kIRG_GSource  GE-FD9C
U+F979  kIRG_GSource  GE-FD9D
U+F995  kIRG_GSource  GE-FD9E
U+F9E7  kIRG_GSource  GE-FD9F
U+F9F1  kIRG_GSource  GE-FDA0
U+FA0C  kIRG_GSource  GE-FE40
U+FA0D  kIRG_GSource  GE-FE41
U+FA18  kIRG_GSource  GE-FE47
U+FA20  kIRG_GSource  GE-FE49

CJK Unified Ideographs: 12
U+FA0E  kIRG_GSource  GE-FE42
U+FA0F  kIRG_GSource  GE-FE43
U+FA11  kIRG_GSource  GE-FE44
U+FA13  kIRG_GSource  GE-FE45
U+FA14  kIRG_GSource  GE-FE46
U+FA1F  kIRG_GSource  GE-FE48
U+FA21  kIRG_GSource  GE-FE4A
U+FA23  kIRG_GSource  GE-FE4B
U+FA24  kIRG_GSource  GE-FE4C
U+FA27  kIRG_GSource  GE-FE4D
U+FA28  kIRG_GSource  GE-FE4E
U+FA29  kIRG_GSource  GE-FE4F

The same characters are in GB 18030 (kIRG_GSource source prefix "G9"), but
because GBK predates GB 18030 by five years, the "GE" source prefix seems more
appropriate.

This means a few additional things:

1) The code chart for the CJK Compatibility Ideographs block should reflect
the 21 new kIRG_GSource source references, along with representative
glyphs that China should supply (a GB 18030 font could also be used).

2) China should consider a horizontal extension, which is the more formal way
of addressing #1 above.

3) Nine of these characters are CJK Compatibility Ideographs for which
Standardized Variants exist, and China's forthcoming revision of GB 18030
should reflect these as alternate representations (Hong Kong SAR's forthcoming
revision of Hong Kong SCS does this for the 14 such characters within its
scope):

5140 FE00; CJK COMPATIBILITY IDEOGRAPH-FA0C;
51C9 FE00; CJK COMPATIBILITY IDEOGRAPH-F979;
55C0 FE00; CJK COMPATIBILITY IDEOGRAPH-FA0D;
793C FE00; CJK COMPATIBILITY IDEOGRAPH-FA18;
79CA FE00; CJK COMPATIBILITY IDEOGRAPH-F995;
8612 FE00; CJK COMPATIBILITY IDEOGRAPH-FA20;
88CF FE00; CJK COMPATIBILITY IDEOGRAPH-F9E7;
90CE FE00; CJK COMPATIBILITY IDEOGRAPH-F92C;
96A3 FE00; CJK COMPATIBILITY IDEOGRAPH-F9F1;

I can take the action item to relay items #2 and #3 above to China.
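
The nine standardized variation sequences listed above can be cross-checked against the canonical decompositions of the corresponding compatibility ideographs. The following is a minimal sketch, assuming Python's standard-library `unicodedata` module; the helper names `parse_line` and `decomposes_to` are hypothetical and are not part of any Unicode tooling.

```python
import unicodedata

# The nine sequences exactly as listed in the report.
SEQUENCES = """\
5140 FE00; CJK COMPATIBILITY IDEOGRAPH-FA0C;
51C9 FE00; CJK COMPATIBILITY IDEOGRAPH-F979;
55C0 FE00; CJK COMPATIBILITY IDEOGRAPH-FA0D;
793C FE00; CJK COMPATIBILITY IDEOGRAPH-FA18;
79CA FE00; CJK COMPATIBILITY IDEOGRAPH-F995;
8612 FE00; CJK COMPATIBILITY IDEOGRAPH-FA20;
88CF FE00; CJK COMPATIBILITY IDEOGRAPH-F9E7;
90CE FE00; CJK COMPATIBILITY IDEOGRAPH-F92C;
96A3 FE00; CJK COMPATIBILITY IDEOGRAPH-F9F1;
"""


def parse_line(line):
    """Return (base, variation selector, compatibility ideograph) as integers."""
    seq, desc, *_ = [field.strip() for field in line.split(";")]
    base_hex, vs_hex = seq.split()
    compat_hex = desc.rsplit("-", 1)[1]
    return int(base_hex, 16), int(vs_hex, 16), int(compat_hex, 16)


def decomposes_to(compat, base):
    """True if the compatibility ideograph canonically decomposes to the base."""
    return unicodedata.decomposition(chr(compat)) == f"{base:04X}"


if __name__ == "__main__":
    for line in SEQUENCES.splitlines():
        base, vs, compat = parse_line(line)
        print(f"U+{compat:04X} -> U+{base:04X}: {decomposes_to(compat, base)}")
```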

(Note: The following two reports on Unihan have already been forwarded to the maintainers of the data.)

Date/Time: Mon Apr 4 07:03:39 CDT 2016
Name: zrh
Report Type: Error Report
Opt Subject: Pinyin tone mark errors (×), No. 1

Pinyin tone mark errors (×), No. 1
#
# Unihan_Readings.txt
# Date: 2016-02-26 19:01:28 GMT [JHJ]
# Unicode version: 9.0.0
#
-------------------------------
Unihan-9.0.0d1\Unihan_Readings.txt
-------------------------------
U+69CC	kVietnamese	dùi(×)----->duì(√)
U+8968	kMandarin	dùi(×)----->duì(√)
U+8B75	kMandarin	dùi(×)----->duì(√)
U+9310	kVietnamese	dùi(×)----->duì(√)
U+237E9	kVietnamese	dùi(×)----->duì(√)
U+28B09	kVietnamese	dùi(×)----->duì(√)
U+28BF8	kVietnamese	dùi(×)----->duì(√)

Date/Time: Mon Apr 4 07:04:29 CDT 2016
Name: zrh
Report Type: Error Report
Opt Subject: Pinyin tone mark errors (×), No. 2

Pinyin tone mark errors (×), No. 2
----------------------------------------
oū(×)----->ōu(√)
oú(×)----->óu(√)
oǔ(×)----->ǒu(√)
où(×)----->òu(√)
nǵ(×)----->ńg(√)
nǧ(×)----->ňg(√)
ng̀(×)----->ǹg(√)
----------------------------------------
U+4F18	kHanyuPinlu	yoū(287)
U+5077	kHanyuPinlu	toū(204)
U+512A	kHanyuPinlu	yoū(287)
U+515C	kHanyuPinlu	doū(43)
U+5256	kHanyuPinlu	poū(24)
U+52FE	kHanyuPinlu	goū(42)
U+5468	kHanyuPinlu	zhoū(396)
U+5DDE	kHanyuPinlu	zhoū(30)
U+5FE7	kHanyuPinlu	yoū(23)
U+60A0	kHanyuPinlu	yoū(14)
U+6182	kHanyuPinlu	yoū(23)
U+62BD	kHanyuPinlu	choū(244)
U+641C	kHanyuPinlu	soū(61)
U+6536	kHanyuPinlu	shoū(768)
U+6B27	kHanyuPinlu	oū(16)
U+6B50	kHanyuPinlu	oū(16)
U+6C9F	kHanyuPinlu	goū(136)
U+6D32	kHanyuPinlu	zhoū(203)
U+6E9D	kHanyuPinlu	goū(136)
U+7CA5	kHanyuPinlu	zhoū(48)
U+821F	kHanyuPinlu	zhoū(13)
U+8258	kHanyuPinlu	soū(30)
U+90FD	kHanyuPinlu	doū(5680) dū(93)
U+9264	kHanyuPinlu	goū(42)
U+94A9	kHanyuPinlu	goū(42)
U+9DD7	kHanyuPinlu	oū(13)
U+9E25	kHanyuPinlu	oū(13)
U+4EC7	kHanyuPinlu	choú(71)
U+5589	kHanyuPinlu	hoú(51)
U+5934	kHanyuPinlu	toú(3058) tou(715)
U+5C24	kHanyuPinlu	yoú(110)
U+6101	kHanyuPinlu	choú(48)
U+6295	kHanyuPinlu	toú(264)
U+63C9	kHanyuPinlu	roú(44)
U+67D4	kHanyuPinlu	roú(56)
U+697C	kHanyuPinlu	loú(179)
U+6A13	kHanyuPinlu	loú(179)
U+6CB9	kHanyuPinlu	yoú(822)
U+6E38	kHanyuPinlu	yoú(364)
U+72B9	kHanyuPinlu	yoú(57)
U+7334	kHanyuPinlu	hoú(86)
U+7336	kHanyuPinlu	yoú(57)
U+7531	kHanyuPinlu	yoú(1623)
U+7A20	kHanyuPinlu	choú(8)
U+7DA2	kHanyuPinlu	choú(24)
U+7EF8	kHanyuPinlu	choú(24)
U+8B00	kHanyuPinlu	moú(98) mou(24)
U+8C0B	kHanyuPinlu	moú(98) mou(24)
U+8E0C	kHanyuPinlu	choú(24)
U+8E8A	kHanyuPinlu	choú(24)
U+8EF8	kHanyuPinlu	zhoú(39)
U+8F74	kHanyuPinlu	zhoú(39)
U+90AE	kHanyuPinlu	yoú(32)
U+90F5	kHanyuPinlu	yoú(32)
U+923E	kHanyuPinlu	yoú(15)
U+94C0	kHanyuPinlu	yoú(15)
U+982D	kHanyuPinlu	toú(3058) tou(715)
U+4E11	kHanyuPinlu	choǔ(31)
U+4E45	kTang	*gioǔ gioǔ
U+4E46	kTang	gioǔ
U+4E5D	kTang	*gioǔ gioǔ
U+5076	kHanyuPinlu	oǔ(64)
U+53CB	kHanyuPinlu	you(437) yoǔ(275)
U+53E3	kHanyuPinlu	koǔ(1647) kou(74)
U+5426	kHanyuPinlu	foǔ(230)
U+543C	kHanyuPinlu	hoǔ(63)
U+5B88	kHanyuPinlu	shoǔ(218)
U+624B	kHanyuPinlu	shoǔ(2615) shou(12)
U+6296	kHanyuPinlu	doǔ(118)
U+6402	kHanyuPinlu	loǔ(25)
U+645F	kHanyuPinlu	loǔ(25)
U+6709	kHanyuPinlu	yoǔ(17798)
U+67D0	kHanyuPinlu	moǔ(294)
U+72D7	kHanyuPinlu	goǔ(216)
U+7785	kHanyuPinlu	choǔ(34)
U+86AA	kHanyuPinlu	doǔ(10)
U+8D70	kHanyuPinlu	zoǔ(3176)
U+9661	kHanyuPinlu	doǔ(31)
U+9996	kHanyuPinlu	shoǔ(492)
U+4F51	kHanyuPinlu	yoù(11)
U+5019	kHanyuPinlu	hou(2060) hoù(300)
U+517D	kHanyuPinlu	shoù(40)
U+51D1	kHanyuPinlu	coù(69)
U+539A	kHanyuPinlu	hoù(202)
U+53C8	kHanyuPinlu	yoù(4892)
U+53D7	kHanyuPinlu	shoù(1082)
U+53F3	kHanyuPinlu	yoù(490)
U+540E	kHanyuPinlu	hoù(4342)
U+5492	kHanyuPinlu	zhoù(11)
U+552E	kHanyuPinlu	shoù(78)
U+58FD	kHanyuPinlu	shoù(20)
U+591F	kHanyuPinlu	goù(991)
U+5920	kHanyuPinlu	goù(991)
U+594F	kHanyuPinlu	zoù(26)
U+5B99	kHanyuPinlu	zhoù(130)
U+5BC7	kHanyuPinlu	koù(19)
U+5BFF	kHanyuPinlu	shoù(20)
U+5E7C	kHanyuPinlu	yoù(66)
U+6263	kHanyuPinlu	koù(62)
U+6388	kHanyuPinlu	shoù(122)
U+63CD	kHanyuPinlu	zoù(25)
U+6597	kHanyuPinlu	doù(1360)
U+663C	kHanyuPinlu	zhoù(29)
U+665D	kHanyuPinlu	zhoù(29)
U+6784	kHanyuPinlu	goù(317)
U+69CB	kHanyuPinlu	goù(317)
U+6E4A	kHanyuPinlu	coù(69)
U+6F0F	kHanyuPinlu	loù(68)
U+7378	kHanyuPinlu	shoù(40)
U+7626	kHanyuPinlu	shoù(82)
U+76B1	kHanyuPinlu	zhoù(115)
U+76BA	kHanyuPinlu	zhoù(115)
U+8089	kHanyuPinlu	roù(278)
U+81ED	kHanyuPinlu	choù(72)
U+8A98	kHanyuPinlu	yoù(12)
U+8BF1	kHanyuPinlu	yoù(12)
U+8C46	kHanyuPinlu	doù(127)
U+8CFC	kHanyuPinlu	goù(41)
U+8D2D	kHanyuPinlu	goù(41)
U+900F	kHanyuPinlu	toù(316)
U+9017	kHanyuPinlu	doù(77)
U+964B	kHanyuPinlu	loù(9)
U+9A5F	kHanyuPinlu	zhoù(20)
U+9AA4	kHanyuPinlu	zhoù(20)
U+55EF	kHanyuPinlu	ń(48) ň(48) ǹ(48) nǵ(48) nǧ(48) ng̀(48)
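
The substitutions in the table at the top of this report (oū to ōu, nǵ to ńg, and so on) all move a tone mark one letter to the left. The following is a minimal sketch of how that repair could be automated, assuming Python's standard-library `re` and `unicodedata` modules. The function name `fix_tone_placement` is hypothetical; it handles only the "ou" and "ng" patterns listed in this report and is not a general pinyin validator.

```python
import re
import unicodedata

# Combining grave, acute, macron, and caron: the tone marks seen in this report.
TONE_MARKS = "\u0300\u0301\u0304\u030C"

# In decomposed (NFD) text a misplaced mark follows the second letter.
_OU = re.compile(f"(o)(u)([{TONE_MARKS}])")
_NG = re.compile(f"(n)(g)([{TONE_MARKS}])")


def fix_tone_placement(reading: str) -> str:
    """Move a tone mark from the 'u' of 'ou' (or the 'g' of 'ng') onto the first letter."""
    decomposed = unicodedata.normalize("NFD", reading)
    decomposed = _OU.sub(r"\1\3\2", decomposed)
    decomposed = _NG.sub(r"\1\3\2", decomposed)
    return unicodedata.normalize("NFC", decomposed)
```

Working on the NFD form avoids a separate rule for each precomposed letter and also covers "ng̀", where the grave accent is already a separate combining character. Frequency counts and other text in a field value, such as "(287)", pass through unchanged.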

Date/Time: Mon May 2 07:36:19 CDT 2016
Name: Charlotte Buff
Report Type: Feedback on an Encoding Proposal
Opt Subject: Invalid variation sequence in Unicode 9.0 repertoire

The upcoming Unicode 9.0 update defines a standardized variation sequence for
U+1031 MYANMAR VOWEL SIGN E using VS1, as was proposed in document L2/15-257
(http://www.unicode.org/L2/L2015/15257-khamti-disunify.pdf). However, U+1031
is a combining character (general category Mc) and section 23.4 clearly states
that "the base character in a variation sequence is never a combining
character or a canonical decomposable character". Therefore this proposed
variation sequence is invalid and shall not be added to the Unicode standard.
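
The constraint quoted above can be approximated in a few lines of Python. This is a sketch under stated assumptions: it uses the standard-library `unicodedata` module, treats any character whose general category starts with "M" as a combining character, and treats any character with a non-tagged decomposition mapping as canonically decomposable. The function name `violates_base_rule` is hypothetical, and results depend on the Unicode version bundled with the Python build.

```python
import unicodedata


def violates_base_rule(cp: int) -> bool:
    """True if cp is a combining mark or has a canonical decomposition."""
    ch = chr(cp)
    is_combining = unicodedata.category(ch).startswith("M")
    decomp = unicodedata.decomposition(ch)
    # Compatibility mappings carry a tag such as "<compat>"; canonical ones do not.
    is_canonical_decomposable = bool(decomp) and not decomp.startswith("<")
    return is_combining or is_canonical_decomposable


if __name__ == "__main__":
    for cp in (0x1031, 0x5140, 0x00E9):
        print(f"U+{cp:04X}", violates_base_rule(cp))
```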

Date/Time: Sat May 7 07:53:40 CDT 2016
Name: Andrew West
Report Type: Public Review Issue
Opt Subject: XML UCD 9.0 Data for Tag Characters

I'm trying to validate the XML UCD data for Unicode 9.0 beta (XML data dated
14-Apr-2016), and I'm having problems with the Tag characters.

According to TR44:

Grapheme_Extend = Me + Mn + Other_Grapheme_Extend

**Note: The set of characters for which Grapheme_Extend=Yes is equivalent 
to the set of characters for which Grapheme_Cluster_Break=Extend.**
 
However, Tag characters E0020..E007F are Cf and not Other_Grapheme_Extend; but
they are Grapheme_Cluster_Break=Extend; so they should also be
Grapheme_Extend.

The XML UCD data is inconsistent:

<char cp="E0020" ... Gr_Ext="N" OGr_Ext="N" ... GCB="EX" WB="Extend" SB="FO" ... />

I think that OGr_Ext="N", GCB="EX", and WB="Extend" are correct, 
but Gr_Ext="N" should be Gr_Ext="Y", and SB="FO" should be SB="EX".

If this is so, the definition of Format in Table 4 of the proposed TR29 
update needs correcting.

If the XML UCD data is correct and I have misunderstood something in the data
or in TR29 and TR44, please let me know where I have gone wrong.

Thanks,

Andrew
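
The inconsistency described in the report above (Gr_Ext disagreeing with GCB) can be detected mechanically. The following is a minimal sketch, assuming Python's standard-library `xml.etree.ElementTree` and a simplified `<char>` element that carries only the attributes shown in the report; the attributes elided with "..." are left out. The function name `grapheme_extend_mismatch` is hypothetical, and elements in the real XML UCD file are namespaced and carry many more attributes.

```python
import xml.etree.ElementTree as ET

# Simplified copy of the element quoted in the report (elided attributes omitted).
SAMPLE = '<char cp="E0020" Gr_Ext="N" OGr_Ext="N" GCB="EX" WB="Extend" SB="FO"/>'


def grapheme_extend_mismatch(char_xml: str) -> bool:
    """True if Gr_Ext="Y" and GCB="EX" do not hold together for this element."""
    elem = ET.fromstring(char_xml)
    is_gr_ext = elem.get("Gr_Ext") == "Y"
    is_gcb_extend = elem.get("GCB") == "EX"
    return is_gr_ext != is_gcb_extend


if __name__ == "__main__":
    print(grapheme_extend_mismatch(SAMPLE))
```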

Date/Time: Thu May 12 13:52:44 CDT 2016
Name: Roozbeh Pournader
Report Type: Error Report
Opt Subject: Indic Syllabic Category of Khamti logograms U+AA74..U+AA76

Based on comments I received from Martin Hosken through Behdad Esfahbod (see
https://github.com/roozbehp/unicode-data/issues/3), the three Khamti logograms
take tone marks, so they should have an Indic Syllabic Category.

Here is the information from the original proposal, at
http://www.unicode.org/L2/L2008/08276-khamti-proposal.pdf:

"Three logogram characters are also used which can take tone and whose meaning
is according to the tone they take. They are used when transcribing speech
rather than in formal writing. For example, ˀn takes three tones and means:
ꩵႈ negative, ꩵႉ giving and ꩵး yes. hm also takes three different tones and
means: ꩶႚ part of no (prefixed by hm negative), ꩶႊ question response marker,
ꩶး there. Oay takes two tones and is used when addressing a loved one ꩴႊ or
someone far away ꩴး."

Based on this information, I believe we need to give these characters an Indic
Syllabic Category of either Consonant or Consonant_Placeholder. It appears to
me that Consonant_Placeholder may be a better class, similar to U+104E MYANMAR
SYMBOL AFOREMENTIONED.