Public Review Issues

Accumulated Feedback on PRI #433

This page is a compilation of formal public feedback received so far. See Feedback for further information on this issue, how to discuss it, and how to provide feedback.

Date/Time: Tue Jun 8 11:08:50 CDT 2021
Name: Mike FABIAN
Report Type: Error Report
Opt Subject: Bugs in Unihan_Variants.txt (Unicode 14 draft version)

I think Unihan_Variants.txt from

https://www.unicode.org/Public/14.0.0/ucd/Unihan-14.0.0d4.zip 

still has bugs for the following characters:

乾 U+4E7E, 著 U+8457, 覆 U+8986

are used not only in traditional Chinese but also in simplified Chinese,

杰 U+6770, 系 U+7CFB, 只 U+53EA

are used not only in simplified Chinese but also in traditional Chinese.

The Chinese input method ibus-table (which I maintain) has options to
show only simplified, only traditional, simplified first, traditional
first, or both in no particular order.

To figure out which characters are simplified, traditional, or both,
ibus-table uses Unihan_Variants.txt.

Therefore, I still apply these two patches to Unihan_Variants.txt:

https://github.com/mike-fabian/ibus-table/commit/c37452a7bf49ccfd0f2062a3e90085022cb3735b 
https://github.com/mike-fabian/ibus-table/commit/6384e217f7283d12b1bdabbc10da4d2cf2e4f94a 

$ git show c37452a7bf49ccfd0f2062a3e90085022cb3735b
commit c37452a7bf49ccfd0f2062a3e90085022cb3735b
Author: Mike FABIAN 
Date:   Tue Jun 8 13:58:22 2021 +0200

    Keep our fixes to Unihan_Variants which are not yet included upstream

diff --git a/tools/Unihan_Variants.txt b/tools/Unihan_Variants.txt
index 72be50f..a29e627 100644
--- a/tools/Unihan_Variants.txt
+++ b/tools/Unihan_Variants.txt
@@ -1015,7 +1015,7 @@ U+4E66	kTraditionalVariant	U+66F8
 U+4E70	kTraditionalVariant	U+8CB7
 U+4E71	kSemanticVariant	U+4E82<kMatthews,kMeyerWempe
 U+4E71	kTraditionalVariant	U+4E82
-U+4E7E	kSimplifiedVariant	U+5E72
+U+4E7E	kSimplifiedVariant	U+4E7E U+5E72
 U+4E7E	kSpecializedSemanticVariant	U+4E81<kFenn
 U+4E81	kSpecializedSemanticVariant	U+4E7E<kFenn
 U+4E82	kSemanticVariant	U+4E71<kMatthews,kMeyerWempe
@@ -3814,7 +3814,7 @@ U+6769	kTraditionalVariant	U+69AA
 U+676E	kSpoofingVariant	U+67FF
 U+676F	kSemanticVariant	U+76C3<kLau,kMatthews,kMeyerWempe
 U+6770	kSemanticVariant	U+5091<kMatthews
-U+6770	kTraditionalVariant	U+5091
+U+6770	kTraditionalVariant	U+6770 U+5091
 U+6771	kSemanticVariant	U+4E1C<kFenn
 U+6771	kSimplifiedVariant	U+4E1C
 U+6777	kSemanticVariant	U+6733<kMatthews
@@ -5912,7 +5912,7 @@ U+7CF9	kSemanticVariant	U+7CF8<kMatthews
 U+7CF9	kSimplifiedVariant	U+7E9F
 U+7CFA	kSemanticVariant	U+7CFE<kMatthews,kMeyerWempe
 U+7CFA	kSimplifiedVariant	U+2B119
-U+7CFB	kTraditionalVariant	U+4FC2 U+7E6B
+U+7CFB	kTraditionalVariant	U+7CFB U+4FC2 U+7E6B
 U+7CFD	kSimplifiedVariant	U+30AFC
 U+7CFE	kSemanticVariant	U+7CFA<kMatthews,kMeyerWempe
 U+7CFE	kSimplifiedVariant	U+7EA0
@@ -6938,7 +6938,7 @@ U+8441	kSemanticVariant	U+8591<kMatthews
 U+8449	kSimplifiedVariant	U+53F6
 U+8452	kSimplifiedVariant	U+836D
 U+8457	kSemanticVariant	U+7740
-U+8457	kSimplifiedVariant	U+7740
+U+8457	kSimplifiedVariant	U+8457 U+7740
 U+8457	kSpecializedSemanticVariant	U+7740<kFenn U+87AB<kFenn
 U+845A	kSemanticVariant	U+6939<kFenn
 U+845D	kSimplifiedVariant	U+2B20E
@@ -7475,7 +7475,7 @@ U+8975	kSimplifiedVariant	U+2B307
 U+8978	kSimplifiedVariant	U+2C877
 U+8979	kSimplifiedVariant	U+30CFC
 U+897C	kSimplifiedVariant	U+30CF5
-U+8986	kSimplifiedVariant	U+590D
+U+8986	kSimplifiedVariant	U+590D U+8986
 U+8987	kSemanticVariant	U+9738<kMeyerWempe
 U+898A	kSemanticVariant	U+7F88<kMatthews
 U+898B	kSimplifiedVariant	U+89C1
$


$ git show 6384e217f7283d12b1bdabbc10da4d2cf2e4f94a
commit 6384e217f7283d12b1bdabbc10da4d2cf2e4f94a
Author: Mike FABIAN 
Date:   Tue Jun 8 13:59:59 2021 +0200

    Fix bug in Unihan_Variants.txt, 只 U+53EA is both simplified *and* traditional Chinese
    
    Resolves: https://github.com/kaio/ibus-table/issues/74 

diff --git a/engine/chinese_variants.py b/engine/chinese_variants.py
index e53e6d6..0b79eea 100644
--- a/engine/chinese_variants.py
+++ b/engine/chinese_variants.py
@@ -1116,7 +1116,7 @@ VARIANTS_TABLE = {
     u'叙': 1,
     u'叠': 1,
     u'叢': 2,
-    u'只': 1,
+    u'只': 3,
     u'台': 3,
     u'叶': 1,
     u'号': 1,
diff --git a/tools/Unihan_Variants.txt b/tools/Unihan_Variants.txt
index a29e627..ab84494 100644
--- a/tools/Unihan_Variants.txt
+++ b/tools/Unihan_Variants.txt
@@ -1695,7 +1695,7 @@ U+53E2	kSemanticVariant	U+6A37<kFenn
 U+53E2	kSimplifiedVariant	U+4E1B
 U+53E8	kSemanticVariant	U+9955<kMeyerWempe
 U+53EA	kSemanticVariant	U+5B50<kLau
-U+53EA	kTraditionalVariant	U+96BB
+U+53EA	kTraditionalVariant	U+53EA U+96BB
 U+53EB	kSemanticVariant	U+544C<kLau,kMatthews,kMeyerWempe
 U+53F0	kSemanticVariant	U+81FA<kHKGlyph,kLau
 U+53F0	kSimplifiedVariant	U+53F0
diff --git a/tools/generate-chinese-variants.py b/tools/generate-chinese-variants.py
index 391b5ef..310d6ae 100755
--- a/tools/generate-chinese-variants.py
+++ b/tools/generate-chinese-variants.py
@@ -273,6 +273,7 @@ TEST_DATA = {
     u'系': 3, # U+7CFB
     u'乾': 3, # U+4E7E
     u'著': 3, # U+8457 Patch by Heiher <r@hev.cc>
+    u'只': 3, # U+53EA, see: https://github.com/kaio/ibus-table/issues/74 
     }
 
 def test_detection(generated_script) -> int:
$
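
The classification described above (simplified only, traditional only, or
both) can be sketched as follows. This is a minimal illustration, not the
actual ibus-table code: the function name `classify` is hypothetical, the
values 1, 2 and 3 follow what the VARIANTS_TABLE entries in the diff suggest
(1 = simplified, 2 = traditional, 3 = both), and the rule that a character
listed as its own variant counts as both is inferred from the patches.

```python
import re

SIMPLIFIED, TRADITIONAL = 1, 2  # bit flags; 3 means "both" (assumed encoding)


def classify(lines):
    """Map each character to 1 (simplified), 2 (traditional) or 3 (both)."""
    table = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith('#'):
            continue
        fields = line.split(None, 2)  # code point, property, value
        if len(fields) != 3:
            continue
        source, prop, value = fields
        if prop == 'kTraditionalVariant':
            flag = SIMPLIFIED    # it has a traditional variant, so it is simplified
        elif prop == 'kSimplifiedVariant':
            flag = TRADITIONAL   # it has a simplified variant, so it is traditional
        else:
            continue
        char = chr(int(source[2:], 16))
        table[char] = table.get(char, 0) | flag
        # A character listed as its own variant is treated as both.
        if source in re.findall(r'U\+[0-9A-F]{4,6}', value):
            table[char] = SIMPLIFIED | TRADITIONAL
    return table


unpatched = classify(["U+53EA\tkTraditionalVariant\tU+96BB"])
patched = classify(["U+53EA\tkTraditionalVariant\tU+53EA U+96BB"])
print(unpatched, patched)
```

With the unpatched line, 只 is classified as simplified only (1); with the
patched line it is classified as both (3).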

Date/Time: Tue Jun 8 23:34:03 CDT 2021
Name: Yi Bai
Report Type: Error Report
Opt Subject: Glyph missing in code chart of CJK Unified Ideographs Extension A

Note: This glyph error will be corrected by the chart editors and will be posted.

In the code chart of CJK Unified Ideographs Extension A in Version 14.0 Beta,
the glyph of 4DB9 with UTC source UTC-00120 is missing. The glyph can be found
in the current 13.0 code chart.

Please update the code chart accordingly, thank you.

Date/Time: Thu Jun 10 09:18:03 CDT 2021
Name: Andrew Christopher West
Report Type: Public Review Issue
Opt Subject: PRI #433 Unicode 14.0.0 Beta

U+2B8D9 and U+2B8DA have the wrong radical and stroke count since the glyph
changes for Unicode 13.0 simplified 來 to 来 in both cases (in retrospect
this was a destabilizing glyph change, and encoding two new characters
would have been a better solution). The code charts for Unicode 14.0 still
show the old radical stroke count of 9.12, but as the new glyph forms no
longer include the 'person' radical (9) they must be assigned to a new
radical. I suggest 'wood' (75) as this is the radical for 来, which would
give 75.9 for both characters.

Date/Time: Thu Jun 10 09:48:33 CDT 2021
Name: Andrew Christopher West
Report Type: Public Review Issue
Opt Subject: PRI #433 Unicode 14.0.0 Beta

Following glyph changes in Unicode 11.0, the following characters have the wrong radical and stroke counts:
U+2B1CD: 132.20 -- should be 132.19
U+2B584: 176.12 -- should be 176.11
U+2B8DE: 9.12 -- should be 9.11
U+2C7C3: 140.15 -- should be 140.14

Following glyph changes in Unicode 13.0, the following characters have the wrong radical and stroke counts:
U+2BD61: 44.10 -- should be 44.9
U+2BE4A: 59.14 -- should be 59.15
U+2BF9D: 64.18 -- should be 64.19
U+2C0B8: 75.7 -- should be 75.8
U+2C142: 75.15 -- should be 75.16
U+2C316: 91.17 -- should be 91.19
U+2C83A: 142.17 -- should be 142.18
U+2CC88: 182'.13 -- should be 182'.16

Date/Time: Fri Jun 11 12:50:02 CDT 2021
Name: Charlotte Buff
Report Type: Public Review Issue
Opt Subject: PRI #433: Other_Lowercase Property for New Modifier Letters

The following characters in the new Latin Extended-F block should be given
the property Other_Lowercase=True for consistency with other similar
modifier letters already encoded:

	U+10780 (MODIFIER LETTER SMALL CAPITAL AA)
	U+10783..U+10785 (MODIFIER LETTER SMALL AE..MODIFIER LETTER SMALL B WITH HOOK)
	U+10787..U+107B0 (MODIFIER LETTER SMALL DZ DIGRAPH..MODIFIER LETTER SMALL V WITH RIGHT HOOK)
	U+107B2..U+107B5 (MODIFIER LETTER SMALL CAPITAL Y..MODIFIER LETTER BILABIAL CLICK)
	U+107BA (MODIFIER LETTER SMALL S WITH CURL)

It is unclear whether the following characters should be classified as
lowercase as well since their base forms are Other_Letter rather than
Lowercase_Letter:

	U+10781..U+10782 (MODIFIER LETTER SUPERSCRIPT TRIANGULAR COLON..MODIFIER LETTER SUPERSCRIPT HALF TRIANGULAR COLON)
	U+107B6..U+107B9 (MODIFIER LETTER DENTAL CLICK..MODIFIER LETTER RETROFLEX CLICK WITH RETROFLEX HOOK)

Date/Time: Mon Jun 14 09:03:19 CDT 2021
Name: Mike FABIAN
Report Type: Error Report
Opt Subject: Some more possible bugs in Unihan_Variants.txt


Recently I reported some problems in Unihan_Variants.txt from 

https://www.unicode.org/Public/14.0.0/ucd/Unihan-14.0.0d4.zip 

I received this bug report about ibus-table classifying some characters
wrongly as simplified only or traditional only:

https://github.com/ibus/ibus/issues/2323 

According to this bug report there may be a few more bugs concerning
kTraditionalVariant and kSimplifiedVariant.

In the above bug report, the user said that

着 U+7740 is used in Hong Kong, 云 U+4E91 is used both in Hong Kong and Taiwan,
裡 U+88E1 and 復 U+5FA9 are used everywhere (i.e. also in simplified Chinese),
采 U+91C7 is used in Hong Kong (The user wrote he was not sure about Taiwan,
but it is probably used in Taiwan as well, as it is listed in
http://dict.revised.moe.edu.tw/cgi-bin/cbdic/gsweb.cgi ),
吓 U+5413 is used in Cantonese, 揾 U+63FE is used in Hong Kong.

The user wrote he doesn’t know where 尸 is used but it is one of the
radicals used on a Cangjie keyboard, so it seems to be used in
traditional Chinese, at least as a radical.

I fixed it like this:

https://github.com/mike-fabian/ibus-table/commit/5ed1cc16b398e0161e63ef35421d94166caf56c0 

I.e. I applied the following changes to Unihan_Variants.txt:

-U+4E91 kTraditionalVariant     U+96F2
+U+4E91 kTraditionalVariant     U+4E91 U+96F2

-U+5413 kTraditionalVariant     U+5687
+U+5413 kTraditionalVariant     U+5413 U+5687

-U+5C38 kTraditionalVariant     U+5C4D
+U+5C38 kTraditionalVariant     U+5C38 U+5C4D

-U+5FA9 kSimplifiedVariant      U+590D
+U+5FA9 kSimplifiedVariant      U+590D U+5FA9

-U+63FE kTraditionalVariant     U+6435
+U+63FE kTraditionalVariant     U+63FE U+6435

-U+7740 kTraditionalVariant     U+8457
+U+7740 kTraditionalVariant     U+7740 U+8457

-U+88E1 kSimplifiedVariant      U+91CC
+U+88E1 kSimplifiedVariant      U+88E1 U+91CC

-U+91C7 kTraditionalVariant     U+57F0 U+63A1
+U+91C7 kTraditionalVariant     U+57F0 U+63A1 U+91C7

Date/Time: Tue Jun 15 06:29:07 CDT 2021
Name: M
Report Type: Other Question, Problem, or Feedback
Opt Subject: FEEDBACK ABOUT UNIHAN DATABASE

1. feedback:
[乹][U+4E79] is a variant of [乾][U+4E7E], [亁][U+4E81] according to the variant list 
(第一批异体字整理表) https://upload.wikimedia.org/wikipedia/commons/2/29/%E7%AC%AC%E4%B8%80%E6%89%B9%E5%BC%82%E4%BD%93%E5%AD%97%E6%95%B4%E7%90%86%E8%A1%A8.pdf 
page 4

2. question:
[卿][U+537F]
kRSUnicode	26.9
kTotalStrokes	10

This doesn't make sense.

SHOULD BE
kRSUnicode	26.8
kTotalStrokes	10
OR
kRSUnicode	26.9
kTotalStrokes	11
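
The inconsistency can be stated as a simple check: the residual stroke count
in kRSUnicode plus the stroke count of the radical should equal
kTotalStrokes. A minimal sketch, assuming a single value per field and a
hypothetical lookup table `RADICAL_STROKES` that here holds only radical 26
(卩, two strokes). Real Unihan data can carry several values per field and
simplified radical forms, so a mismatch only flags a candidate for review.

```python
# Hypothetical table: Kangxi radical number -> stroke count of the radical.
RADICAL_STROKES = {26: 2}


def rs_consistent(rs_unicode, total_strokes):
    """True if radical strokes + residual strokes equals the total stroke count."""
    radical, residual = rs_unicode.split('.')
    radical_number = int(radical.rstrip("'"))  # drop the simplified-radical mark
    return RADICAL_STROKES[radical_number] + int(residual) == total_strokes


print(rs_consistent('26.9', 10))  # current data for U+537F
print(rs_consistent('26.8', 10))  # first suggested correction
print(rs_consistent('26.9', 11))  # second suggested correction
```

The current pair fails the check, while either suggested correction passes.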

Date/Time: Thu Jun 17 07:59:42 CDT 2021
Name: Ken Lunde
Report Type: Error Report
Opt Subject: 21 CJK Unified Ideographs are missing kTotalStrokes property values

The following 21 CJK Unified Ideographs, in the range U+9FD6 through U+9FEA that 
were added in Unicode Version 10.0, are missing kTotalStrokes property values,
and the suggested property values are provided below:

U+9FD6 	kTotalStrokes	7
U+9FD7 	kTotalStrokes	9
U+9FD8 	kTotalStrokes	12
U+9FD9 	kTotalStrokes	13
U+9FDA 	kTotalStrokes	16
U+9FDB 	kTotalStrokes	24
U+9FDC 	kTotalStrokes	22
U+9FDD 	kTotalStrokes	27
U+9FDE 	kTotalStrokes	20
U+9FDF 	kTotalStrokes	12
U+9FE0 	kTotalStrokes	21
U+9FE1 	kTotalStrokes	33
U+9FE2 	kTotalStrokes	14
U+9FE3 	kTotalStrokes	15
U+9FE4 	kTotalStrokes	18
U+9FE5 	kTotalStrokes	22
U+9FE6 	kTotalStrokes	23
U+9FE7 	kTotalStrokes	25
U+9FE8 	kTotalStrokes	27
U+9FE9 	kTotalStrokes	29
U+9FEA 	kTotalStrokes	17

Date/Time: Fri Jun 18 01:50:26 CDT 2021
Name: Lim Hian-tong
Report Type: Public Review Issue
Opt Subject: Issues related to Kana Extended-B (Public Review Issue #433)

This is feedback on Unicode 14.0.0 Beta. I refer to Public Review
Issue #433.

I am writing to request amendments of the code chart for Kana Extended-B, as
shown in the current beta draft of the Unicode Standard, Version 14.0.

The descriptions of U+1AFF0 and U+1AFF8 (“also used for tone six”) should be
removed for the following reasons.

The current descriptions come from document L2/20-209R
(titled “Final proposal to encode Taiwanese kana in the UCS”) by Fredrick
R. Brennan. In the document, the sentence stating “In modern Hokkien, tone
six is equal to tone two” is inconsistent with the source text
(Chiung), which says “It has been observed that tone 6 had merged with tone
2 or tone 7” instead. What’s more, what Chiung has written is also a
misquote from Ang Ui-jin’s “The tonal study of Taiwanese,” which compares
Minnan tones with Middle Chinese ones. Ang does not claim that “tone 6
(of the Minnan language) has merged with tone 2 or tone 7.”

The complicated tone situation originates from the fact that Hokkien
consists of two major dialects, namely Quanzhou and Zhangzhou, with
different phonologies. The two major dialects, along with dialects
descended from them, are spoken in the PRC, in Taiwan and across Southeast
Asia. Quanzhou speakers make a clear distinction between tones 6 and 7
(up until today), while Zhangzhou speakers merge them and assign the merged
tone as “tone 7,” removing “tone 6” from their phonology. Every single word
with tone 6 in Quanzhou is pronounced with tone 7 in Zhangzhou due to the
merger. This can be observed on the Facebook page “Taigikho,” where all
words with tone 6 (marked with a caron) are given as variants of tone 7
(marked with a macron). Detailed explanations of Quanzhou phonology can be
found in various publications in the PRC, including dictionaries,
chorographies and periodicals.

However, certain scholars who only speak the Zhangzhou dialect would attempt
to fill the gap of the non-existent “tone 6” with what they guess
the “assumed historical tone 6” was in the past. This has resulted in a
misconception among folks that the “historical tone 6” in Zhangzhou somehow
became tone 2. Since the Quanzhou tonal system is less common and is not
widely studied in Taiwan, Taiwanese publications tend to adopt the
linguistically incorrect theory. Aside from simply giving the idea, these
publications are not able to list actual examples to support the
misconception because of the impossibility to do so.

Of the printed materials in Taiwan, only scattered studies involving
Quanzhou phonology contain correct linguistic information regarding tones.
Correct descriptions and usage of tone 6 can also be found in the
official “Dictionary of Frequently-Used Taiwan Minnan” by the ROC’s
Ministry of Education and other online resources by Taiwanese researchers
who are familiar with dialectal differences.

In short, tone 2 has absolutely nothing to do with tone 6. “Also used for
tone six” should be considered an inappropriate description for
both “katakana letter Minnan tone-2” and “katakana letter Minnan nasalized
tone-2.” By removing such descriptions, the Unicode documentation will be
much less likely to cause confusion, misunderstandings and disputes.

Date/Time: Fri Jun 18 15:49:09 CDT 2021
Name: Paul Masson
Report Type: Error Report
Opt Subject: kPhonetic for U+96B1

This character appears in group 1483 on p.152 of Casey. The field 
is missing in the database and needs to be added.

Date/Time: Sat Jun 26 09:07:52 CDT 2021
Name: Ken Lunde
Report Type: Public Review Issue
Opt Subject: PRI #433 (Unicode Version 14.0.0 Beta) feedback

The kIRG_VSource property value for U+20307 𠌇 should be changed from 
V4-4131 to VN-20307. IRG Working Set 2017 (aka Extension H) serial number 
00138 will use V4-4131 as its kIRG_VSource property value, and Vietnam 
confirmed that it is correct, and that the kIRG_VSource property value 
for U+20307 𠌇 should be changed to VN-20307. See:

https://hc.jsecs.org/irg/ws2017/app/index.php?id=00138

Date/Time: Sun Jun 27 05:59:21 CDT 2021
Name: M
Report Type: Other Question, Problem, or Feedback
Opt Subject: Suggestion for Unihan Database


You can add "stroke order" information to Unihan Database, e.g. kStroke.
Represented by numbers 1-5, 5 basic strokes in Chinese. It is called 笔顺号码
(stroke order numbers) in Chinese.

㇐:㇑:㇒:㇏:㇖
横:竖:撇:捺:折
héng:shù:piē:nà:zhé
1:2:3:4:5

1: 横 >> ㇐㇀ (from left to right, from bottom left  to top right)
2: 竖 >> ㇑㇚ (from top to bottom)
3: 撇 >> ㇒㇓ (from top right  to bottom left)
4: 捺 >> ㇏㇝㇔ (from top left to bottom right)
5: 折 >> ㇖㇠㇡㇕㇇㇄㇂ (anything else that folds except ㇚ which is included in the 2nd type of stroke)

Let's pick U+7B14 笔 [bǐ] as an example:

It's made of ㇒㇐㇔㇒㇐㇔㇒㇐㇐㇟ [撇横捺撇横捺撇横横折] in stroke order.
So the kStroke value of this character would be 3143143115.

You probably know all this already.
This would help find characters by stroke order quickly.

Thanks
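
The encoding described above can be sketched in a few lines. This is only an
illustration of the suggestion: kStroke is a proposed field, not an existing
Unihan property, and the function name `k_stroke` is hypothetical. Classes 1
to 4 are taken from the list above; any other character in the CJK Strokes
block (U+31C0..U+31EF) is treated as class 5, following the "anything else
that folds" rule.

```python
# Stroke classes 1-4 as listed in the suggestion; everything else folds (5).
STROKE_CLASSES = {
    '1': '㇐㇀',    # 横 héng
    '2': '㇑㇚',    # 竖 shù
    '3': '㇒㇓',    # 撇 piē
    '4': '㇏㇝㇔',  # 捺 nà
}
DIGIT = {stroke: digit
         for digit, strokes in STROKE_CLASSES.items()
         for stroke in strokes}


def k_stroke(strokes):
    """Encode a stroke sequence as a string of digits 1-5."""
    digits = []
    for stroke in strokes:
        if not 0x31C0 <= ord(stroke) <= 0x31EF:
            raise ValueError(f'not a CJK stroke: {stroke!r}')
        digits.append(DIGIT.get(stroke, '5'))  # 折 zhé for all remaining strokes
    return ''.join(digits)


print(k_stroke('㇒㇐㇔㇒㇐㇔㇒㇐㇐㇟'))  # strokes of U+7B14 笔
```

For the stroke sequence of 笔 given above, this yields 3143143115.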

Date/Time: Mon Jun 28 18:50:05 CDT 2021
Name: Peter Constable [MSFT]
Report Type: Public Review Issue
Opt Subject: Emoji 14 counts

The counts for new emoji in v14.0 have some inconsistencies:

1) The table at the bottom of the Emoji Recently Added, v14.0β page
(https://www.unicode.org/emoji/charts-14.0/emoji-released.html) indicates
37 new emoji characters, which matches what is stated on the BETA Unicode
14.0.0 page (https://www.unicode.org/versions/beta-14.0.0.html), and the
draft Unicode 14.0.0 summary page
(https://www.unicode.org/versions/Unicode14.0.0/). However, there are 38
rows on the Emoji Recently Added page for non-zwj-sequence emoji.

The discrepancy appears to be that row 15 is listing U+1F91D (and five
corresponding modifier sequences), which is not new to v14, but rather was
added in Emoji v3.0 / Unicode 9.0.


2) The table at the bottom of the Emoji Recently Added page cites 55
(non-zwj) emoji modifier sequences and a total count of new emoji / emoji
sequences of 112. However, the emoji-test.txt data file at the
https://www.unicode.org/Public/emoji/14.0 cites only 107 items for v14.0,
and the emoji-sequences.txt cites only 55 (non-zwj) emoji modifier
sequences. 

This discrepancy may be due to modifier sequences for 1F91D being
incorrectly counted as part of v14.0: there are five modifier sequence
entries for 1F91D in the emoji-sequences.txt file cited as being from Emoji
v3.0.

Date/Time: Mon Jun 28 16:34:26 CDT 2021
Name: Peter Constable [MSFT]
Report Type: Public Review Issue
Opt Subject: Emoji 14 beta

Unicode and UTC do a decent job during the beta for a new Unicode edition of
helping reviewers see what new characters are being added to the next
version. For example, one can readily browse through the following trail:

Open PRIs: https://www.unicode.org/review/ 
PRI 433, Unicode 14 beta: https://www.unicode.org/review/pri433/ 
Unicode 14 beta: https://www.unicode.org/versions/beta-14.0.0.html 
Unicode 14 summary: https://www.unicode.org/versions/Unicode14.0.0/ 
delta code charts: https://www.unicode.org/charts/PDF/Unicode-14.0/ 

For emoji additions (atomic characters or RGI sequences), it's much harder
to find similar delta information. 

The Unicode 14 summary page has a link to the Emoji Counts page
(https://www.unicode.org/emoji/charts-14.0/emoji-counts.html), but this is
cumulative to the latest (beta) version, not a delta.

The Unicode 14 summary page also has a link to the Emoji Recently Added,
v14.0β page
(https://unicode.org/emoji/charts-14.0/emoji-released.html), which appears
to be the delta in question, though (at least at first glance) it is
unclear how to interpret some of the information. In particular:

- The Unicode 14 summary page cites 37 new emoji, and the table at the
  bottom of the Emoji Recently Added page cites 37 atomic emoji characters,
  yet the preceding chart has 39 rows. A trained eye might notice that row
  16 has a ZWJ sequence, but that still leaves 38 other rows that appear to
  list atomic characters. It's unclear if the count of "37" is off, or if
  this page is listing emoji additions beyond 14.0, or some other issue.
  (Given that rows 15 and 16 both list a short name "handshake", one might
  guess that is the source of the extra count, except that row 16 is the
  ZWJ sequence, so already accounted for.)

(Some other issues with the Emoji Recently Added page: Unlike delta code
charts for The Unicode Standard, this page gives candidate "reference
numbers" Xnnnnn, not code points. Also, the sample images in rows 15 and 16
appear to be wrong.)

One might happen upon the Emoji List, v14.0 page
(https://www.unicode.org/emoji/charts-14.0/emoji-list.html) and search for
occurrences of "⊛" to find recent additions, but there are 38 instances,
not 37. Since "recently-added" is not exactly the same as "added in v.14",
things are still unclear. 

Or, one might happen upon the Draft Emoji Candidates page
(https://www.unicode.org/emoji/future/emoji-candidates.html), which appears
to have the same list of emoji as in the Emoji Recently Added page, but
doesn't mention v14.0 at all.

Turning to other sources, one could go to PU UTS #51
(https://www.unicode.org/reports/tr51/tr51-20.html) to find the emoji data
and, after correcting for the fact that the ".../latest/..." URL lands at
v13.1 data, navigate to the emoji 14.0 data folder
(https://www.unicode.org/Public/emoji/14.0/), then look in the data files
for E14.0 additions. The emoji-zwj-sequences.txt file has 20 occurrences
of "E14.0", which matches the count in the table at the bottom of the Emoji
Recently Added page; but the emoji-sequences.txt file has 61 occurrences
of "E14.0", which doesn't correspond to counts given elsewhere; of course,
that's because some rows have ranges, and if one adds up the range counts
for Basic_Emoji, that does add up to 37. But this isn't the easiest way to
point reviewers to the new emoji sequences for 14.0, and the 37 vs. 38
discrepancy remains.
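
The range arithmetic mentioned above can be done mechanically. A minimal
sketch, assuming the usual layout of emoji-sequences.txt (code points or a
`first..last` range, then a type field, then a description, with the version
in the trailing comment). The function name `count_by_version` is
hypothetical, and the sample lines are written in the file's format for
illustration only; they are not verbatim copies of the data file.

```python
from collections import Counter


def count_by_version(lines, version='E14.0'):
    """Count emoji per type field for one version, expanding ranges."""
    counts = Counter()
    for line in lines:
        data, _, comment = line.partition('#')
        if version not in comment.split():
            continue
        fields = [field.strip() for field in data.split(';')]
        if len(fields) < 2:
            continue
        code_points, type_field = fields[0], fields[1]
        if '..' in code_points:
            first, last = (int(cp, 16) for cp in code_points.split('..'))
            counts[type_field] += last - first + 1  # a range counts every member
        else:
            counts[type_field] += 1
    return counts


sample = [
    '1F6DD..1F6DF ; Basic_Emoji ; playground slide..ring buoy # E14.0 [3]',
    '1FAC3 1F3FB ; RGI_Emoji_Modifier_Sequence ; pregnant man: light skin tone # E14.0 [1]',
    '1F91D 1F3FB ; RGI_Emoji_Modifier_Sequence ; handshake: light skin tone # E3.0 [1]',
]
print(count_by_version(sample))
```

Counting occurrences of "E14.0" treats the first sample line as one item,
whereas expanding the range counts it as three.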

Date/Time: Tue Jun 29 21:21:24 CDT 2021
Name: Ryusei Yamaguchi
Report Type: Public Review Issue
Opt Subject: PRI #433 Unicode 14.0.0 Beta


Unihan_IRGSources.txt from https://www.unicode.org/Public/14.0.0/ucd/Unihan-14.0.0d4.zip  
has some bugs: the kTotalStrokes values for the following characters are not the exact stroke counts.

UCS,char,current,correct
U+21FE8,𡿨,3,1 
U+248E5,𤣥,5,4
U+2634D,𦍍,6,5
U+264D0,𦓐,6,5
U+26612,𦘒,6,5
U+27607,𧘇,5,4
U+2795B,𧥛,7,6
U+2795C,𧥜,7,6
U+27C27,𧰧,7,6
U+27C28,𧰨,7,6
U+28210,𨈐,7,5
U+28211,𨈑,7,6

Date/Time: Fri Jul 2 03:17:10 CDT 2021
Name: M
Report Type: Other Question, Problem, or Feedback
Opt Subject: Missing Data in Unihan Database

Found some missing simplification data in Unihan_Variants

U+44D6	kTraditionalVariant	U+85ED
U+4E86	kTraditionalVariant	U+77AD
U+4F19	kTraditionalVariant	U+5925
U+501F	kTraditionalVariant	U+85C9
U+51AC	kTraditionalVariant	U+9F15
U+5343	kTraditionalVariant	U+97C6
U+535C	kTraditionalVariant	U+8514
U+5377	kTraditionalVariant	U+6372
U+5401	kTraditionalVariant	U+7C72
U+5408	kTraditionalVariant	U+95A4
U+56DE	kTraditionalVariant	U+8FF4
U+59DC	kTraditionalVariant	U+8591
U+5BB6	kTraditionalVariant	U+50A2
U+5CC3	kTraditionalVariant	U+5DA8
U+5EBC	kTraditionalVariant	U+5ECE
U+624D	kTraditionalVariant	U+7E94
U+6298	kTraditionalVariant	U+647A
U+65CB	kTraditionalVariant	U+93C7
U+6731	kTraditionalVariant	U+7843
U+7076	kTraditionalVariant	U+7AC8
U+79CB	kTraditionalVariant	U+97A6
U+8499	kTraditionalVariant	U+61DE U+6FDB U+77C7
U+8511	kTraditionalVariant	U+884A
U+9709	kTraditionalVariant	U+9EF4

U+85ED	kSimplifiedVariant	U+44D6
U+77AD	kSimplifiedVariant	U+4E86
U+5925	kSimplifiedVariant	U+4F19
U+85C9	kSimplifiedVariant	U+501F
U+9F15	kSimplifiedVariant	U+51AC
U+97C6	kSimplifiedVariant	U+5343
U+8514	kSimplifiedVariant	U+535C
U+6372	kSimplifiedVariant	U+5377
U+7C72	kSimplifiedVariant	U+5401
U+95A4	kSimplifiedVariant	U+5408
U+8FF4	kSimplifiedVariant	U+56DE
U+8591	kSimplifiedVariant	U+59DC
U+50A2	kSimplifiedVariant	U+5BB6
U+5DA8	kSimplifiedVariant	U+5CC3
U+5ECE	kSimplifiedVariant	U+5EBC
U+7E94	kSimplifiedVariant	U+624D
U+647A	kSimplifiedVariant	U+6298
U+93C7	kSimplifiedVariant	U+65CB
U+7843	kSimplifiedVariant	U+6731
U+7AC8	kSimplifiedVariant	U+7076
U+97A6	kSimplifiedVariant	U+79CB
U+61DE	kSimplifiedVariant	U+8499
U+6FDB	kSimplifiedVariant	U+8499
U+77C7	kSimplifiedVariant	U+8499
U+884A	kSimplifiedVariant	U+8511
U+9EF4	kSimplifiedVariant	U+9709
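
The two lists above mirror each other, and that symmetry can be checked
mechanically. A minimal sketch, assuming whitespace-separated lines in the
Unihan_Variants.txt layout; the function name `missing_reciprocals` is
hypothetical, and whether every pair ought to be reciprocal is a question
about the data that this check does not answer.

```python
import re

INVERSE = {
    'kTraditionalVariant': 'kSimplifiedVariant',
    'kSimplifiedVariant': 'kTraditionalVariant',
}


def missing_reciprocals(lines):
    """List (source, property, target) entries needed to make the data symmetric."""
    entries = set()
    for line in lines:
        fields = line.split(None, 2)  # code point, property, value
        if len(fields) != 3 or fields[1] not in INVERSE:
            continue
        source, prop, value = fields
        for target in re.findall(r'U\+[0-9A-F]{4,6}', value):
            entries.add((source, prop, target))
    return sorted(
        (target, INVERSE[prop], source)
        for source, prop, target in entries
        if target != source  # self-references need no reciprocal
        and (target, INVERSE[prop], source) not in entries
    )


print(missing_reciprocals([
    'U+8499\tkTraditionalVariant\tU+61DE U+6FDB U+77C7',
    'U+61DE\tkSimplifiedVariant\tU+8499',
    'U+6FDB\tkSimplifiedVariant\tU+8499',
]))
```

In this sample the reverse entry for U+77C7 is left out on purpose, so it is
the only line reported as missing.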

Date: Mon, 5 Jul 2021 12:19:48 -0400
Name: Daniel Yacob
Subject: 3 Name Defects in Ethiopic Extended-B Tables

I was just working with the table for the Ethiopic Extended-B range,
published under the U14 Beta delta listing here:
https://www.unicode.org/charts/PDF/Unicode-14.0/ 

I found that a few names were off, I believe the error originates from the
UniBook output that I submitted earlier this year.  The defects are:

1E7E9 ETHIOPIC SYLLABLE HWI
1E7EA ETHIOPIC SYLLABLE HWEE
1E7EB ETHIOPIC SYLLABLE HWE

In each case the name base "H" should have been "HH", the corrected names:

1E7E9 ETHIOPIC SYLLABLE HHWI
1E7EA ETHIOPIC SYLLABLE HHWEE
1E7EB ETHIOPIC SYLLABLE HHWE

I apologize for this.  The names are correct in our proposal L2/21-037
(https://www.unicode.org/L2/L2021/21037-gurage-adds.pdf) and I think the
difference simply stems from a typographical error that I made while
working with UniBook.

thank you,

-Daniel

Date/Time: Thu Jul 8 20:01:46 CDT 2021
Name: philip r brenan
Report Type: Error Report
Opt Subject: ORNATE LEFT PARENTHESIS should be Ps ?

FD3E;ORNATE LEFT PARENTHESIS;Pe;0;ON;;;;;N;;;;;
FD3F;ORNATE RIGHT PARENTHESIS;Ps;0;ON;;;;;N;;;;;

Possibly the Pe and Ps are the wrong way around?
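
Mismatches of this kind can be searched for with a simple name-based
heuristic over UnicodeData.txt lines. A minimal sketch; the function name
`suspicious_brackets` is hypothetical, and a hit is only a candidate for
review, since a name and a General_Category may differ deliberately (for
example in right-to-left scripts).

```python
def suspicious_brackets(lines):
    """Flag entries whose name says LEFT but gc is Pe, or RIGHT but gc is Ps."""
    flagged = []
    for line in lines:
        fields = line.split(';')
        if len(fields) < 3:
            continue
        code_point, name, category = fields[0], fields[1], fields[2]
        words = name.split()
        if ('LEFT' in words and category == 'Pe') or \
           ('RIGHT' in words and category == 'Ps'):
            flagged.append((code_point, name, category))
    return flagged


print(suspicious_brackets([
    'FD3E;ORNATE LEFT PARENTHESIS;Pe;0;ON;;;;;N;;;;;',
    'FD3F;ORNATE RIGHT PARENTHESIS;Ps;0;ON;;;;;N;;;;;',
    '0028;LEFT PARENTHESIS;Ps;0;ON;;;;;Y;OPENING PARENTHESIS;;;;',
]))
```

Both lines quoted above are flagged, while U+0028 is not.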

Date/Time: Fri Jul 9 18:46:29 CDT 2021
Name: Martin J. Dürst
Report Type: Error Report
Opt Subject: Data files: Emoji Version Mismatch


[This talks about version 13.0, but is very relevant to version 14.0 (now in
beta), too.]

I'm currently working on updating Ruby from Emoji 13.0 to Emoji 13.1
(see https://bugs.ruby-lang.org/issues/18029).

That works for the files in https://www.unicode.org/Public/emoji/13.1/,
which all say they are for version 13.1. But it doesn't work for the files
moved to https://www.unicode.org/Public/13.0.0/ucd/emoji/, because these
files say "# Version: 13.0". Ruby keeps and provides both a Unicode
version and an Emoji version (available in Ruby via RbConfig::CONFIG
['UNICODE_VERSION'] and RbConfig::CONFIG['UNICODE_EMOJI_VERSION']). But
neither of them matches 13.0. For the files moved under
https://www.unicode.org/Public/13.0.0/ucd/emoji/, they really should
indicate the Unicode version, not the Emoji version, because they are
updated in sync with Unicode versions, and not updated when only Emoji
versions get updated.

Date/Time: Mon Jul 12 02:37:11 CDT 2021
Name: Martin J. Dürst
Report Type: Public Review Issue
Opt Subject: Issue 433: Unicode Version 14.0.0 public review: Results from testing on Ruby

This is to report that I have not found any bugs or issues in the Unicode
14.0.0 public beta when temporarily upgrading the programming language Ruby
to Unicode 14.0.0. This does not mean that any new characters or
properties, or changed property values have been checked for
appropriateness. It just means that as far as they are used, the data files
and the test data files (e.g. for normalization) provided for the new
version 14.0.0 are consistent as far as such consistency is checked when
testing the relevant facilities in Ruby. In case you are interested in
further details, please feel free to contact me.

Date/Time: Mon Jul 12 17:38:53 CDT 2021
Name: Peter Constable
Report Type: Public Review Issue
Opt Subject: UAX44, UTR23 and "string property"

The term "string property" is potentially ambiguous: it might mean a
property over the domain of strings, or a property with a co-domain of
strings, or both. 

UAX #44 appears to use "string property" to mean a property with a co-domain
of strings. E.g., "String properties are typically mappings from a Unicode
code point to another Unicode code point or sequence of Unicode code
points..."

PU UTR #23 introduces the notion of properties of strings (strings as
domain), and avoids the term "string property", using instead "property
applied to strings" or "property of strings". In the case of properties
with co-domain of strings, it uses clear wording, "string-valued
properties". This is helpful and good.

PU UTR #23 also calls out the terminology issue that exists in UAX #44: 

"Note: Properties classed in [UCDDoc] as type "String" are string-valued
 properties." 

PU UAX #44, however, does not provide similar clarification and
disambiguation. It should, particularly given that Unicode standards
closely associated with The Unicode Standard will include properties of
strings, and one could argue that UCD itself has properties with a domain
of string (e.g., StandardizedVariants.txt as a mapping from an enumerated
set of strings to boolean True).

Date/Time: Mon Jul 12 18:53:01 CDT 2021
Contact: dwanders@sonic.net
Name: Debbie Anderson
Report Type: Public Review Issue
Opt Subject: Glyph error U+FD44


I found an error in Arabic Pres Forms-A: the glyphs for FD43 and FD44 are 
the same. FD44 is incorrect.
(See https://www.unicode.org/L2/L2019/19289r-arabic-honorifics.pdf)

Date/Time: Tue Jul 13 16:02:19 CDT 2021
Name: Kent Karlsson
Report Type: Error Report
Opt Subject: BidiMirroring.txt

∉ ∌ # NOT AN ELEMENT OF
∌ ∉ # DOES NOT CONTAIN AS MEMBER

These should get the annotation [BEST FIT].

Date/Time: Wed Jul 14 05:08:18 CDT 2021
Name: Kent Karlsson
Report Type: Public Review Issue
Opt Subject: NamesList.txt

Proposed additional comments to NamesList.txt (marked with "proposed new
comment" on each proposed addition):

263D	FIRST QUARTER MOON
	= alchemical symbol for silver
	x (first quarter moon symbol - 1F313)
	* a crescent, not the first quarter   proposed new comment

263E	LAST QUARTER MOON
	= alchemical symbol for silver
	x (power sleep symbol - 23FE)
	x (last quarter moon symbol - 1F317)
	x (crescent moon - 1F319)
	* a crescent, not the last quarter   proposed new comment


1F311	NEW MOON SYMBOL
	x (black circle - 25CF)
1F312	WAXING CRESCENT MOON SYMBOL
	* waning crescent moon in the southern hemisphere   proposed new comment
1F313	FIRST QUARTER MOON SYMBOL
	= half moon
	x (circle with left half black - 25D0)
	x (first quarter moon - 263D)
	* last quarter moon in the southern hemisphere   proposed new comment
1F314	WAXING GIBBOUS MOON SYMBOL
	= waxing moon
	* waning gibbous moon in the southern hemisphere   proposed new comment
1F315	FULL MOON SYMBOL
	x (white circle - 25CB)
1F316	WANING GIBBOUS MOON SYMBOL
	* waxing gibbous moon in the southern hemisphere   proposed new comment
1F317	LAST QUARTER MOON SYMBOL
	x (circle with right half black - 25D1)
	x (last quarter moon - 263E)
	* first quarter moon in the southern hemisphere   proposed new comment
1F318	WANING CRESCENT MOON SYMBOL
	* waxing crescent moon in the southern hemisphere   proposed new comment

Date/Time: Wed Jul 14 05:11:02 CDT 2021
Name: Kent Karlsson
Report Type: Public Review Issue
Opt Subject: emoji-variation-sequences.txt

Proposed additions to emoji-variation-sequences.txt.

Apparently the emoji style is default, but in calendars it would usually be the text style.

Note that FULL MOON SYMBOL is already covered. One might add the crescent and gibbous ones,
but they are not common in calendars.

1F311 FE0E ; text style;  # (6.0) NEW MOON SYMBOL
1F311 FE0F ; emoji style; # (6.0) NEW MOON SYMBOL

1F313 FE0E ; text style;  # (6.0) FIRST QUARTER MOON SYMBOL
1F313 FE0F ; emoji style; # (6.0) FIRST QUARTER MOON SYMBOL

1F317 FE0E ; text style;  # (6.0) LAST QUARTER MOON SYMBOL
1F317 FE0F ; emoji style; # (6.0) LAST QUARTER MOON SYMBOL
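
For illustration, the sequences above are simply the base character followed
by a variation selector. A minimal sketch with hypothetical helper names
(`with_style`, `as_data_line`); whether a renderer honors such a sequence
depends on font and platform support, and the three pairs above are proposals
in this message, not existing entries in the data file.

```python
TEXT_STYLE = '\uFE0E'   # VARIATION SELECTOR-15, requests text presentation
EMOJI_STYLE = '\uFE0F'  # VARIATION SELECTOR-16, requests emoji presentation


def with_style(code_point, selector):
    """Return the base character followed by the given variation selector."""
    return chr(code_point) + selector


def as_data_line(sequence):
    """Format a sequence as space-separated hex code points, as in the data file."""
    return ' '.join(f'{ord(ch):04X}' for ch in sequence)


for base in (0x1F311, 0x1F313, 0x1F317):
    print(as_data_line(with_style(base, TEXT_STYLE)),
          as_data_line(with_style(base, EMOJI_STYLE)))
```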

Date/Time: Wed Jul 14 14:35:53 CDT 2021
Name: Kent Karlsson
Report Type: Public Review Issue
Opt Subject: Emoji handedness

Looking at https://www.unicode.org/emoji/charts-14.0/full-emoji-modifiers.html,
there seems to be a preference for right hand (also across vendors). There are
some, not so many, "hands" that are apparently left hand. Some have their handedness
fixed by the name of the emoji, but most do not.

Is there any policy regarding handedness? If so which? Or are there any plans for
"handedness modifiers"?

Some hand gestures, though not so common, are meaningful only with a particular hand.

Someone may send a left-hand wave (say), but the receiver may get a right-hand
one. Often it might not matter, but sometimes it could matter; if for nothing else,
the sender may have a personal handedness preference not only in real life but also
for emoji.