Accumulated Feedback on PRI #442

This page is a compilation of formal public feedback received so far. See Feedback for further information on this issue, how to discuss it, and how to provide feedback.

Date/Time: Sat Feb 12 14:53:23 CST 2022
Name: Charlotte Buff
Report Type: Public Review Issue
Opt Subject: 442

Proposed character U+1FAAF KHANDA is in every aspect identical to existing
character U+262C ADI SHAKTI – and deliberately so. I am aware of the reason
behind this duplicate encoding, as the ESC no longer wishes to emojify
already assigned codepoints.

However, if the concept of character identity still means anything, a
formal, immutable relationship should at least be established between
these two identical characters. I propose defining a compatibility
decomposition mapping from KHANDA to ADI SHAKTI.

1FAAF;KHANDA;So;0;ON;<compat> 262C;;;;N;;;;;

This will not interfere with U+1FAAF’s main use as an emoji – twenty-two
other emoji characters already have decomposition mappings without issue,
and one even has a case mapping on top of that – but would allow systems to
formally recognise that KHANDA is in fact nothing more than a stylistic
variant of ADI SHAKTI and subsequently treat them as equivalent for certain
purposes such as loose-match searching. If one were to search a document
for all instances of “the Khanda symbol”, it would make little sense for
half of all such instances to be randomly omitted from the results.

While the preliminary code chart glyph for KHANDA looks distinct from ADI
SHAKTI (the former being drawn as an outline and the latter as solid), this
difference is not actually part of either character’s identity but simply a
whim of the glyph designer. A font supporting both ADI SHAKTI and KHANDA
(and these *are* going to exist) has no reason to make these characters
look distinct from one another because they are after all the exact same
symbol and only encoded separately due to some technicality involving the
new guidelines for emoji submissions.

Therefore, as there is no discernible difference between these two
characters, they should be compatibility equivalents of each other.
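
The loose-match search described in this report could be sketched roughly as follows. This is a minimal illustration, not an existing API: the names `PROPOSED_FOLDS`, `loose_key`, and `loose_find` are hypothetical, and the KHANDA to ADI SHAKTI mapping is hard-coded because it is only a proposal, not published Unicode data. Existing compatibility mappings are picked up through NFKC normalization.

```python
import unicodedata

# Hypothetical extra folding: the KHANDA -> ADI SHAKTI mapping proposed above.
# It is NOT part of any published Unicode data; it is hard-coded for illustration.
PROPOSED_FOLDS = {0x1FAAF: 0x262C}


def loose_key(text: str) -> str:
    """Return a search key: NFKC-normalize, then apply the proposed fold."""
    return unicodedata.normalize("NFKC", text).translate(PROPOSED_FOLDS)


def loose_find(needle: str, haystack: str) -> list[int]:
    """Return the start offsets of needle within the folded haystack."""
    key, folded = loose_key(needle), loose_key(haystack)
    hits, start = [], folded.find(key)
    while start != -1:
        hits.append(start)
        start = folded.find(key, start + 1)
    return hits
```

With this sketch, searching for U+262C in a string that contains both U+1FAAF and U+262C reports both occurrences. Note that the offsets refer to the folded text, which can differ from the original when NFKC changes the string length.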

Date/Time: Mon Feb 14 05:12:36 CST 2022
Name: r12a
Report Type: Public Review Issue
Opt Subject: 442

Note: This issue has already been fixed in later drafts.

https://www.unicode.org/charts/PDF/Unicode-15.0/U150-13430.pdf 
has the subtitle "Damaga" instead of "Damaged".

Date/Time: Tue Feb 15 05:08:45 CST 2022
Name: Denis Moyogo Jacquerye
Report Type: Public Review Issue
Opt Subject: 442

The Latin letters for Malayalam transliteration 1DF25..1DF2A have "WITH
MID-HEIGHT LEFT HOOK" in their names, for example 1DF25 "LATIN SMALL LETTER
D WITH MID-HEIGHT LEFT HOOK".

This is inconsistent with 1DF11 "L WITH FISHHOOK", which has an identical
(but horizontally mirrored) fishhook.

Either:

1. the 1DF25..1DF2A character names should use "WITH RIGHT FISHHOOK"
(or "WITH FISHHOOK" when there is no name conflict) to reflect the use of
the same shape as 1DF11 L WITH FISHHOOK but mirrored to the right (if this
is the same FISHHOOK as other characters with FISHHOOK in their names)

2. 1DF11 L WITH FISHHOOK should have a note indicating it is composed of
a "MID-HEIGHT RIGHT HOOK" instead of a "FISHHOOK" (if this is not the same
FISHHOOK as other characters with FISHHOOK in their names).

The following character names also have "WITH FISHHOOK" in their names:

ɾ 027E LATIN SMALL LETTER R WITH FISHHOOK
ɿ 027F LATIN SMALL LETTER REVERSED R WITH FISHHOOK
ʮ 02AE LATIN SMALL LETTER TURNED H WITH FISHHOOK
ʯ 02AF LATIN SMALL LETTER TURNED H WITH FISHHOOK AND TAIL
ᵳ 1D73 LATIN SMALL LETTER R WITH FISHHOOK AND MIDDLE TILDE
𐞩 107A9 MODIFIER LETTER SMALL R WITH FISHHOOK
𝼑 1DF11 LATIN SMALL LETTER L WITH FISHHOOK
𝼖 1DF16 LATIN SMALL LETTER R WITH FISHHOOK AND PALATAL HOOK

Date/Time: Mon Feb 28 11:36:43 CST 2022
Name: Charlotte Buff
Report Type: Public Review Issue
Opt Subject: 442

Proposed character U+13440 EGYPTIAN HIEROGLYPH MIRROR HORIZONTALLY currently
has general category Cf (Format). A more appropriate value would be Mn
(Nonspacing_Mark).

U+13440 applies only to the hieroglyph immediately preceding it, with
possibly a variation selector for rotation intervening. As such, it behaves
like a combining mark. Its line break property value would also need to
change from GL (Glue) to CM (Combining Mark) accordingly, and its
bidirectional class from L (Left_to_Right) to NSM (Nonspacing_Mark).

Date/Time: Mon Feb 28 11:52:19 CST 2022
Name: Charlotte Buff
Report Type: Public Review Issue
Opt Subject: 442

Proposed character U+1E04E MODIFIER LETTER CYRILLIC SMALL BARRED O currently
decomposes to U+04D9 CYRILLIC SMALL LETTER SCHWA, but the correct mapping
would be to U+04E9 CYRILLIC SMALL LETTER BARRED O.

Date/Time: Sat Mar 5 09:23:50 CST 2022
Name: Ken Lunde
Report Type: Public Review Issue
Opt Subject: 442

Add 113.0 as a second kRSUnicode property value for U+31350 (Extension H). 
One of its sources, UTC-03123, specifies 113.0 as its radical-stroke count 
in its USourceData.txt record.

Date/Time: Mon Mar 7 14:37:54 CST 2022
Name: Eduardo Marín Silva
Report Type: Public Review Issue
Opt Subject: 442


With the addition of 11240 KHOJKI LETTER SHORT I, there is now a new 
confusable sequence: 11202 is now confusable with 11240+1122C. I write 
this as a reminder to mention this in the relevant documentation.
I would also like to link my recently submitted document reviewing the 
alpha code charts: https://www.unicode.org/L2/L2022/22056-uni15-alpha-resp.pdf 

Date/Time: Fri Mar 4 13:21:23 CST 2022
Name: Gregg Tavares
Report Type: Error Report
Opt Subject: PropList-15.0.0d2.txt

Is this correct?

FD3E          ; Pattern_Syntax # Pe       ORNATE LEFT PARENTHESIS
FD3F          ; Pattern_Syntax # Ps       ORNATE RIGHT PARENTHESIS

All the other brackets and parentheses have Ps for left and Pe for right.
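
The check behind this question could be automated along the following lines. This is only a sketch: it parses the two entries quoted above, and the rule "LEFT pairs with Ps, RIGHT pairs with Pe" is the naive expectation stated in the report, not a rule of the standard. The names `ENTRIES`, `EXPECTED`, and `unexpected_pairings` are hypothetical. The function flags entries for human review and does not decide whether the data is in error.

```python
import re

# The two entries quoted in the report above, copied verbatim.
ENTRIES = """\
FD3E          ; Pattern_Syntax # Pe       ORNATE LEFT PARENTHESIS
FD3F          ; Pattern_Syntax # Ps       ORNATE RIGHT PARENTHESIS
"""

# code point ; property name # general category, then the character name
LINE = re.compile(r"^([0-9A-F]+)\s*;\s*(\w+)\s*#\s*(\w\w)\s+(.+)$")

# Naive expectation used by this sketch: LEFT pairs with Ps, RIGHT with Pe.
EXPECTED = {"LEFT": "Ps", "RIGHT": "Pe"}


def unexpected_pairings(text: str) -> list[tuple[str, str, str]]:
    """Return (code point, category, name) for entries that break the naive rule."""
    flagged = []
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if not match:
            continue  # skip blank lines and anything that is not a data entry
        code, _prop, category, name = match.groups()
        for word, expected in EXPECTED.items():
            if word in name.split() and category != expected:
                flagged.append((code, category, name))
    return flagged
```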

Date/Time: Fri Apr 1 17:32:18 CDT 2022
Name: Eduardo Marín Silva
Report Type: Public Review Issue
Opt Subject: 442

Under the symbol for Gonggong 1F77D I recommend adding a reference to 共 5171.

Date/Time: Fri Apr 8 17:49:03 CDT 2022
Name: Norbert Lindenberg
Report Type: Public Review Issue
Opt Subject: 442

The data for Kawi in the LineBreak.txt draft for Unicode 15 uses the South
East Asian style of context analysis for line breaking. This style implies
that a complex context-dependent analysis is required for Kawi. That is not
actually the case, as the proposal L2/20-284R documents line breaking at
orthographic syllable boundaries. That style of line breaking however isn't
actually supported in the Unicode line breaking algorithm in Unicode 15
yet.

For now, Kawi syllables should use the Western style to align with the
script's descendants Javanese, Balinese, and Sundanese. For punctuation, I
suggest using the values proposed in L2/22-080.

I propose the following changes:

11F00..11F01;SA   # Mn     [2] KAWI SIGN CANDRABINDU..KAWI SIGN ANUSVARA
→ change to CM
11F02;SA          # Lo         KAWI SIGN REPHA
→ change to AL
11F03;SA          # Mc         KAWI SIGN VISARGA
→ change to CM
11F04..11F10;SA   # Lo    [13] KAWI LETTER A..KAWI LETTER O
→ change to AL
11F12..11F33;SA   # Lo    [34] KAWI LETTER KA..KAWI LETTER JNYA
→ change to AL
11F34..11F35;SA   # Mc     [2] KAWI VOWEL SIGN AA..KAWI VOWEL SIGN ALTERNATE AA
→ change to CM
11F36..11F3A;SA   # Mn     [5] KAWI VOWEL SIGN I..KAWI VOWEL SIGN VOCALIC R
→ change to CM
11F3E..11F3F;SA   # Mc     [2] KAWI VOWEL SIGN E..KAWI VOWEL SIGN AI
→ change to CM
11F40;SA          # Mn         KAWI VOWEL SIGN EU
→ change to CM
11F41;SA          # Mc         KAWI SIGN KILLER
→ change to CM
11F42;SA          # Mn         KAWI CONJOINER
→ change to CM
11F43..11F4F;SA   # Po    [13] KAWI DANDA..KAWI PUNCTUATION CLOSING SPIRAL
→ change to BA for 11F43..11F44
→ change to ID for 11F45..11F4F
11F50..11F59;NU   # Nd    [10] KAWI DIGIT ZERO..KAWI DIGIT NINE
→ keep

Date/Time: Sun Apr 17 12:19:45 CDT 2022
Name: extc
Report Type: Public Review Issue
Opt Subject: PRI #442

Two characters of CJK Extension H are unifiable with existing CJK Unified Ideographs.

The first one is U+31F4C ⿱艹⿻廾丶, which is unifiable with U+2CECB 𬻋 = ⿱サ⿸サ丶
Both characters are short forms of 菩提 used in Buddhist scriptures. They are identical
in meaning and are unifiable variants.

The second one is U+31F68 ⿱艹邦, which is unifiable with U+26C25 𦰥 = ⿱艹⿰龵⻏
because ⿰龵⻏ is a variant of 邦, which can be found in the Unicode Ideographic Variation Database
(U+90A6 U+E0101).

Please consider these 2 cases, thank you.