The sections below contain links to permanent feedback documents for the open Public Review Issues, as well as other public feedback received from October 25, 2024 through January 2, 2025. The previous cumulative document was issued prior to UTC #182 (October 24, 2024).
The links below go directly to open PRIs and to feedback documents for them, as of January 2, 2025.
The links below go to locations in this document for feedback.
Feedback routed to CJK & Unihan Working Group for evaluation [CJK]
Feedback routed to Script Encoding Working Group for evaluation [SEW]
Feedback routed to Properties & Algorithms Working Group for evaluation [PAG]
Feedback routed to Emoji Standard & Research Working Group for evaluation [ESC]
Feedback routed to Editorial Working Group for evaluation [EDC]
Other Reports
Date/Time: Thu Oct 31 10:11:14 CDT 2024
ReportID: ID20241031101114
Name: Andrew West
Report Type: Error Report
Opt Subject: Unihan_Variants.txt
In Unihan_Variants.txt there are these two entries:
    U+47B6 kSimplifiedVariant U+2C985
    U+2C985 kTraditionalVariant U+47B6
However, U+47B6 䞶 (⿺走易) does not simplify to U+2C985 𬦅 (⿺走𠃓), which is actually the simplified form of U+27F2E 𧼮 (⿺走昜). Therefore remove these two entries:
    U+47B6 kSimplifiedVariant U+2C985
    U+2C985 kTraditionalVariant U+47B6
And add these two entries:
    U+27F2E kSimplifiedVariant U+2C985
    U+2C985 kTraditionalVariant U+27F2E
In addition the kMandarin and kHanyuPinyin readings in Unihan_Readings.txt are swapped for U+47B6 and U+27F2E:
    U+27F2E kHanyuPinyin 53492.080:tì
    U+27F2E kMandarin tì
    U+47B6 kHanyuPinyin 53494.110:tāng,tàng
    U+47B6 kMandarin tāng
U+27F2E should have the readings tāng and tàng (cf. kFanqie = 吐郎, kJapanese = トウ). U+47B6 should have the reading tì (cf. kJapanese = テキ チャク).
Date/Time: Wed Nov 20 20:54:41 CST 2024
ReportID: ID20241120205441
Name: Paul Masson
Report Type: Error Report
Opt Subject: Chinese encoding of U+48CB 䣋
The Chinese encoding of this character is currently ⿰釆阝, when it should be ⿰采阝.
Date/Time: Fri Dec 27 09:21:47 CST 2024
ReportID: ID20241227092147
Name: Eduardo Marín Silva
Report Type: Public Review Issue
Opt Subject: On the naming of some musical symbols
After inspecting the glyphs of certain symbols, I want to suggest names that better describe their glyphs for some of them:
    MUSICAL SYMBOL FLAT WITH STROKE -> MUSICAL SYMBOL FLAT WITH VERTICAL STROKE
    MUSICAL SYMBOL FLAT WITH DOUBLE STROKE -> MUSICAL SYMBOL FLAT WITH DOUBLE VERTICAL STROKE
    MUSICAL SYMBOL ARABIC THREE QUARTER TONE FLAT -> MUSICAL SYMBOL FLAT WITH DOUBLE HORIZONTAL STROKE
    MUSICAL SYMBOL HALF SHARP WITH STROKE -> MUSICAL SYMBOL HALF SHARP WITH LONG HORIZONTAL STROKE
    MUSICAL SYMBOL SHARP WITH STROKE -> MUSICAL SYMBOL SHARP WITH LONG HORIZONTAL STROKE
Date/Time: Fri Dec 27 11:09:09 CST 2024
ReportID: ID20241227110909
Name: Eduardo Marín Silva
Report Type: Public Review Issue
Opt Subject: On the new buzz roll combining character
The character that is provisionally assigned 1D25F follows the same encoding model as the tremolos (being applied on top of the stem that is itself a combining character). But since this character is never applied in the absence of a stem, it would make more sense to treat it as another kind of stem: MUSICAL SYMBOL COMBINING STEM WITH BUZZ ROLL. I understand that this is not the case for the family of tremolos, but the new ones have to follow the same model as their predecessors for stability. Frankly, if I had been around I would have insisted on encoding them as COMBINING STEM WITH TREMOLO, but it's too late for that. Also frankly, the current encoding model is not that problematic (mine only saves one codepoint by basically combining two characters), but I'm wondering if all further modified stems will be encoded like this.
Date/Time: Tue Jan 07 18:42:03 CST 2025
ReportID: ID20250107184203
Name: David Corbett
Report Type: Error Report
Opt Subject: L2/25-021
L2/25-021 “Proposal to encode the Devanagari Vowel Sign AAO in Unicode” says the proposed vowel sign cannot be represented by <U+093E, U+094B> because that sequence gets dotted circles in existing standard fonts. However, dotted circles are not inserted by the fonts: they are inserted by certain shaping engines, but it works fine in HarfBuzz. An alternative to L2/25-021 would be to fix the shaping engine.
Date/Time: Wed Oct 30 07:39:32 CDT 2024
ReportID: ID20241030073932
Name: Andrew West
Report Type: Error Report
Opt Subject: UTS #10 Unicode Collation Algorithm
UTS #10 Unicode Collation Algorithm defines implicit weights for Tangut ideographs and Tangut components (see Table 16 Computing Implicit Weights) with the following formulas:
    AAAA = 0xFB00
    BBBB = (CP - 0x17000) | 0x8000
This worked OK when there were only a Tangut block and a Tangut Components block, but after the addition of the Tangut Supplement block in Unicode 13.0, the above formulas result in Tangut ideographs in the Tangut Supplement block sorting after all the Tangut components, rather than sorting immediately after the Tangut ideographs in the Tangut block, as would be expected by users. The situation will be even worse after the addition of the Tangut Components Supplement block in a future version of Unicode, when characters in the four Tangut blocks will be sorted in the following order:
    Tangut (17000..187FF)
    Tangut Components (18800..18AFF)
    Tangut Supplement (18D00..18D7F)
    Tangut Components Supplement (18D80..18DFF)
The expected default sort order of Tangut ideographs and Tangut components should be:
    Tangut (17000..187FF)
    Tangut Supplement (18D00..18D7F)
    Tangut Components (18800..18AFF)
    Tangut Components Supplement (18D80..18DFF)
This could be achieved by separately calculating the implicit weights for Tangut ideographs and Tangut components, as below:
Assigned code points in Block=Tangut OR Tangut_Supplement:
    AAAA = 0xFB00
    BBBB = (CP - 0x17000) | 0x8000
Assigned code points in Block=Tangut_Components OR Tangut_Components_Supplement:
    AAAA = 0xFB01
    BBBB = (CP - 0x18800) | 0x8000
Assigned code points in Block=Nushu:
    AAAA = 0xFB02
    BBBB = (CP - 0x1B170) | 0x8000
Assigned code points in Block=Khitan_Small_Script:
    AAAA = 0xFB03
    BBBB = (CP - 0x18B00) | 0x8000
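Editorial illustration (not part of the submitted report): the ordering difference described above can be checked with a minimal Python sketch. It assumes the block ranges quoted in the report, treats every code point in a block as assigned (a simplification), and assumes the existing formula would also apply to the future Tangut Components Supplement block, as the report expects. The function names `current_weights` and `proposed_weights` are hypothetical, and only the Tangut rows of the proposal are covered.

```python
# Block ranges as quoted in the report. Tangut Components Supplement is a
# future block, so its range here is taken from the report, not a released UCD.
TANGUT_IDEOGRAPH_BLOCKS = [(0x17000, 0x187FF), (0x18D00, 0x18D7F)]
TANGUT_COMPONENT_BLOCKS = [(0x18800, 0x18AFF), (0x18D80, 0x18DFF)]


def _in_any(cp, ranges):
    """Return True if cp falls inside any (low, high) inclusive range."""
    return any(low <= cp <= high for low, high in ranges)


def current_weights(cp):
    """(AAAA, BBBB) using the single formula the report quotes from UTS #10."""
    if _in_any(cp, TANGUT_IDEOGRAPH_BLOCKS + TANGUT_COMPONENT_BLOCKS):
        return (0xFB00, (cp - 0x17000) | 0x8000)
    raise ValueError(f"U+{cp:04X} is outside the Tangut blocks in this sketch")


def proposed_weights(cp):
    """(AAAA, BBBB) using the report's separate ideograph/component formulas."""
    if _in_any(cp, TANGUT_IDEOGRAPH_BLOCKS):
        return (0xFB00, (cp - 0x17000) | 0x8000)
    if _in_any(cp, TANGUT_COMPONENT_BLOCKS):
        return (0xFB01, (cp - 0x18800) | 0x8000)
    raise ValueError(f"U+{cp:04X} is outside the Tangut blocks in this sketch")


if __name__ == "__main__":
    # First code point of each of the four blocks.
    samples = [0x17000, 0x18800, 0x18D00, 0x18D80]
    for label, key in (("current", current_weights), ("proposed", proposed_weights)):
        ordered = sorted(samples, key=key)
        print(label, [f"{cp:04X}" for cp in ordered])
```

Sorting by the `(AAAA, BBBB)` tuple stands in for comparing primary weights; a real collation implementation would build full collation elements rather than tuples.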
Date/Time: Tue Dec 10 01:25:40 CST 2024
ReportID: ID20241210012540
Name: Simon Patrick
Report Type: Error Report
Opt Subject: /Public/draft/UCD/ucd/Blocks.txt
I know that this file is a very early draft for version 17.0 (file is dated 15 November 2024) but you might like to note that (a) I think the new Sidetic block should end at 1095F rather than 1095C and (b) the new Beria Erfe block (16EA0..16EDF) is not in its correct place in code point order: it should come between Medefaidrin (16E40..16E9F) and Miao (16F00..16F9F).
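Editorial illustration (not part of the submitted report): both kinds of problem noted above can be detected mechanically. The sketch below parses lines in the `start..end; name` format used by Blocks.txt, then flags blocks that are out of code point order and blocks whose boundaries do not fall on a multiple of 16 (start) or one less than a multiple of 16 (end). The excerpt is hypothetical: the position of Beria Erfe after Miao and the Sidetic start of 10940 are assumptions made for the example, not values taken from the draft file. The function names are likewise hypothetical.

```python
# Hypothetical excerpt in Blocks.txt format; see the assumptions stated above.
SAMPLE = """\
# Hypothetical draft excerpt
10940..1095C; Sidetic
16E40..16E9F; Medefaidrin
16F00..16F9F; Miao
16EA0..16EDF; Beria Erfe
"""


def parse_blocks(text):
    """Parse 'XXXX..YYYY; Name' lines into (start, end, name) tuples."""
    blocks = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        code_range, name = line.split(";", 1)
        start, end = code_range.strip().split("..")
        blocks.append((int(start, 16), int(end, 16), name.strip()))
    return blocks


def misplaced_blocks(blocks):
    """Names of blocks that start before the block listed just above them."""
    return [
        name
        for previous, (start, _end, name) in zip(blocks, blocks[1:])
        if start < previous[0]
    ]


def bad_boundaries(blocks):
    """Names of blocks not starting at xxx0 or not ending at xxxF."""
    return [
        name
        for start, end, name in blocks
        if start & 0xF != 0x0 or end & 0xF != 0xF
    ]


if __name__ == "__main__":
    parsed = parse_blocks(SAMPLE)
    print("out of order:", misplaced_blocks(parsed))
    print("bad boundaries:", bad_boundaries(parsed))
```

The order check compares each block only with its immediate predecessor, which is enough to catch a single misplaced entry like the one reported; a stricter check could compare the list against its sorted form.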
Date/Time: Sun Dec 22 08:33:16 CST 2024
ReportID: ID20241222083316
Name: Charlotte Buff
Report Type: Public Review Issue
Opt Subject: Feedback on an emoji candidate (L2/24-266r)
Regarding the proposed “Ballet Dancer” sequence: Presumably this is meant to be the gender-neutral counterpart to U+1F483 💃 DANCER and U+1F57A 🕺 MAN DANCING that has been missing for years, but I strongly feel that it is not suitable for that purpose. Firstly, gender variants in Unicode Emoji do not work like that. For two emoji to be gender variants of each other, they actually have to represent the same concept with the only difference between them being the appearance of the human person (and maybe also the colour of the clothes to make the contrast more noticeable at small sizes). 💃 and 🕺 are a long-standing exception to this rule because they predate most other gendered emoji and the idea wasn’t yet properly developed at the time, but adding a third incongruent variant into the mix is only going to make this problem worse. For all intents and purposes, these three emoji will not be seen as variations on a shared base concept like all other gender variants, but as three entirely disconnected emoji – each representing a different style of dance – that are inexplicably available as only a single gender each. This will make end users wonder why disco is only for men, flamenco is only for women, and neither men nor women can do ballet in the world of emoji. And we know for a fact that differently gendered ballet emoji are desired by users because that is precisely what L2/18-133, the original ballet dancer proposal that was resurrected for this endeavour, had suggested. I understand all too well that the UTC is reluctant to approve additional gendered emoji, but pretending that ballet is somehow the nonbinary counterpart to flamenco and disco dancing cannot be the solution. At this point it would honestly be best to fully decouple DANCER and MAN DANCING from each other, explicitly define them to be flamenco and disco dancers respectively instead of generic “people dancing”, and then give both of them new gender variants to complete the set. 
I believe the experiment has failed; these two emoji look too different from each other to form a pair and users are generally unreceptive to significant emoji glyph changes, so bringing their designs closer together is probably not a viable option. Either that or add a third gender-neutral dancer that is just as generic as the other two. Its outfit could be almost anything as long as its stance is recognisable as dancing, which would allow it to cover a lot of different styles instead of being confined to just ballet. Besides, didn’t Unicode 14.0 set the precedent that gender variants will no longer be handled as ZWJ sequences but rather as atomic characters? U+1FAC5 🫅 PERSON WITH CROWN could easily have been a combination of 🧑 and 👑, but was assigned its own code point instead. Putting gender aside, I also question the necessity of a “Person Dancing Ballet” emoji in general. U+1FA70 🩰 BALLET SHOES already exists; what does a pictograph showing a person wearing these shoes bring to the table that isn’t covered by the shoes on their own? I thought proposed emoji were supposed to “break new ground” and be distinctive. The fact that only U+1FA70 and not the human-form ballet sequences made it onto the RGI list in the first place shows me that the UTC also didn’t consider them worthwhile. A similar fate befell human-form ZWJ sequences incorporating U+1F933 🤳 SELFIE (L2/16-333) and U+1FA9E 🪞 MIRROR (L2/19-099), which functionally would have been duplicates because they would have represented nothing original compared to just U+1F933 or U+1FA9E in isolation. What makes the ballet dancer different in this regard?
Date/Time: Thu Nov 07 15:20:49 CST 2024
ReportID: ID20241107152049
Name: Jim DeLaHunt
Report Type: Website Problem
Opt Subject: Unicode standard list of components
Section I, "List of Components", of the Unicode 16.0 page <https://www.unicode.org/versions/Unicode16.0.0/#Components> has an entry "Core Specification" which links only to the PDF version of Unicode 16.0. It lacks a link to the HTML version of the Core Specification. It is ironic that the component missing from the list of components is the one which is authoritative.
(None at this time.)