Author: Ken Whistler
Date: July 22, 2018
The UTC delegated to me the task of responding to the voluminous feedback of Marcel Schneider on PRI #372. To make this task tractable, the following excerpt of the PRI #372 feedback quotes all of Marcel Schneider's feedback. I have interspersed commentary, on an item-by-item basis, suggesting dispositions. Some of the dispositions note changes that were in fact already made editorially in the names list (and charts) for Unicode 11.0. Note that this feedback was too voluminous and scattered for the UTC itself to attempt to digest and respond to during the limited UTC plenary meeting time devoted to review of PRI #372 feedback, and further changes for dispositions had to wait until after the completion of the Unicode 11.0 release.
Date/Time: Thu Jan 25 07:50:03 CST 2018
Name: Marcel Schneider
Report Type: Error Report
Opt Subject: Misleading/wrong/missing specifications
U+2012 FIGURE DASH U+2012 FIGURE DASH should be specified as centered on lining digits, since it represents the minus sign in old-style typeset tables. This missing specification lead the designers of all fonts Iʼve checked, to make U+2012 a duplicate of U+2013, making it de facto useless. (Please compare with my previous feedback about U+2012.) TUS is wrong when stating that U+2012 has mixed semantics of U+002D, since it is NOT primarily a hyphen, NOR an en-dash, but a minus sign, and should be designed as such, i.e. centered on lining digits in fonts with lining (uppercase) digits, and centered on lowercase letters ONLY in fonts with lowercase (Elzeviran) digits. Consequently, fonts providing both lining and lowercase digits MUST provide two according glyphs for FIGURE DASH, and toggle between the two depending on that flag. All that should be specified in the Standard, and should have been so from the beginning on, as a guideline for inadvertant font designers.
Disposition: Mr. Schneider is invited to submit a proposal to the UTC to clarify the semantics and rendering of U+2012 FIGURE DASH. These suggestions are not something which can just be dealt with editorially.
U+279D AND U+2B62 TRIANGLE-HEADED RIGHTWARDS ARROWS U+279D TRIANGLE-HEADED RIGHTWARDS ARROW and U+2B62 RIGHTWARDS TRIANGLE- HEADED ARROW have been made confusable due to misnaming of the former. It is good practice to start arrow names with the direction. Thus the set in the Miscellaneous Symbols and Arrows block has well-formed names, while the Dingbat arrows names are biased because in that range, almost all arrows are rightwards. To fix that name confusion, adding the cross-references is not enough. An informative alias should be added to U+279D, calling it THIN TRIANGLE-HEADED RIGHTWARDS ARROW, as opposed to the next: U+279E HEAVY TRIANGLE-HEADED RIGHTWARDS ARROW, and according to the chart glyphs. And an annotation to U+2B62 (“confusable with 279D”) would also be helpful.
Disposition: Delegated to the editors.
U+279C HEAVY ROUND-TIPPED RIGHTWARDS ARROW The chart glyph of U+279C does not reflect the character identity, as it is not round-tipped, only round-barbed, like in U+27BA TEARDROP-BARBED RIGHTWARDS ARROW actually all strokes (barbs and stem) are ending in teardrops. However current fonts show this arrow actually round-tipped, giving it the intended (and consistent) design. Hence the chart glyph of U+279C could use an update. And not only that. It is really wrong, being not round-tipped. It should never have made it into the Code Charts.
Disposition: Moot. Character and glyph are from the original Dingbat collections.
Date/Time: Sun Apr 15 16:13:57 CDT 2018
Name: Marcel Schneider
Report Type: Public Review Issue
Opt Subject: PRI #372 (consolidated feedback)
------------------------------------------------------------------------------------------------- U+0588 ֈ ARMENIAN SMALL LETTER YI WITH STROKE This character has a name that from an extrinsic point of view compromises character identity, so far as any distinction is made between a stroke and a bar. However, intrinsically, calling it “with stroke” is justified by (already earlier misnamed) U+0249 LATIN SMALL LETTER J WITH STROKE. Actually both of these letters are *with bar*, as can be induced from other pairs such as L and U with either stroke or bar, graphically well distinguished. Annotations should be added to all misnamed letters on a stroke‐bar confusion basis, to help translators fix these flaws on localized versionsʼ level. Missing informative aliases are responsible for terminological flaw spreading and contaminating other locales, such as French. Unicode has a powerful means to enforce correct understanding of character identity, thanks to house policies protecting characters against character identity corruption. Note: Appropriate name change had already been requested at PRI #352: Feedback on draft additional repertoire for Amendment 1.3 (PDAM) to ISO/IEC 10646:2017. See I.12 in L2/17-288: https://www.unicode.org/L2/L2017/17288-pri-comments.pdf
Disposition: Names with "BAR" and "STROKE" are arbitrary and long-embedded in the standard. It is not appropriate to treat the names list as requiring extensive annotation notes simply for the purpose of assisting translation of character names into French or other languages.
------------------------------------------------------------------------------------------------- U+05EF ׯ HEBREW YOD TRIANGLE See my PDAM ballot stage feedback, recommending also to change the header to “Logograph”. This sign is a writing convention replacing the Holy Name. See in the original Proposal. Deleting this sign is thus equivalent with deleting the Name, and the intended security fails. Therefore, an annotation should be added to prevent people from *intentionally* deleting this sign. Note: Appropriate name change had already been requested at PRI #352: Feedback on draft additional repertoire for Amendment 1.3 (PDAM) to ISO/IEC 10646:2017. See I.13 in L2/17-288.
Disposition: Such a note is not necessary, nor within the reasonable scope of the names list annotations.
------------------------------------------------------------------------------------------------- U+2BFE ⯾ REVERSED RIGHT ANGLE = without → 221F ∟ right angle This *new* pair is missing in BidiMirroring-11.0.0d3.txt, while other angle symbols are present. Alongside encoding new characters, bidi‐mirroring pairs should be added to the repertoire of the bidi‐mirroring‐glyph=yes repertoire as they get matching, provided that they conform to the requirement of ensuring readability in the absence of RTL glyph handling. Cf. feedback iteration in my previous post (off‐PRI). I note that this encoding fulfills item 1 of Table 14 in Remedial 19 in: http://www.unicode.org/L2/L2017/17438-bidi-math-fdbk.html Note that items 2 through 8 seem to be still missing in Unicode. Please refer to Table 14.
Disposition: U+2BFE/U+221F were added to BidiMirroring.txt for 11.0. The UTC has declined to implement other suggestions made in L2/17-438.
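For readers who want to check pairings of this kind themselves, the following is a minimal sketch, assuming the usual `code; mirror # comment` line layout of BidiMirroring.txt. The function names and the three sample lines are illustrative only and are not an extract of the published file; the third sample line deliberately lacks its reverse entry so that the check has something to report.

```python
def parse_bidi_mirroring(text):
    """Parse 'XXXX; YYYY # comment' lines into a {code point: mirror} dict."""
    pairs = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comment
        if not line:
            continue  # skip blank and comment-only lines
        source, target = (int(field.strip(), 16) for field in line.split(";"))
        pairs[source] = target
    return pairs


def asymmetric_entries(pairs):
    """Return code points whose mirror does not map back to them."""
    return sorted(cp for cp, mirror in pairs.items() if pairs.get(mirror) != cp)


# Illustrative sample only (hypothetical excerpt, not the real data file).
SAMPLE = """\
221F; 2BFE # RIGHT ANGLE
2BFE; 221F # REVERSED RIGHT ANGLE
223C; 223D # TILDE OPERATOR
"""

pairs = parse_bidi_mirroring(SAMPLE)
for cp in asymmetric_entries(pairs):
    print(f"{cp:04X} has no reverse mapping in this sample")
```

Run against a full copy of the data file, a check like this would only flag entries whose reverse mapping is absent; it says nothing about whether a pair ought to be listed in the first place.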
------------------------------------------------------------------------------------------------- U+A8FE ꣾ DEVANAGARI LETTER AY U+A8FF ◌ꣿ DEVANAGARI VOWEL SIGN AY These two characters are under the subhead “Additional vowel and vowel sign.” This conforms to a practice followed in the code charts, where independent vowels have names on LETTER and are headed with “Independent vowels,” while combining vowels have names on VOWEL SIGN and are headed with “Dependent vowel signs.” This is multiply inconsistent and uselessly complicated: ① Combining characters are usually referred to as “marks” in the Unicode standard. Using “signs” when referring to combining vowels in Brahmic scripts is a misleading inconsistency, as this would mean they are really what everywhere else in the Standard is called a “sign,” i.e. a symbol (e.g. the dollar sign; see the rationale of naming the copyleft symbol). ② Calling letter vowels “independent vowels” induces calling combining vowels “dependent vowels.” The other way around, calling combining vowels “dependent vowel signs” implicates that independent vowels are called either “independent vowel letters” or “independent vowel signs.” Ultimately, the new Devanagari Extended subheading “Additional vowel and vowel sign” results from a clash of colliding inconsistencies. Subheadings like in the Kharoshti block, e.g. “Vowels” before U+10A00 — a range containing both independent and dependent vowels, like the discussed Devanagari range — are proving that correct subheadings are already implemented in the Standard. Even in the main Devanagari block, the subheading before U+0960 is reading “Additional vowels for Sanskrit,” not “Additional vowels and vowel signs for Sanskrit,” although the range actually encompasses both independent (vocalic rr and ll) and dependent vowels (vocalic l and ll). Consequently, it is recommended that Devanagari Extended subheadings follow the same scheme. 
Solution: Change the subheading before U+A8FE from “Additional vowel and vowel sign” to “Additional vowels”. Harmonize relevant subheadings in all blocks containing Brahmic scripts. E.g. change “Dependent vowel signs” to “Dependent vowels” (e.g. before U+093A).
Disposition: The editors reviewed and declined this suggestion. Current usage for these subheads is consistent and clear already.
Please complete with next item. ------------------------------------------------------------------------------------------------- U+11145 ◌ᅅ CHAKMA VOWEL SIGN AA U+11146 ◌ᅆ CHAKMA VOWEL SIGN EI These new combining vowels are grouped under the subheading “Dependent vowel signs.” Despite of “Dependent vowel sign[|s]” now occurring 55 times in the Code charts as a subheading, it represents the wrong option, as opposed to “Dependent vowels” — already present in the following blocks: Oriya (before U+0B62), Telugu (U+0C62), Kannada (U+0CE2), Malayalam (U+0D62), and Lepcha (U+1C26). The reason is that the concept of a “vowel sign” is proper to the writing system / encoding and calls for attributes like “combining” (whose opposite is “independent”), whereas a “vowel” is more polysemic, being mainly a linguistic entity — here, attributes like “dependent” and “independent” may apply — along with its use in writing and encoding, as a simple alternative to the more precise (and, depending on context, needlessly precise) “vowel sign.” Obviously the Unicode terminology is biased here by the superfluous presence of “SIGN” in the character names of most combining vowels in Brahmic scripts. Therefore in this context, when a subheading starts with “Dependent,” it should end in “vowels,” not in “vowel signs.” That brings the need to correct this and the other instances. Please consider this item along with the previous one.
Disposition: The editors reviewed and declined this suggestion.
------------------------------------------------------------------------------------------------- HANIFI ROHINGYA U+10D00..U+10D3F I wouldn’t comment [1] on the meritorious encoding of Hanifi Rohingya script, that is helping me in that it has vowel names without SIGN, simply VOWEL, like already in Tai Viet where seven of the vowels are combining marks (Gc=Mn), while eight are Gc=Lo although only five do precede consonants in visual and logical order. Hanifi Rohingya block is another template of streamlined vowel names. Iʼm sensitive because VOWEL SIGN in character names is untranslatable to French. Historically it ended up as “DIACRITIQUE VOYELLE”, which I proposed to rather replace with “VOYELLE COMBINANTE.” That however didnʼt gain traction, though admittedly the issue raised concerns. This is one more reason to be amazed to see Hanifi Rohingya having vowels without SIGN in their name. [1] Otherwise there would be to mention that a diacritic indicating a tone is called a “tone mark” throughout the standard (12 ranges) and typically has a name including the word TONE. That in turn translates well (MARQUE TONALE), so this asperity is palliatable in Code Charts translations.
Disposition: Noted. Please be aware that the Unicode Standard standardizes the characters and their names; it does not standardize the editorial subheads used in the names list, nor does it standardize the translation of subheads used in the names list. The UTC is getting rather tired of lectures about translation of the names list, and is unlikely to be receptive to this kind of feedback in the future.
------------------------------------------------------------------------------------------------- SOGDIAN U+10F30..U+10F6F The diacritics in range U+10F46..10F50 are categorized as “combining signs” in the Proposal (§3.3) and as “Modifier signs” in the relevant delta code chart actually under beta review: http://www.unicode.org/charts/PDF/Unicode-11.0/U110-10F30.pdf In the Unicode standard, the term “modifier” is used in conjunction with “letter” for independent characters. Combining characters in turn are known as “marks” as in “non‐spacing combining mark” and “spacing combining mark.” Hence Code chart readers would be most likely to expect the Sogdian diacritics under a subheading such as “Diacritics” (cf. those before U+07F2, U+0859, U+1CE2, U+302A, and of course U+0300) or “Combining marks” (U+135D, U+2CEF, U+A6F0, U+10AE5). To date, in the Code charts, “Modifier sign” is newly introduced by encoding the Sogdian block. Elsewhere on the internet, the term is used in programming, though fairly seldom.
Disposition: Noted. The actual published subhead for Sogdian is "Combining Marks".
-------------------------------------------------------------------------------------------------
U+110CD KAITHI NUMBER SIGN ABOVE
This new character follows a range of “Various signs” but leaving out some unassigned codepoints for a handy row shift that enhances legibility of the code chart. Hence the subheading has been repeated: “Sign.” There is however an issue with the design of the last ranges in this block. U+110C0 and U+110C1 are the danda and double danda for Kaithi, like in a number of other Brahmic scripts that donʼt use the Devanagari punctuations for the dandas. In every single block of these scripts having extra dandas, *** DANDA and *** DOUBLE DANDA are in a “Punctuation” range. Kaithi is the only block in the Standard where script‐specific dandas are merged with other signs under a generic subheading. In order to harmonize the presentation of the Kaithi block, avoiding an impression of neglectedness, I recommend to modify the subheadings as follows. Iʼd also add a cross‐reference to U+110BC as results from a mention in the encoding proposal, p. 34 (p. 39 of the PDF): https://www.unicode.org/L2/L2008/08194-n3389-kaithi.pdf
@ Various signs
110B9 KAITHI SIGN VIRAMA
110BA KAITHI SIGN NUKTA
110BB KAITHI ABBREVIATION SIGN
@ Number signs
110BC KAITHI ENUMERATION SIGN
    x (numero sign - 2116)
110BD KAITHI NUMBER SIGN
    * used to indicate a numerical reference
@ Punctuation
110BE KAITHI SECTION MARK
    * marks end of sentence
    x (khojki section mark - 1123B)
110BF KAITHI DOUBLE SECTION MARK
    * delimits larger chunks of text, such as paragraphs
    x (khojki double section mark - 1123C)
110C0 KAITHI DANDA
110C1 KAITHI DOUBLE DANDA
@ Number sign
110CD KAITHI NUMBER SIGN ABOVE
    * used to indicate a number in an itemized list
Disposition: The editors reviewed and declined this suggestion.
------------------------------------------------------------------------------------------------- U+11A9D SOYOMBO MARK PLUTA This character has a generic range heading, while U+11A98 SOYOMBO GEMINATION MARK has a specific one, that is already a hapax; and both instances are single character ranges. Suggestion: Change “Additional mark” to “Elongation mark.”
Disposition: The editors reviewed and implemented this suggestion.
------------------------------------------------------------------------------------------------- GUNJALA GONDI U+11D60..U+11DAF The first range heading must be “Independent vowels” like everywhere else in the Code Charts in such a script configuration, not just “Vowels” (which is also misleading). “Dependent vowel signs” should be changed to “Dependent vowels” (see other comment). U+11D98 GUNJALA GONDI OM range heading could be “Invocation sign” following U+11449 NEWA OM.
Disposition: The editors reviewed and declined these suggestions.
------------------------------------------------------------------------------------------------- U+11D97 GUNJALA GONDI VIRAMA The annotation “used for producing conjuncts” accords with that of U+11D45 MASARAM GONDI VIRAMA. However that of U+11133 CHAKMA VIRAMA, “used to form conjuncts,” looks simpler English. Suggestion: Replace both instances (MASARAM GONDI and GUNJALA GONDI) whith: * used to form conjuncts By contrast, should “used to form conjuncts” in this context be poor English (due to ambiguous semantics), I strongly recommend to equalize all instances on the “used for producing conjuncts” template. Please note BTW that in French there is an attempt to harmonize such iterations anyway (“sert à former des ligatures”), trying to hinder problems in the English version from impacting localized versions. Cf. draft preview 10.0.0: http://docucaras.info/#u11D45
Disposition: The editors reviewed and implemented this suggestion.
------------------------------------------------------------------------------------------------- MAYAN NUMERALS U+1D2E0..U+1D2FF Range headings usually donʼt include the script name. In this block, the (only) range heading is a replication of the block name. Recommendation: Change “Mayan numerals” (subheading) to “Numerals”.
Disposition: The editors reviewed and declined this suggestion.
-------------------------------------------------------------------------------------------------
Date/Time: Sun Apr 15 13:23:28 CDT 2018
Name: Marcel Schneider
Report Type: Error Report
Opt Subject: General feedback, pending, consolidated
Hello, Thank you for reminding beta will close soon. Several pieces are in project but cannot be worked out due to other urgencies. ================================================================================================== BIDI‑MIRRORNG PAIRS FEEDBACK ITERATION Second, the bidi mirroring pairs including tildes should be locked out from bidi-mirrored‐glyph=yes feature. That request is already documented in a feedback item that was submitted in time for UTC meeting #154 and posted to the registry *before* that meeting, listed in the meeting agenda, but *not* considered: http://www.unicode.org/L2/L2017/17438-bidi-math-fdbk.html Please note that this is revision 7 of January 18, 2018, superseding failed 18/026 (January 15). Quote from section 3.1: Whether tildes are mirrored or not, does matter in typography, but mostly not for readability. When writing direction changes, switching the < >-like operators is absolute priority, whatever environment the text is displayed in. Therefore, the missing best-fit pairs should be added either to BidiMirroring.txt, or to the new *BidiMirroringExtended.txt. However, when discussing the requirements for tilde rendering, there is a need to underscore the semantic difference in three pairs of symbols that exist with tilde and with reversed tilde. Two of these pairs are mirrored by glyph exchange, while the third pair like all other tilde symbols is mirrored by RTL glyphs only (Table 5). Again, that works fine in publishing, when all tildes are mirrored anyway. But as glyph-exchange bidi-mirroring is not designed just as a convenience to streamline high-end rendering algorithms, but as a last resort to facilitate a usable display in whatever environment, there is scarcely any point in mirroring just two pairs, because the effect would be to merge the reversed tildes among the unmirrored ones, while the normal tildes stand out as if they were reversed (Figure 4). 
REMEDIAL 11: In BidiMirroring.txt: Remove the pairs in Table 6 from the pair mapping list in order to equalize the mirroring behavior of all operators with tilde or reversed tilde.
Table 6. Mirror pairs of operators with tilde or reversed tilde, to be unpaired for consistency
1 223C ∼ TILDE OPERATOR and 223D ∽ REVERSED TILDE
2 2243 ≃ ASYMPTOTICALLY EQUAL TO and 22CD ⋍ REVERSED TILDE EQUALS
Disposition: The UTC already declined this suggestion, as made in L2/17-438, and this repetition of the request hasn't changed anything about that decision.
==================================================================================================
UNICODE 11.0 BETA REVIEW FEEDBACK ITEMS, consolidated, will follow in a separate post.
Best regards, Marcel
Date/Time: Tue Apr 17 04:50:57 CDT 2018
Name: Marcel Schneider
Report Type: Public Review Issue
Opt Subject: PRI 372 addendum as an update of lastly posted item
Update wrt BidiMirroring-11.0.0d3.txt: To the quoted Table 6. “Mirror pairs of operators with tilde or reversed tilde, to be unpaired for consistency,” the following pair should be added, as it has recently been given the BidiMirroredGlyph=yes property, against recommendation in L2/17-438: #3: 2245 ≅ APPROXIMATELY EQUAL TO and 224C ≌ ALL EQUAL TO When sending “BIDI‑MIRRORNG PAIRS FEEDBACK ITERATION” (Sun Apr 15 13:23:28 CDT 2018), I believed that this is clear from the context referring to “while the third pair like all other tilde symbols is mirrored by RTL glyphs only (Table 5).” Note: This information was available to the UTC when considering the *superseded* L2/18-026 — that was listed *after* the up‐to‐date L2/17-438 in meeting agenda #154 — in section 3.1 With tilde or question mark.
Disposition: The UTC already declined this suggestion, as made in L2/17-438, and this repetition of the request hasn't changed anything about that decision.
Date/Time: Tue Apr 17 05:58:35 CDT 2018
Name: Marcel Schneider
Report Type: Public Review Issue
Opt Subject: PRI 372 NamedSequencesProv.txt
The sequences listed in: http://unicode.org/mail-arch/unicode-ml/y2016-m02/0071.html should be added to NamedSequencesProv.txt as recommended in: http://unicode.org/mail-arch/unicode-ml/y2016-m02/0072.html following the process specified in: http://www.unicode.org/reports/tr34/ Section 3.1.
Date/Time: Tue Apr 24 02:12:15 CDT 2018
Name: Marcel Schneider
Report Type: Public Review Issue
Opt Subject: PRI 372 NamedSequencesProv.txt
The following list of named sequences is made up from the list provided by Mats Blakstad on the Public Mailing List: http://unicode.org/mail-arch/unicode-ml/y2016-m02/0071.html # Additions for languages in Togo. LATIN CAPITAL LETTER A WITH TILDE AND GRAVE ACCENT;00C3 0300 LATIN SMALL LETTER A WITH TILDE AND GRAVE ACCENT;00E3 0300 LATIN CAPITAL LETTER A WITH TILDE AND ACUTE ACCENT;00C3 0301 LATIN SMALL LETTER A WITH TILDE AND ACUTE ACCENT;00E3 0301 LATIN CAPITAL LETTER E WITH TILDE AND GRAVE ACCENT;1EBC 0300 LATIN SMALL LETTER E WITH TILDE AND GRAVE ACCENT;1EBD 0300 LATIN CAPITAL LETTER E WITH TILDE AND ACUTE ACCENT;1EBC 0301 LATIN SMALL LETTER E WITH TILDE AND ACUTE ACCENT;1EBD 0301 LATIN CAPITAL LETTER TURNED E WITH GRAVE ACCENT;018E 0300 LATIN SMALL LETTER TURNED E WITH GRAVE ACCENT;01DD 0300 LATIN CAPITAL LETTER TURNED E WITH ACUTE ACCENT;018E 0301 LATIN SMALL LETTER TURNED E WITH ACUTE ACCENT;01DD 0301 LATIN CAPITAL LETTER TURNED E WITH CIRCUMFLEX ACCENT;018E 0302 LATIN SMALL LETTER TURNED E WITH CIRCUMFLEX ACCENT;01DD 0302 LATIN CAPITAL LETTER TURNED E WITH TILDE;018E 0303 LATIN SMALL LETTER TURNED E WITH TILDE;01DD 0303 LATIN CAPITAL LETTER TURNED E WITH TILDE AND GRAVE ACCENT;018E 0303 0300 LATIN SMALL LETTER TURNED E WITH TILDE AND GRAVE ACCENT;01DD 0303 0300 LATIN CAPITAL LETTER TURNED E WITH TILDE AND ACUTE ACCENT;018E 0303 0301 LATIN SMALL LETTER TURNED E WITH TILDE AND ACUTE ACCENT;01DD 0303 0301 LATIN CAPITAL LETTER TURNED E WITH MACRON;018E 0304 LATIN SMALL LETTER TURNED E WITH MACRON;01DD 0304 LATIN CAPITAL LETTER TURNED E WITH CARON;018E 030C LATIN SMALL LETTER TURNED E WITH CARON;01DD 030C LATIN CAPITAL LETTER OPEN E WITH GRAVE ACCENT;0190 0300 LATIN SMALL LETTER OPEN E WITH GRAVE ACCENT;025B 0300 LATIN CAPITAL LETTER OPEN E WITH ACUTE ACCENT;0190 0301 LATIN SMALL LETTER OPEN E WITH ACUTE ACCENT;025B 0301 LATIN CAPITAL LETTER OPEN E WITH CIRCUMFLEX ACCENT;0190 0302 LATIN SMALL LETTER OPEN E WITH CIRCUMFLEX ACCENT;025B 0302 LATIN CAPITAL LETTER OPEN 
E WITH TILDE;0190 0303 LATIN SMALL LETTER OPEN E WITH TILDE;025B 0303 LATIN CAPITAL LETTER OPEN E WITH TILDE AND GRAVE ACCENT;0190 0303 0300 LATIN SMALL LETTER OPEN E WITH TILDE AND GRAVE ACCENT;025B 0303 0300 LATIN CAPITAL LETTER OPEN E WITH TILDE AND ACUTE ACCENT;0190 0303 0301 LATIN SMALL LETTER OPEN E WITH TILDE AND ACUTE ACCENT;025B 0303 0301 LATIN CAPITAL LETTER OPEN E WITH MACRON;0190 0304 LATIN SMALL LETTER OPEN E WITH MACRON;025B 0304 LATIN CAPITAL LETTER OPEN E WITH CARON;0190 030C LATIN SMALL LETTER OPEN E WITH CARON;025B 030C LATIN CAPITAL LETTER I WITH TILDE AND GRAVE ACCENT;0128 0300 LATIN SMALL LETTER I WITH TILDE AND GRAVE ACCENT;0129 0300 LATIN CAPITAL LETTER I WITH TILDE AND ACUTE ACCENT;0128 0301 LATIN SMALL LETTER I WITH TILDE AND ACUTE ACCENT;0129 0301 LATIN CAPITAL LETTER IOTA WITH GRAVE ACCENT;0196 0300 LATIN SMALL LETTER IOTA WITH GRAVE ACCENT;0269 0300 LATIN CAPITAL LETTER IOTA WITH ACUTE ACCENT;0196 0301 LATIN SMALL LETTER IOTA WITH ACUTE ACCENT;0269 0301 LATIN CAPITAL LETTER IOTA WITH CIRCUMFLEX ACCENT;0196 0302 LATIN SMALL LETTER IOTA WITH CIRCUMFLEX ACCENT;0269 0302 LATIN CAPITAL LETTER IOTA WITH MACRON;0196 0304 LATIN SMALL LETTER IOTA WITH MACRON;0269 0304 LATIN CAPITAL LETTER IOTA WITH CARON;0196 030C LATIN SMALL LETTER IOTA WITH CARON;0269 030C LATIN CAPITAL LETTER M WITH GRAVE ACCENT;004D 0300 LATIN SMALL LETTER M WITH GRAVE ACCENT;006D 0300 LATIN CAPITAL LETTER ENG WITH GRAVE ACCENT;014A 0300 LATIN SMALL LETTER ENG WITH GRAVE ACCENT;014B 0300 LATIN CAPITAL LETTER ENG WITH ACUTE ACCENT;014A 0301 LATIN SMALL LETTER ENG WITH ACUTE ACCENT;014B 0301 LATIN CAPITAL LETTER O WITH TILDE AND GRAVE ACCENT;00F5 0300 LATIN SMALL LETTER O WITH TILDE AND GRAVE ACCENT;00F5 0300 LATIN CAPITAL LETTER OPEN O WITH GRAVE ACCENT;0186 0300 LATIN SMALL LETTER OPEN O WITH GRAVE ACCENT;0254 0300 LATIN CAPITAL LETTER OPEN O WITH ACUTE ACCENT;0186 0301 LATIN SMALL LETTER OPEN O WITH ACUTE ACCENT;0254 0301 LATIN CAPITAL LETTER OPEN O WITH CIRCUMFLEX 
ACCENT;0186 0302 LATIN SMALL LETTER OPEN O WITH CIRCUMFLEX ACCENT;0254 0302 LATIN CAPITAL LETTER OPEN O WITH TILDE;0186 0303 LATIN SMALL LETTER OPEN O WITH TILDE;0254 0303 LATIN CAPITAL LETTER OPEN O WITH TILDE AND GRAVE ACCENT;0186 0303 0300 LATIN SMALL LETTER OPEN O WITH TILDE AND GRAVE ACCENT;0254 0303 0300 LATIN CAPITAL LETTER OPEN O WITH TILDE AND ACUTE ACCENT;0186 0303 0301 LATIN SMALL LETTER OPEN O WITH TILDE AND ACUTE ACCENT;0254 0303 0301 LATIN CAPITAL LETTER OPEN O WITH MACRON;0186 0304 LATIN SMALL LETTER OPEN O WITH MACRON;0254 0304 LATIN CAPITAL LETTER OPEN O WITH CARON;0186 030C LATIN SMALL LETTER OPEN O WITH CARON;0254 030C LATIN CAPITAL LETTER U WITH TILDE AND GRAVE ACCENT;0168 0300 LATIN SMALL LETTER U WITH TILDE AND GRAVE ACCENT;0169 0300 LATIN CAPITAL LETTER V WITH HOOK WITH GRAVE ACCENT;01B2 0300 LATIN SMALL LETTER V WITH HOOK WITH GRAVE ACCENT;028B 0300 LATIN CAPITAL LETTER V WITH HOOK WITH ACUTE ACCENT;01B2 0301 LATIN SMALL LETTER V WITH HOOK WITH ACUTE ACCENT;028B 0301 LATIN CAPITAL LETTER V WITH HOOK WITH CIRCUMFLEX ACCENT;01B2 0302 LATIN SMALL LETTER V WITH HOOK WITH CIRCUMFLEX ACCENT;028B 0302 LATIN CAPITAL LETTER V WITH HOOK WITH MACRON;01B2 0304 LATIN SMALL LETTER V WITH HOOK WITH MACRON;028B 0304 LATIN CAPITAL LETTER V WITH HOOK WITH CARON;01B2 030C LATIN SMALL LETTER V WITH HOOK WITH CARON;028B 030C LATIN CAPITAL LETTER UPSILONK WITH GRAVE ACCENT;01B1 0300 LATIN SMALL LETTER UPSILON WITH GRAVE ACCENT;028A 0300 LATIN CAPITAL LETTER UPSILON WITH ACUTE ACCENT;01B1 0301 LATIN SMALL LETTER UPSILON WITH ACUTE ACCENT;028A 0301 LATIN CAPITAL LETTER UPSILON WITH CIRCUMFLEX ACCENT;01B1 0302 LATIN SMALL LETTER UPSILON WITH CIRCUMFLEX ACCENT;028A 0302 LATIN CAPITAL LETTER UPSILON WITH MACRON;01B1 0304 LATIN SMALL LETTER UPSILON WITH MACRON;028A 0304 LATIN CAPITAL LETTER UPSILON WITH CARON;01B1 030C LATIN SMALL LETTER UPSILON WITH CARON;028A 030C
Disposition: The UTC does not approve long lists of named sequences without a strong reason to do so. Declined.
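A list of this length is hard to proofread by eye. As a minimal sketch, assuming the `NAME;XXXX XXXX` line format used in the quoted submission (with `#` comment lines), a few lines of Python can flag entries that share one code point sequence. The function names are hypothetical, and the sample reuses three lines from the quoted list, two of which carry the same sequence 00F5 0300.

```python
from collections import defaultdict


def parse_named_sequences(text):
    """Parse 'NAME;XXXX XXXX' lines, skipping blank lines and '#' comments."""
    entries = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        name, _, seq = line.partition(";")
        code_points = tuple(int(cp, 16) for cp in seq.split())
        entries.append((name.strip(), code_points))
    return entries


def find_duplicate_sequences(entries):
    """Map each code point sequence used by more than one name to its names."""
    by_sequence = defaultdict(list)
    for name, code_points in entries:
        by_sequence[code_points].append(name)
    return {seq: names for seq, names in by_sequence.items() if len(names) > 1}


# Three lines taken from the quoted list above.
SAMPLE = """\
# Additions for languages in Togo.
LATIN CAPITAL LETTER O WITH TILDE AND GRAVE ACCENT;00F5 0300
LATIN SMALL LETTER O WITH TILDE AND GRAVE ACCENT;00F5 0300
LATIN CAPITAL LETTER OPEN O WITH GRAVE ACCENT;0186 0300
"""

duplicates = find_duplicate_sequences(parse_named_sequences(SAMPLE))
for seq, names in duplicates.items():
    print(" ".join(f"{cp:04X}" for cp in seq), "is claimed by:", "; ".join(names))
```

A check like this only catches repeated sequences; it would not catch a misspelled name, nor does it bear on whether a sequence merits a name at all.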
Date/Time: Wed Apr 25 14:06:31 CDT 2018
Name: Marcel Schneider
Report Type: Public Review Issue
Opt Subject: PRI #372 (consolidated feedback)
Hello,
There is another *consolidated* list of feedback for the Code Charts, that is not specifically related to the 11.0.0 additions. Some are *revised* versions of items hastily submitted on April 30, 2017. Sadly this is only a (huge) subset of my queue, given that I’m late submitting this feedback
Thanks, Marcel
------------------------------------------------------------------
C1 controls
The glyphs of some C1 controls show acronyms of other aliases than those given in the Nameslist section of the Code charts:
008B <control> = PARTIAL LINE FORWARD has PLD
008C <control> = PARTIAL LINE BACKWARD has PLU
008D <control> = REVERSE LINE FEED has RI
Suggestion: Add another informative alias to each one:
008B = PARTIAL LINE DOWNWARD
008C = PARTIAL LINE UPWARD
008D = REVERSE INTERLIGN
Disposition: Reviewed and declined by the editors.
------------------------------------------------------------------
U+034F COMBINING GRAPHEME JOINER
Some changes might make this instance better understandable to the Code Charts reader:
@ Format control # replaces "Grapheme joiner"
034F COMBINING GRAPHEME JOINER
    = combining mark locker # added informative alias
    * commonly abbreviated as CGJ
    * may be considered a “joiner” only in that it prevents combining marks from reordering # comment line raised and reworded
    * has no visible glyph
Remove:
    * the name of this character is misleading; it does not actually join graphemes
Disposition: Reviewed and declined by the editors.
------------------------------------------------------------------
U+202F NARROW NO-BREAK SPACE
This space is set apart (due to late encoding), so some usage annotations may seem desirable:
202F NARROW NO-BREAK SPACE
    * commonly abbreviated NNBSP
    * a narrow form of a no-break space, typically the width of a thin space or a mid space
    * Mongolian, Phags-Pa # added comment line
    * French: used to space punctuations # added comment line
Disposition: Reviewed and declined by the editors.
------------------------------------------------------------------
U+2300 DIAMETER SIGN
People often use LATIN SMALL LETTER O WITH STROKE as a fallback, by lack of DIAMETER SIGN on keyboards. So it might be a good idea to crossreference both, directing users to correct usage:
    x (latin small letter o with stroke - 00F8) # added crossreference
    x (empty set - 2205)
00F8 LATIN SMALL LETTER O WITH STROKE
    = o slash
    * Danish, Norwegian, Faroese, IPA
    x (diameter sign - 2300) # added crossreference
Disposition: These crossreferences are present in the 11.0 charts.
------------------------------------------------------------------ U+2327 X IN A RECTANGLE BOX People often misuse this because it comes first when browsing charmaps. So I’ve added a crossreference to the BALLOT BOX WITH X: = clear key x (ballot box with x - 2612) # added xref
Disposition: Added.
------------------------------------------------------------------ U+232C BENZENE RING Even though in the same block, the BENZENE RING WITH and without CIRCLE would be nice with a crossreference to each other: x (benzene ring with circle - 23E3) # added 23E3 BENZENE RING WITH CIRCLE x (benzene ring - 232C) # added
Disposition: Reviewed and declined by the editors.
------------------------------------------------------------------
U+260A ASCENDING NODE
There seems to be an equivalence between ascending node, libra, and sublimation. Accordingly, the informative alias:
    = alchemical symbol for sublimate
can be corrected to:
    = alchemical symbol for sublimation
and the relevant crossreference added:
    x (alchemical symbol for sublimation - 1F75E)
like already in:
260B DESCENDING NODE
    = alchemical symbol for purify
    x (alchemical symbol for purify - 1F763)
See also the already present crossreferences in:
1F75E ALCHEMICAL SYMBOL FOR SUBLIMATION
    x (ascending node - 260A)
    x (libra - 264E)
Disposition: Alias for 260A corrected.
------------------------------------------------------------------ Medical and healing symbols (2624..2625) These are also religious symbols, as both were attributes of deities, and the ankh keeps being used in religion. Therefore I’d suggest to merge this subheading with the subsequent one, and to add some informative aliases: @ Religious, political and medical symbols # modified subheading 2624 CADUCEUS = commercial # added * symbol of commerce and eloquence # added * symbolizes medicine in Northern America # added x (staff of aesculapius - 2695) x (alchemical symbol for caduceus - 1F750) 2625 ANKH = ansate cross # added = coptic cross # added * egyptian hieroglyph for “life” # added x (egyptian hieroglyph s034 - 132F9) # added # removed subheading 2626 ORTHODOX CROSS 2627 CHI RHO = Constantine's cross, Christogram x (coptic symbol khi ro - 2CE9) 2628 CROSS OF LORRAINE = patriarchal cross # added x (double dagger - 2021) # added 2629 CROSS OF JERUSALEM = simple cross potent * contrasts with the actual cross of Jerusalem, which adds a small crosslet at each corner x (alchemical symbol for vinegar - 1F70A) 262A STAR AND CRESCENT 262B FARSI SYMBOL = symbol of iran (1.0) 262C ADI SHAKTI = Gurmukhi khanda 262D HAMMER AND SICKLE 262E PEACE SYMBOL 262F YIN YANG x (tibetan symbol nor bu nyis -khyil - 0FCA)
Disposition: Reviewed and declined by the editors.
------------------------------------------------------------------ Emoticons (1F600..1F64F) Following Code Charts usage, a reference is added to the previously encoded emoji in the Miscellaneous symbols block. Cf. the relevant subheading: @ Emoticons @+ Many other emoticons are encoded in the Emoticons block starting at 1F600. Suggestion: @@ 1F600 Emoticons 1F64F @+ The emoticons have been organized by mouth shape to make it easier to locate the different characters in the code chart. @+ Some other emoticons are encoded in the Miscellaneous symbols block starting at 2600. # = added annotation; references to blocks may also use complete block ranges: (2600..26FF). x (white frowning face - 2639) # added x (white smiling face - 263A) # added x (black smiling face - 263B) # added
Disposition: Reviewed and declined by the editors.
------------------------------------------------------------------ U+267E PERMANENT PAPER SIGN U+267F WHEELCHAIR SYMBOL I’d suggest adding aliases and an xref, e.g.: 267E PERMANENT PAPER SIGN = non‐acid paper # added x (infinity - 221E) # added 267F WHEELCHAIR SYMBOL = accessible place # added
Disposition: Aliases added.
------------------------------------------------------------------ U+AA40 CHAM LETTER FINAL K This and the following letters are under a @ Final letters subheading, that is unusual in the Code Charts and should therefore be replaced with the more common and more precise: @ Final consonants
Disposition: Subhead updated.
------------------------------------------------------------------ U+00DF LATIN SMALL LETTER SHARP S U+1E9E LATIN CAPITAL LETTER SHARP S The issues regarding these letters have already been reported. ------------------------------------------------------------------ IPA Extensions (0250..02AF) Given that this block includes also a subheading for @ IPA characters for disordered speech it seems inappropriate to make the first subheading a mere double of the block name. (For another point, see also the comment on the Mayan numerals subheading I’d previously submitted [above].) So the first subheading may be reworded to: @ Extensions for general phonetics Further, the French localization has added an annotation between the blockheading and that subheading, about terminology used in character names. Having completed the list, I’m proposing to port it back to the original Code Charts. The proposed French text is found at: http://docucaras.info#u0250 English transposition: @+ The IPA is enhanced using diacritics—among which a stroke should be diagonal, and a bar, horizontal—and transforms, mainly: inverted = following an horizontal axis of symmetry; reversed = vertical axis; turned = rotated by 180°; sideways = by 90°; clockwise = on the right; counterclockwise = on the left (majority). As of the actual annotation, I recommend to start it with a statement like: @+ Several letters of the IPA have become part of the orthographies of many languages, some of which are cited as examples. # added IPA includes basic Latin letters and a number of Latin or Greek letters from other blocks. # original; I’d cite the blocks namedly, but that doesn’t fit the # actual scheme applied to the English (Unicode) NamesList.
Disposition: Reviewed and declined by the editors. This kind of extensive discussion of IPA usage belongs in other documentation -- not in the names list annotations.
------------------------------------------------------------------ U+10AC8 MANICHAEAN SIGN UD This symbol (Gc=So) has been encoded amidst the alphabet without rationale besides that the abbreviated word represented by it mainly consists of a WAW: https://www.unicode.org/L2/L2011/11123r-n4029r-manichaean.pdf Perhaps its Gc was Lo before being shifted to So, but anyhow it’s hard to figure out why a compound should be classified with the base letters here, whereas everywhere else logograms are set apart. While very careful to take into account best practices and encoding principles in current use, the Original Proposer Team failed in designing the block in consistency with other blocks where alphabets are encoded in continuous ranges, e.g. Syriac letters (0710..072C). I see no reason, however, that Unicode should perpetuate the appearance of normality conveyed by not granting the MANICHAEAN SIGN UD an appropriate and convenient subheading, regardless whether shifting normality from appearance to classification would make the disruption stand out even more. (In other words: Having the UD between WAW and ZAYIN under the “Letters” subheading has only the appearance of normality, letting inadvertant Code Charts readers believe that there was a good reason to get things this way around. As soon as subheadings are adjusted to apply correct classification like almost everywhere else in the Code Charts (some of the comments I’ve previously submitted notwithstanding), everybody gets aware that there must have been a problem, not only those looking up the Gc and then grabbing the encoding proposal from the internet.) Sorry for being very explicit; I’m really afraid that Unicode could be reluctant to give the green light to correct e.g. this way: 10AC7 MANICHAEAN LETTER WAW @ Logogram # added (alternate: Sign) 10AC8 MANICHAEAN SIGN UD * represents the conjunction ẉ̇ “and” # added @ Letters # replicated 10AC9 MANICHAEAN LETTER ZAYIN
Disposition: Added to the Manichaean chart.
------------------------------------------------------------------ Basic Latin; Greek and Coptic; Cyrillic: Ranges dedicated to the basic alphabet or to diacriticized letters ordered by case In the Greek and Coptic block (0370..03FF), these and subsequent letters: 0388 GREEK CAPITAL LETTER EPSILON WITH TONOS, … 0391 GREEK CAPITAL LETTER ALPHA, … 03AA GREEK CAPITAL LETTER IOTA WITH DIALYTIKA, … 03AC GREEK SMALL LETTER ALPHA WITH TONOS, … 03B1 GREEK SMALL LETTER ALPHA, … 03CA GREEK SMALL LETTER IOTA WITH DIALYTIKA, … are altogether under one single subheading: @ Letters whereas in the Basic Latin block, we have an @ Uppercase Latin alphabet subheading and a @ Lowercase Latin alphabet subheading. This inequality of treatment (intersperse punctuation and symbols in the Basic Latin block notwithstanding) needs in my opinion to be corrected. Casing scripts having ranges ordered by case do need corresponding subheadings, mentioning the case. Further, blocks of scripts using precomposed letters do need to have the basic alphabet marked up as such. However, like in the Mayan numerals block, the subheadings do not need to repeat the script name. Hence, in the Basic Latin block (0000..007F), the word “Latin” should be removed from the subheadings.
Disposition: Reviewed and declined by the editors.
Next, in the Greek block, the following subheadings should be added or adjusted accordingly: @ Uppercase letters # modified 0388 GREEK CAPITAL LETTER EPSILON WITH TONOS, … @ Uppercase alphabet # added 0391 GREEK CAPITAL LETTER ALPHA, … @ Uppercase letters # replicated 03AA GREEK CAPITAL LETTER IOTA WITH DIALYTIKA, … @ Lowercase letters # added 03AC GREEK SMALL LETTER ALPHA WITH TONOS, … @ Lowercase alphabet # added 03B1 GREEK SMALL LETTER ALPHA, … @ Lowercase letters # replicated 03CA GREEK SMALL LETTER IOTA WITH DIALYTIKA, … Then, in the Cyrillic block (0400..04FF), marking up the Russian alphabet is already done: @ Basic Russian alphabet 0410 CYRILLIC CAPITAL LETTER A But we need a subheading at the start of the lowercase alphabet, too. To achieve this, one can simply remove the word “Basic” as this is implicit. So Unicode can get two subheadings: @ Russian uppercase alphabet @ Russian lowercase alphabet Another option (imo the preferred one) is to introduce a supplemental heading level like in the Musical symbols block: @ Kievan notation @+ The following range is specific to Kievan notation. @ Clef 1D1DE MUSICAL SYMBOL KIEVAN C CLEF That can be transposed to the Cyrillic block: @ Russian @+ These ranges are dedicated to the basic Russian alphabet @ Uppercase alphabet @ Lowercase alphabet
Disposition: Reviewed and declined by the editors.
------------------------------------------------------------------ Coptic Epact Numbers (102E0..102FF) Thinking that this block could use some annotation, I’d suggest that, given 0605 ARABIC NUMBER MARK ABOVE has already the comment line: * may be used with Coptic Epact numbers the Coptic block, that crossreferences already the Greek and Coptic block, could be granted a second annotation: @+ Coptic epact digits and numbers are coded in the Coptic Epact Numbers block.
Disposition: Added.
And the Coptic Epact Numbers block could be completed as follows: @@ 102E0 Coptic Epact Numbers 102FF @+ These characters, called “imported” (epact) or cursive, are an alternate representation of numbers in Coptic. # added @+ The number sign is unified with the Arabic number mark. # added (minimal) x (arabic number mark above - 0605) # added @ Sign 102E0 COPTIC EPACT THOUSANDS MARK
Disposition: Reviewed and declined by the editors.
------------------------------------------------------------------ Thai (0E00..0E7F) It seems to me that the first subheading in the Thai block is actually an annotation: @@ 0E00 Thai 0E7F @@+ @+ Based on TIS 620-2533. # plus sign and period added @ Consonants 0E01 THAI CHARACTER KO KAI
Disposition: The annotation was reworded.
------------------------------------------------------------------ U+0132 LATIN CAPITAL LIGATURE IJ U+0133 LATIN SMALL LIGATURE IJ I’m convinced that adding some more information here would be well done: 0132 LATIN CAPITAL LIGATURE IJ # 0049 004A 0133 LATIN SMALL LIGATURE IJ * Dutch * visible ligation may be font‐dependent # added * combining with 0301 results in both i and j bearing an acute accent # added # 0069 006A
Disposition: Reviewed and declined by the editors.
------------------------------------------------------------------ Bopomofo (3100..312F) This block could be completed usefully with a new first subheading: @@ 3100 Bopomofo 312F @+ See also the Bopomofo Extended block. # added final period @ Letters for Mandarin # added # alternate: Mandarin letters @+ Based on GB 2312 # converted to annotation 3105 BOPOMOFO LETTER B […] @ Dialect (non-Mandarin) letters # existing 312A BOPOMOFO LETTER V
Disposition: Annotations were adjusted.
------------------------------------------------------------------ Duployan (1BC00..1BC9F) The gerund “orientating” occurs 18 times in the Code Charts, all in Duployan. First instance: 1BC47 DUPLOYAN LETTER E * character rotates to match entry angle of preceding consonant * secondary orientating (left and down) * Sloan long a * Perrault short i, long e (with dot accent) x (duployan affix attached e hook - 1BC7A) However, the encoding proposal uses “orienting” — see: http://www.unicode.org/L2/L2010/10272r2-duployan.pdf Merriam-Webster does support both “orient” and “orientate” with intended semantics. The Word‐of‐the‐Day 2017-04-30 podcast reveals that "to orientate" undergoes criticism for having one syllable more. Being the newer one of the two, it thrives in British English. Google Search retrieves 436,000 instances of "to orientate", but 3,170,000 of "to orient". So given that the Original Proposer uses 28 times "orienting", and zero times "orientating", I suspect the shift is due to Unicode being committed to British English in the Code Charts. Or to the fact that "orientate" looks more technical. Whatever, I’d suggest to do a search‐and‐replace in NamesList.txt to replace "orientating" with "orienting".
Disposition: Reviewed and declined by the editors.
------------------------------------------------------------------ U+1BC43 DUPLOYAN LETTER OA * Pernin aw * Perrault aw could be merged to: * Pernin, Perrault: aw (adding a colon after the variant identifiers).
Disposition: Reviewed and declined by the editors.
------------------------------------------------------------------ Domino Tiles (1F030..1F09F) The Domino tile subheadings are almost all wrong, since dominoes are named following the least value. True subheadings would be: @ Tiles with zero dots on the left side and so on. An annotation should be added, if the actual subheadings must be maintained.
Disposition: Reviewed and declined by the editors.
------------------------------------------------------------------ Enclosed Ideographic Supplement (1F200..1F2FF) The issue here is a terminological flaw between "squared", i.e. surrounded by a square, and "square", i.e. square‐shaped: @ Squared hiragana from ARIB STD B24 # change to: @ Square hiragana from ARIB STD B24 1F200 SQUARE HIRAGANA HOKA = and others # <square> 307B 304B @ Squared katakana 1F201 SQUARED KATAKANA KOKO = here sign # <square> 30B3 30B3
Disposition: Reviewed and declined by the editors.
------------------------------------------------------------------ U+1F41B BUG The sample glyph is inconsistent with the character identity according to name. It should show an animal of the order of the hemiptera, kind of a beetle.
Disposition: Reviewed and declined by the editors.
------------------------------------------------------------------ U+1F6A0 MOUNTAIN CABLEWAY U+1F6A1 AERIAL TRAMWAY U+1F6A1 is a misnomer, and glyph fits 1F6A0, while the latter does not really exist, as a cable this way is technically unfeasible, as it is too steep for a suspension railway. With the cable underneath, and the line figuring rails, this glyph could be recycled for a funicular emoji. In consistency with actual practice, the Code Charts would be worded as follows: 1F6A0 MOUNTAIN CABLEWAY = aerial tramway * two big shuttles 1F6A1 AERIAL TRAMWAY = gondola lift * small cabins circulating continuously The glyphs are then to be adjusted accordingly. References: https://en.wikipedia.org/wiki/Aerial_tramway#Terminology http://www.iemoji.com/view/emoji/861/travel-places/mountain-cableway http://www.iemoji.com/view/emoji/862/travel-places/aerial-tramway https://en.wikipedia.org/wiki/Gondola_lift
Disposition: Reviewed and declined by the editors.
------------------------------------------------------------------ U+1F46B MAN AND WOMAN HOLDING HANDS U+1F6BB RESTROOM Linking these emoji by crossreferencing them mutually seems displaced to me. 1F46B MAN AND WOMAN HOLDING HANDS x (restroom - 1F6BB) # remove […] 1F6BB RESTROOM = man and woman symbol with divider = unisex restroom x (man and woman holding hands - 1F46B) # remove
Disposition: Reviewed and declined by the editors.
------------------------------------------------------------------ U+1F6EC AIRPLANE ARRIVING This emoji has a wrong glyph, as planes don’t land by heading on the ground. See airport signage for reference.
Disposition: Reviewed and declined by the editors.
------------------------------------------------------------------ U+1680 OGHAM SPACE MARK This character has Gc=Zs, so the subhead should be Space, not Punctuation. Stem notwithstanding.
Disposition: Subhead was adjusted.
------------------------------------------------------------------ U+005E CIRCUMFLEX ACCENT U+005F LOW LINE U+0060 GRAVE ACCENT U+007E TILDE U+00A8 DIAERESIS U+00AF MACRON U+00B0 DEGREE SIGN U+00B4 ACUTE ACCENT U+00B8 CEDILLA U+2017 DOUBLE LOW LINE These 10 characters have this comment line: * this is a spacing character This should be changed to: * this is an independent character The reason is that combining marks with Gc=Mc are spacing, too. Antonyms are: • "spacing" vs "non‐spacing" • "combining" vs "independent" By contrast, "spacing" is not a synonym of "independent", nor is "non‐spacing" a synonym of "combining".
Disposition: Reviewed and declined by the editors.
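The distinction drawn in this item (spacing versus combining) can be checked against the General_Category property. What follows is a minimal sketch in Python, assuming the standard `unicodedata` module, whose data reflects whichever Unicode version the interpreter ships. The names `ANNOTATED`, `is_combining_mark` and `report` are hypothetical helpers introduced for illustration only; they are not part of any Unicode tooling.

```python
import unicodedata

# The ten characters listed above that carry the
# "this is a spacing character" comment line.
ANNOTATED = [0x005E, 0x005F, 0x0060, 0x007E, 0x00A8,
             0x00AF, 0x00B0, 0x00B4, 0x00B8, 0x2017]


def is_combining_mark(cp: int) -> bool:
    """True if General_Category is one of Mn, Mc or Me."""
    return unicodedata.category(chr(cp)).startswith("M")


def report(cps):
    """Return (code point, Gc, is-combining-mark) triples."""
    return [(f"U+{cp:04X}", unicodedata.category(chr(cp)), is_combining_mark(cp))
            for cp in cps]


if __name__ == "__main__":
    # U+0903 DEVANAGARI SIGN VISARGA is a spacing combining mark (Gc=Mc);
    # U+0301 COMBINING ACUTE ACCENT is a non-spacing one (Gc=Mn).
    for row in report(ANNOTATED + [0x0903, 0x0301]):
        print(*row)
```

Under these assumptions, none of the ten annotated characters reports a mark category, while U+0903 shows that a character can be both spacing and combining, which is the case the feedback has in mind.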
------------------------------------------------------------------ U+0950 DEVANAGARI OM U+0AD0 GUJARATI OM U+0BD0 TAMIL OM U+0F00 TIBETAN SYLLABLE OM U+A8FD DEVANAGARI JAIN OM U+111C4 SHARADA OM U+11350 GRANTHA OM U+11449 NEWA OM U+114C7 TIRHUTA OM U+118FF WARANG CITI OM Om is one of the most important spiritual symbols in Hinduism. Unicode encourages unification with DEVANAGARI OM in all scripts that don’t have a distinctive glyph of their own for this syllable. Hence 10 OM characters are found in Unicode, one of which (Sharada) is discouraged. In the Code Charts, many of the OM characters do have their own subheading, but only one instance, in Newa script, has "Invocation" in it. This state of the art is not satisfactory, so I request the following changes. Quotations always include full ranges (on a per‐subheading basis). @ Sign # discard @ Invocation sign # substitute 0950 DEVANAGARI OM x (om symbol - 1F549) @ Various signs # discard @ Invocation sign # substitute 0AD0 GUJARATI OM @ Various signs # discard @ Invocation sign # substitute 0BD0 TAMIL OM @ Length mark # added 0BD7 TAMIL AU LENGTH MARK @ Syllable # discard @ Invocation sign # substitute 0F00 TIBETAN SYLLABLE OM @ Signs # discard @ Invocation signs # substitute A8FC DEVANAGARI SIGN SIDDHAM = siddhirastu * used at the beginning of texts as an invocation x (tibetan mark initial yig mgo mdun ma - 0F04) x (mongolian birga - 1800) x (sharada sign siddham - 111DB) A8FD DEVANAGARI JAIN OM @ Various signs # no change 111C1 SHARADA SIGN AVAGRAHA 111C2 SHARADA SIGN JIHVAMULIYA 111C3 SHARADA SIGN UPADHMANIYA 111C4 SHARADA OM * use of this character is discouraged * recommended sequence is 1118F 11180 @ Sign # discard @ Invocation sign # substitute 11350 GRANTHA OM @ Invocation signs # no change 11449 NEWA OM 1144A NEWA SIDDHI @ Various signs 114BF TIRHUTA SIGN CANDRABINDU 114C0 TIRHUTA SIGN ANUSVARA 114C1 TIRHUTA SIGN VISARGA 114C2 TIRHUTA SIGN VIRAMA = halant 114C3 TIRHUTA SIGN NUKTA 114C4 TIRHUTA SIGN AVAGRAHA 114C5 TIRHUTA GVANG = vedic anusvara 114C6 TIRHUTA ABBREVIATION SIGN @ Invocation sign # added 114C7 TIRHUTA OM @ Sign # discard @ Invocation sign # substitute 118FF WARANG CITI OM
Disposition: Reviewed and declined by the editors.
------------------------------------------------------------------ Box Drawing (2500..257F) There are two ranges of dashed lines, both of which are under a @ Light and heavy dashed lines subheading. Suggestion: more distinctive subheadings: @ Triple and quadruple dashed lines 2504 BOX DRAWINGS LIGHT TRIPLE DASH HORIZONTAL … @ Double dashed lines 254C BOX DRAWINGS LIGHT DOUBLE DASH HORIZONTAL …
Disposition: Reviewed and declined by the editors.
------------------------------------------------------------------ Latin Extended-B (0180..024F) The first subheading: @ Non-European and historic Latin is no longer accurate since this range includes: 01B7 LATIN CAPITAL LETTER EZH that is used in: * African, Skolt Sami hence in Europe, too. Anyway, classifying letters as “non‐European” is europocentric. I’d suggest to derive this subheading from the one found below, before U+021C LATIN CAPITAL LETTER YOGH: @ Miscellaneous additions by replacing "additions" with "letters" at blockstart: @ Miscellaneous letters 0180 LATIN SMALL LETTER B WITH STROKE … Further, the subheading @ Phonetic and historic letters found before U+01DD LATIN SMALL LETTER TURNED E is plain wrong, as several letters of this range, including the first one, are used in writing systems of living languages. It’s probably safe to replicate the generic @ Miscellaneous additions subheading.
Disposition: Reviewed and declined by the editors.
------------------------------------------------------------------ Latin Extended-C (2C60..2C7F) The first subheading of this block: @ Orthographic Latin additions does not make that much sense, since "Latin" is induced from the block name, and "Orthographic" is somehow obvious. Using generic subheadings like the abovementioned @ Miscellaneous letters seems to be safer than figuring out something special that isn’t really.
Disposition: Reviewed and declined by the editors.
------------------------------------------------------------------ Subheadings starting with "Addition[|s|al]" @ Additions for Slovenian and Croatian 0200 LATIN CAPITAL LETTER A WITH DOUBLE GRAVE … @ Additions for Romanian 0218 LATIN CAPITAL LETTER S WITH COMMA BELOW … @ Additions for Livonian 022A LATIN CAPITAL LETTER O WITH DIAERESIS AND MACRON … @ Additions for Uighur 2C67 LATIN CAPITAL LETTER H WITH DESCENDER The Code Charts contain 86 subheadings starting with the substring "Addition": 6 times "Addition", 26 times "Additions", and 54 times "Additional". The advantage is to show that the repertoire was built step by step. The downside is a constant reminder that those languages were supported only from a later stage on. That results in unfounded discrimination that could easily be avoided by simply labelling the ranges by what they contain, i.e. mostly letters. That is a big change, so I’m waiting to know whether the EC is ready. Right now I’m late with submitting these items, so there is no comprehensive list of subheadings to change.
Disposition: Reviewed and declined by the editors.
------------------------------------------------------------------
Date/Time: Wed Apr 25 14:37:42 CDT 2018
Name: Marcel Schneider
Report Type: Error Report
Opt Subject: PRI 372 NamedSequencesProv.txt
I’m sorry for several typos having occurred in making up the Named Sequences list submitted on Tue Apr 24 02:12:15 CDT 2018: LATIN CAPITAL LETTER O WITH TILDE AND GRAVE ACCENT;00F5 0300 should be: LATIN CAPITAL LETTER O WITH TILDE AND GRAVE ACCENT;00D5 0300 LATIN CAPITAL LETTER UPSILONK WITH GRAVE ACCENT;01B1 0300 should be: LATIN CAPITAL LETTER UPSILON WITH GRAVE ACCENT;01B1 0300 By this occasion I’d suggest to change the proposed subheading from: # Additions for languages in Togo. to # Latin sequences for languages in Togo.
Disposition: Moot, as the suggestion to add was not accepted.
Thanks, Marcel
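Both typos corrected in this item are of a kind that a mechanical check could catch. The sketch below is a heuristic only, in Python with the standard `unicodedata` module: it assumes that the name of the first code point of a sequence should appear, word for word, at the start of the sequence name. That convention is an assumption made for illustration and is not a rule of the named sequences format; `base_name_mismatch` is a hypothetical helper.

```python
import unicodedata


def base_name_mismatch(line: str) -> bool:
    """Return True if the name of the first code point is NOT a
    word-wise prefix of the sequence name.

    `line` uses the "NAME;CP1 CP2 ..." layout quoted in the feedback.
    """
    name, code_points = line.split(";")
    first = chr(int(code_points.split()[0], 16))
    base_words = unicodedata.name(first).split()
    return name.split()[:len(base_words)] != base_words


if __name__ == "__main__":
    samples = [
        "LATIN CAPITAL LETTER O WITH TILDE AND GRAVE ACCENT;00F5 0300",
        "LATIN CAPITAL LETTER O WITH TILDE AND GRAVE ACCENT;00D5 0300",
        "LATIN CAPITAL LETTER UPSILONK WITH GRAVE ACCENT;01B1 0300",
        "LATIN CAPITAL LETTER UPSILON WITH GRAVE ACCENT;01B1 0300",
    ]
    for s in samples:
        print("MISMATCH" if base_name_mismatch(s) else "ok      ", s)
```

Comparing whole words rather than raw string prefixes matters here: a plain prefix test would accept UPSILONK because it begins with UPSILON.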
Date/Time: Wed Apr 25 17:09:46 CDT 2018
Name: Marcel Schneider
Report Type: Public Review Issue
Opt Subject: PRI #372: Denoting ranges in the Code Charts
In the Unicode Standard, including the Core Specification, ranges of code points are denoted using two dots. It is therefore desirable to align the notational conventions used in the Code Charts. In the Code Charts, ranges are noted using three different conventions: 1) two dots (U+002E U+002E) 2) hyphen-minus (U+002D) 3) hyphen-minus surrounded by spaces (U+0020 U+002D U+0020) One is therefore to correct all instances showing either (2) or (3). For convenience, instances containing ranges are listed below. Care has been taken to always include a line containing code point(s), be it a block header, or a name line. Disclaimer: These instances have been retrieved using regexes. Search pattern were (1) | (2) | (3) | U+2013, surrounded by 4 hex digits on either side. Standards references have been discarded. This leaves the risk that unconventionally noted ranges have been overlooked. ====================================================== @@ 13A0 Cherokee 13FF @+ Most lowercase Cherokee syllables are encoded in the Cherokee Supplement block at AB70..ABBF. @@ 1950 Tai Le 197F @+ Note the similarly named but distinct New Tai Lue script encoded at 1980..19DF. @@ 1980 New Tai Lue 19DF @+ Note the similarly named but distinct Tai Le script encoded at 1950..197F. The New Tai Lue script is also known as Xishuangbanna Dai. ====================================================== 0020 SPACE * sometimes considered a control code * other space characters: 2000-200A 003D EQUALS SIGN * other related characters: 2241-2263 005B LEFT SQUARE BRACKET = opening square bracket (1.0) * other bracket characters: 27E6-27EB, 2983-2998, 3008-301B 00A4 CURRENCY SIGN * other currency symbol characters: 20A0-20BF 00B2 SUPERSCRIPT TWO = squared * other superscript digit characters: 2070-2079 00B8 CEDILLA * this is a spacing character * other spacing accent characters: 02D8-02DB @ Vulgar fractions @+ The fraction bar for these may be rendered horizontally or at a slant. 
For other fraction characters, see 2150-215E. 00BC VULGAR FRACTION ONE QUARTER @ Arabic-Indic digits @+ These digits are used with Arabic proper; for languages of Iran, Afghanistan, Pakistan, and India, see the Eastern Arabic-Indic digits at 06F0-06F9. 0660 ARABIC-INDIC DIGIT ZERO @ Astrological digits @+ These digits, also known as Sinhala Lith Illakkam, have been used primarily for writing horoscopes. This number system has a zero place holder concept, unlike the Sinhala archaic numbers, Sinhala Illakkam, encoded in the range 111E1-111F4. 0DE6 SINHALA LITH DIGIT ZERO @ Punctuation @+ Additional birgas are encoded in the Mongolian Supplement block at 11660-1167F. 1800 MONGOLIAN BIRGA @ Angles @+ Other angle symbols are found at 299B-29AF. 221F RIGHT ANGLE @ Zodiacal symbols @+ See also Asian zodiacal symbols among the animal symbols in the range 1F400-1F418. 2648 ARIES @ Cantillation marks (svara) for the Samaveda @+ See the similar set of Grantha svara markers for the Samaveda, encoded in the range 11366-11374. A8E0 COMBINING DEVANAGARI DIGIT ZERO @@ FB50 Arabic Presentation Forms-A FDFF @+ Preferred characters are found in the Arabic block 0600-06FF. This block also contains 32 noncharacters in the range FDD0-FDEF. @ Fullwidth ASCII variants @+ See ASCII 0020-007E FF01 FULLWIDTH EXCLAMATION MARK @@ 111E0 Sinhala Archaic Numbers 111FF @+ This number system is also known as Sinhala Illakkam. This number system does not have a zero place holder concept, unlike the Sinhala astrological numbers, Sinhala Lith Illakkam, encoded in the range 0DE6-0DEF. @ Cantillation marks (svara) for the Samaveda @+ See the similar set of Devanagari svara markers for the Samaveda, encoded in the range A8E0-A8F1. 11366 COMBINING GRANTHA DIGIT ZERO @ Circled sans-serif digits @+ These digits complement the sans-serif digit sets in the Dingbat block ranges 2780-2789 and 278A-2793. 1F10B DINGBAT CIRCLED SANS-SERIF DIGIT ZERO @ White circles @+ Adjective refers to the thickness of the ring. 
@+ Constitute a set as follows: 25CB, 2B58, 1F785-1F789 1F785 MEDIUM BOLD WHITE CIRCLE @ White squares @+ Constitute a set as follows: 25A1, 1F78E-1F793 1F78E LIGHT WHITE SQUARE @ Six pointed stars @+ Constitute a set as follows: 2736, 1F7CB-1F7CD 1F7CB MEDIUM SIX POINTED BLACK STAR @ Eight pointed stars @+ Constitute a set as follows: 2735, 1F7CE-1F7D1 1F7CE MEDIUM EIGHT POINTED BLACK STAR ====================================================== @@ FE70 Arabic Presentation Forms-B FEFF @+ Preferred characters are found in the Arabic block 0600 - 06FF. Some of these characters are used for Arabic mathematics where contextual shape variations are important semantically. @ Halfwidth CJK punctuation @+ See CJK punctuation 3000 - 303F FF61 HALFWIDTH IDEOGRAPHIC FULL STOP @ Halfwidth Katakana variants @+ See Katakana 30A0 - 30FF FF65 HALFWIDTH KATAKANA MIDDLE DOT @ Halfwidth Hangul variants @+ See Hangul Compatibility Jamo 3130 - 318F FFA0 HALFWIDTH HANGUL FILLER @ Fullwidth symbol variants @+ See Latin-1 00A0 - 00FF FFE0 FULLWIDTH CENT SIGN
Disposition: Remanded to the editors, to work on making range notation more consistent in future revisions.
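The search described in the feedback (a range separator surrounded by hexadecimal digits) can be approximated with a single regular expression. The following Python sketch is one possible reading of that description, not the pattern the submitter actually used: it assumes uppercase code points of four to six hex digits and the four separators named above. `find_ranges` and `normalize` are hypothetical helper names.

```python
import re

# A code point as written in the names list: 4 to 6 uppercase hex digits.
HEX = r"[0-9A-F]{4,6}"

# Separators: two dots, spaced hyphen-minus, bare hyphen-minus, en dash.
RANGE = re.compile(rf"\b({HEX})(\.\.| - |-|\u2013)({HEX})\b")

STYLE = {
    "..": "two dots",
    "-": "hyphen-minus",
    " - ": "spaced hyphen-minus",
    "\u2013": "en dash",
}


def find_ranges(text: str):
    """List (start, separator style, end) for every range found in text."""
    return [(m.group(1), STYLE[m.group(2)], m.group(3))
            for m in RANGE.finditer(text)]


def normalize(text: str) -> str:
    """Rewrite every detected range using the two-dot convention."""
    return RANGE.sub(r"\1..\3", text)


if __name__ == "__main__":
    for line in [
        "* other space characters: 2000-200A",
        "@+ See Katakana 30A0 - 30FF",
        "@+ Constitute a set as follows: 25CB, 2B58, 1F785-1F789",
    ]:
        print(find_ranges(line), "->", normalize(line))
```

As the feedback's own disclaimer notes, a pattern like this can match hyphenated text that merely looks like hex (standards references, for instance), so any automated rewrite would still need editorial review.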
======================================================
Date/Time: Thu Apr 26 12:45 CDT 2018
Name: Marcel Schneider
Report Type: Public Review Issue
Opt Subject: PRI #372 (consolidated feedback, remainder)
U+1039 MYANMAR SIGN VIRAMA U+103A MYANMAR SIGN ASAT About 60 % of the viramas have a dedicated subheading: @ Virama It would be desirable that this feature be extended to the remaining 40 %. Multiple issues concerning the Myanmar virama make this a handy example. The Myanmar viramas are encoded between a range of dependent vowels and a range of dependent consonants. (Specifically about these, please see another feedback item above pertaining to the way of declaring dependent vowels and consonants.) I see a potential to enhance presentation by adding an appropriate subheading. Comment lines seem also to need a revision: U+1039 appears in the Code Chart as never rendered visibly. That contradicts the actual annotation, which is suspected to have to be changed to “shape shown is arbitrary and is not visibly rendered” as found at U+17D2, U+2D7F, and U+10A3F. @ Various signs 1036 MYANMAR SIGN ANUSVARA 1037 MYANMAR SIGN DOT BELOW = aukmyit * a tone mark 1038 MYANMAR SIGN VISARGA @ Viramas # added 1039 MYANMAR SIGN VIRAMA = killer (when rendered visibly) # discard * shape shown is arbitrary and is not visibly rendered # replicated 103A MYANMAR SIGN ASAT = killer (always rendered visibly) # discard * always rendered visibly # converted
Disposition: Subhead was adjusted.
------------------------------------------------------------------ U+07F7 NKO SYMBOL GBAKURUNEN This has Gc=So and is nevertheless part of a range under the subhead “Punctuation.” Indeed it is reported to terminate important sections. Therefore it should have been given the Gc=Po, like the famous Siddham section marks, that are all Gc=Po. Unicode may wish to either change category of U+07F7 from Gc=So to Gc=Po, or to move U+07F7 one line up, so it gets under the preceding “Symbols” subheading: @ Symbols # changed to plural 07F6 NKO SYMBOL OO DENNEN 07F7 NKO SYMBOL GBAKURUNEN # raised by one line @ Punctuation 07F8 NKO COMMA 07F9 NKO EXCLAMATION MARK It appears further useful to give some hint about the meaning of each one of the two symbols, as usual in the Code Charts (cf. other instances of particular symbols and logograms). According to the encoding proposal: http://www.unicode.org/L2/L2004/04172-n2765-nko.pdf page 6, Unicode could add the following comment lines: 07F6 NKO SYMBOL OO DENNEN * remote future placement of the topic # added 07F7 NKO SYMBOL GBAKURUNEN * end of major section # added * time to prepare and have meal # added (optional)
Disposition: Reviewed and declined by the editors, although some adjustment to annotations was made.
------------------------------------------------------------------ U+27C5 LEFT S-SHAPED BAG DELIMITER U+27C6 RIGHT S-SHAPED BAG DELIMITER U+27CB MATHEMATICAL RISING DIAGONAL U+27CD MATHEMATICAL FALLING DIAGONAL The bag delimiters are Gc=Ps and Gc=Pe, and do need a subheading, the more as nearly every single symbol around here has been given its own, while all Gc=Sm. Also, the mathematical diagonals constitute a singleton range each one, yet have generic subheadings. These are advantageously replaced with specific ones. That would contribute to get them visually associated, given they are separated by another character, like most of the paired ASCII punctuations. Abridged snippet: … @@ 27C0 Miscellaneous Mathematical Symbols-A 27EF @ Miscellaneous symbols 27C0 THREE DIMENSIONAL ANGLE 27C1 WHITE TRIANGLE CONTAINING SMALL WHITE TRIANGLE 27C2 PERPENDICULAR 27C3 OPEN SUBSET 27C4 OPEN SUPERSET @ Paired punctuations # added (note the plural) 27C5 LEFT S-SHAPED BAG DELIMITER 27C6 RIGHT S-SHAPED BAG DELIMITER @ Miscellaneous symbols # replicated from blockstart 27C7 OR WITH DOT INSIDE 27C8 REVERSE SOLIDUS PRECEDING SUBSET 27C9 SUPERSET PRECEDING SOLIDUS @ Vertical line operator 27CA VERTICAL BAR WITH HORIZONTAL STROKE @ Miscellaneous symbol # discard @ Mathematical diagonal # changed 27CB MATHEMATICAL RISING DIAGONAL @ Division operator 27CC LONG DIVISION @ Miscellaneous symbol # discard @ Mathematical diagonal # changed 27CD MATHEMATICAL FALLING DIAGONAL @ Operators 27CE SQUARED LOGICAL AND 27CF SQUARED LOGICAL OR @ Miscellaneous symbol 27D0 WHITE DIAMOND WITH CENTRED DOT @ Operators
Disposition: Reviewed by the editors. Some of the suggested changes were implemented.
------------------------------------------------------------------
Mathematical Operators (2200..22FF)
In this block, one subheading appears to be displaced, and two seem to be missing. Let’s look at these abridged ranges and see how to fix that:
@ Operators
22D2 DOUBLE INTERSECTION
22D3 DOUBLE UNION
@ Relations # remove from here
Disposition: Reviewed and declined by the editors.
22D4 PITCHFORK # this should be part of Operators range
    = proper intersection
@ Arithmetic relations # moved and reworded
22D5 EQUAL AND PARALLEL TO
22D6 LESS-THAN WITH DOT
22D7 GREATER-THAN WITH DOT
22D8 VERY MUCH LESS-THAN
22D9 VERY MUCH GREATER-THAN
22DA LESS-THAN EQUAL TO OR GREATER-THAN
22DB GREATER-THAN EQUAL TO OR LESS-THAN
22DC EQUAL TO OR LESS-THAN
22DD EQUAL TO OR GREATER-THAN
22DE EQUAL TO OR PRECEDES
22DF EQUAL TO OR SUCCEEDS
22E0 DOES NOT PRECEDE OR EQUAL
22E1 DOES NOT SUCCEED OR EQUAL
22E2 NOT SQUARE IMAGE OF OR EQUAL TO
22E3 NOT SQUARE ORIGINAL OF OR EQUAL TO
22E4 SQUARE IMAGE OF OR NOT EQUAL TO
22E5 SQUARE ORIGINAL OF OR NOT EQUAL TO
22E6 LESS-THAN BUT NOT EQUIVALENT TO
22E7 GREATER-THAN BUT NOT EQUIVALENT TO
22E8 PRECEDES BUT NOT EQUIVALENT TO
22E9 SUCCEEDS BUT NOT EQUIVALENT TO
22EA NOT NORMAL SUBGROUP OF
22EB DOES NOT CONTAIN AS NORMAL SUBGROUP
22EC NOT NORMAL SUBGROUP OF OR EQUAL TO
22ED DOES NOT CONTAIN AS NORMAL SUBGROUP OR EQUAL
@ Ellipses # added
@+ These four ellipses are used for matrix row/column elision. # converted from comment line below
Disposition: Reviewed by the editors. Annotations were adjusted.
22EE VERTICAL ELLIPSIS
    * these four ellipses are used for matrix row/column elision # discard from this place
22EF MIDLINE HORIZONTAL ELLIPSIS
22F0 UP RIGHT DIAGONAL ELLIPSIS
22F1 DOWN RIGHT DIAGONAL ELLIPSIS
@ Set relations # replicated and reworded
22F2 ELEMENT OF WITH LONG HORIZONTAL STROKE
22F3 ELEMENT OF WITH VERTICAL BAR AT END OF HORIZONTAL STROKE
22F4 SMALL ELEMENT OF WITH VERTICAL BAR AT END OF HORIZONTAL STROKE
22F5 ELEMENT OF WITH DOT ABOVE
22F6 ELEMENT OF WITH OVERBAR
22F7 SMALL ELEMENT OF WITH OVERBAR
22F8 ELEMENT OF WITH UNDERBAR
22F9 ELEMENT OF WITH TWO HORIZONTAL STROKES
22FA CONTAINS WITH LONG HORIZONTAL STROKE
22FB CONTAINS WITH VERTICAL BAR AT END OF HORIZONTAL STROKE
22FC SMALL CONTAINS WITH VERTICAL BAR AT END OF HORIZONTAL STROKE
22FD CONTAINS WITH OVERBAR
22FE SMALL CONTAINS WITH OVERBAR
22FF Z NOTATION BAG MEMBERSHIP
@~ Standardized Variation Sequences
------------------------------------------------------------------
U+2A53 DOUBLE LOGICAL AND
U+2A54 DOUBLE LOGICAL OR
One word is missing in these names: NESTED, as in:
U+2AA1 DOUBLE NESTED LESS-THAN
U+2AA2 DOUBLE NESTED GREATER-THAN
U+2AA3 DOUBLE NESTED LESS-THAN WITH UNDERBAR
That brings the need to add either aliases (recommended) or comment lines:
2A53 DOUBLE LOGICAL AND
    = double nested logical and
2A54 DOUBLE LOGICAL OR
    = double nested logical or
2A53 DOUBLE LOGICAL AND
    * nested
2A54 DOUBLE LOGICAL OR
    * nested
Disposition: Reviewed and declined by the editors.
------------------------------------------------------------------
U+1426 CANADIAN SYLLABICS FINAL DOUBLE SHORT VERTICAL STROKES
This character name is misspelt as of the final S in STROKES. An annotation should therefore be added to prevent inadvertent translators from applying plural. Suggestions:
1426 CANADIAN SYLLABICS FINAL DOUBLE SHORT VERTICAL STROKES
    * one stroke # added (option 1)
    * one double stroke # added (option 2)
    * actually one stroke # added (option 3)
    * actually one double stroke # added (option 4)
Disposition: Reviewed and declined by the editors.
------------------------------------------------------------------
U+26A2 DOUBLED FEMALE SIGN
U+26A3 DOUBLED MALE SIGN
The informative aliases of these symbols should be chosen from the same terminological pool. I.e. when one alias is constructed with generic terminology, the other must be, too. Here, the specific vocabulary used to choose the “lesbianism” alias was unavailable when looking for its male counterpart, since “gayism” is still uncommon; see: https://forum.wordreference.com/threads/is-gayism-a-word-or-not.3067518/ Further, conventional ordering of alias and comment lines should be applied.
@ Gender symbols
26A2 DOUBLED FEMALE SIGN
    = lesbianism # discard
    = female homosexuality # derived from below
    x (two women holding hands - 1F46D)
26A3 DOUBLED MALE SIGN
    = male homosexuality # raised
    * a glyph variant has the two circles on the same line
    x (two men holding hands - 1F46C)
26A4 INTERLOCKED FEMALE AND MALE SIGN
    = bisexuality # raised
    * a glyph variant has the two circles on the same line
Disposition: Reviewed and declined by the editors.
------------------------------------------------------------------
U+2A70 APPROXIMATELY EQUAL OR EQUAL TO
Surprisingly this symbol is not composed with an APPROXIMATELY EQUAL sign, but with an ALMOST EQUAL sign. Hence I’d suggest adding an alias and an xref:
2A6F ALMOST EQUAL TO WITH CIRCUMFLEX ACCENT
2A70 APPROXIMATELY EQUAL OR EQUAL TO
    = almost equal to above equals sign
    x (approximately equal to - 2245)
2A71 EQUALS SIGN ABOVE PLUS SIGN
Disposition: Reviewed and declined by the editors.
------------------------------------------------------------------
U+2044 FRACTION SLASH
As a mathematical symbol amidst punctuations (U+2044 has Gc=Sm), this character needs a subheading. Also, the next two could use a subheading, too.
2043 HYPHEN BULLET
    x (hyphen-minus - 002D)
@ Mathematical symbol # added
2044 FRACTION SLASH
    = solidus (in typography)
    * for composing arbitrary fractions
    x (solidus - 002F)
    x (division slash - 2215)
@ Paired punctuation # added
2045 LEFT SQUARE BRACKET WITH QUILL
2046 RIGHT SQUARE BRACKET WITH QUILL
@ Double punctuation for vertical text
Disposition: Reviewed and declined by the editors.
------------------------------------------------------------------
U+276E HEAVY LEFT-POINTING ANGLE QUOTATION MARK ORNAMENT
U+276F HEAVY RIGHT-POINTING ANGLE QUOTATION MARK ORNAMENT
These are Gc=Ps and Gc=Pe respectively. That is wrong. Change requests:
1) Change to Gc=Pi and Gc=Pf, respectively.
2) Remove from BidiBrackets.txt (and change related property values).
Disposition: The UTC declined to make those changes.
3) Add appropriate subheadings in the Code Charts:
@ Ornamental brackets
2768 MEDIUM LEFT PARENTHESIS ORNAMENT
    x (left parenthesis - 0028)
2769 MEDIUM RIGHT PARENTHESIS ORNAMENT
    x (right parenthesis - 0029)
276A MEDIUM FLATTENED LEFT PARENTHESIS ORNAMENT
276B MEDIUM FLATTENED RIGHT PARENTHESIS ORNAMENT
276C MEDIUM LEFT-POINTING ANGLE BRACKET ORNAMENT
    x (left-pointing angle bracket - 2329)
276D MEDIUM RIGHT-POINTING ANGLE BRACKET ORNAMENT
    x (right-pointing angle bracket - 232A)
@ Ornamental quotation marks # added
276E HEAVY LEFT-POINTING ANGLE QUOTATION MARK ORNAMENT
    x (single left-pointing angle quotation mark - 2039)
276F HEAVY RIGHT-POINTING ANGLE QUOTATION MARK ORNAMENT
    x (single right-pointing angle quotation mark - 203A)
@ Ornamental brackets # replicated
2770 HEAVY LEFT-POINTING ANGLE BRACKET ORNAMENT
2771 HEAVY RIGHT-POINTING ANGLE BRACKET ORNAMENT
2772 LIGHT LEFT TORTOISE SHELL BRACKET ORNAMENT
    x (left tortoise shell bracket - 3014)
2773 LIGHT RIGHT TORTOISE SHELL BRACKET ORNAMENT
    x (right tortoise shell bracket - 3015)
2774 MEDIUM LEFT CURLY BRACKET ORNAMENT
    x (left curly bracket - 007B)
2775 MEDIUM RIGHT CURLY BRACKET ORNAMENT
    x (right curly bracket - 007D)
@ Dingbat circled digits
Disposition: Reviewed and declined by the editors.
------------------------------------------------------------------
U+2E08 DOTTED TRANSPOSITION MARKER
I wonder whether this should not be Bidi_Mirrored=Yes, by RTL glyph, given U+2E09 LEFT TRANSPOSITION BRACKET and U+2E0A RIGHT TRANSPOSITION BRACKET are so by glyph exchange. (We note that the dotted version U+2E08 occurs unpaired, because the single word it pertains to is moved as specified.)
Disposition: Noted.
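For reference, the Bidi_Mirrored values of the three transposition characters named in this item can be inspected programmatically. This is a minimal sketch, assuming Python's standard `unicodedata` module (whose data version depends on the Python release); the helper name `is_bidi_mirrored` is illustrative only.

```python
import unicodedata


def is_bidi_mirrored(code_point: int) -> bool:
    """Return True if the character is Bidi_Mirrored=Yes in this runtime's data."""
    # unicodedata.mirrored() returns 1 for mirrored characters, 0 otherwise.
    return unicodedata.mirrored(chr(code_point)) == 1


# Dotted transposition marker, then the left/right transposition brackets.
for cp in (0x2E08, 0x2E09, 0x2E0A):
    name = unicodedata.name(chr(cp))
    flag = "Yes" if is_bidi_mirrored(cp) else "No"
    print(f"U+{cp:04X} {name}: Bidi_Mirrored={flag}")
```

The property lookup only shows the published values; whether a mirrored rendering of U+2E08 would be desirable in right-to-left text is the open question raised in the feedback.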
------------------------------------------------------------------
Latvian letters for pre-1921 orthography
I wonder whether this subheading of range U+A7A0..U+A7A9 should not be “Letters for Latvian pre-1921 orthography”
Disposition: Reviewed and declined by the editors.
------------------------------------------------------------------
CJK Unified Ideographs Extension blocks
The names of these blocks need a hyphen before the numbering:
@@ 3400 CJK Unified Ideographs Extension-A 4DB5
@@ 20000 CJK Unified Ideographs Extension-B 2A6D6
@@ 2A700 CJK Unified Ideographs Extension-C 2B734
@@ 2B740 CJK Unified Ideographs Extension-D 2B81D
@@ 2B820 CJK Unified Ideographs Extension-E 2CEA1
@@ 2CEB0 CJK Unified Ideographs Extension-F 2EBE0
I doubt whether block names stability would allow these corrections, though. Anyhow, I’m interested in hints about why the hyphen rule was applied to blocks like Latin Extended, but not to CJK Extensions. Also it would be good to know if there is a real mistake or not, and if localized versions of the Code Charts should apply the hyphen rule throughout, or like in the English version, or not at all.
Disposition: Noted. The UTC does not set policy on how character names or block names are translated.
------------------------------------------------------------------
U+29A6 OBLIQUE ANGLE OPENING UP
U+29A7 OBLIQUE ANGLE OPENING DOWN
Should be OBTUSE ANGLE. So we need informative aliases:
29A6 OBLIQUE ANGLE OPENING UP
    = obtuse angle opening up
29A7 OBLIQUE ANGLE OPENING DOWN
    = obtuse angle opening down
Disposition: Reviewed and declined by the editors.