This page is a compilation of formal public feedback received so far. See Feedback for further information on this issue, how to discuss it, and how to provide feedback.
Date/Time: Sun Mar 21 16:48:51 CDT 2021
Name: David Corbett
Report Type: Public Review Issue
Opt Subject: PRI #427: Misuse of subscripts
Note: This feedback has been reviewed and changes are reflected in revision 22, draft 3.
The proposed update introduces the notations ∁ₛ and ∁ₚ for the complements of 𝕊 and ℙ, using subscript plain lowercase letters in place of subscript double-struck capital letters. This use of U+209B and U+209A goes against Unicode’s general principle of subscripts, as described in section 22.4, that “style or markup in rich text” is preferred when possible, except in phonetic alphabets. Because UTS #18 is written in HTML, it should use `∁<sub>𝕊</sub>` and `∁<sub>ℙ</sub>`.
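The mismatch between the subscript letters and the sets they stand for can be checked against the character database. The following is a minimal sketch using Python's standard `unicodedata` module; the helper name `subscript_base` is hypothetical and used only for illustration.

```python
import unicodedata


def subscript_base(ch):
    """Return the base character of a compatibility subscript, or None."""
    parts = unicodedata.decomposition(ch).split()
    if len(parts) == 2 and parts[0] == "<sub>":
        return chr(int(parts[1], 16))
    return None


# U+209B and U+209A decompose to the plain lowercase letters s and p.
for ch in ("\u209B", "\u209A"):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), "->", subscript_base(ch))

# The sets being complemented are named with double-struck capitals instead.
print(unicodedata.name("\U0001D54A"))  # the character used for the set S
print(unicodedata.name("\u2119"))      # the character used for the set P
```

The subscript characters carry lowercase plain letters, so they cannot represent the double-struck capitals that markup such as `<sub>` can wrap directly.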
Date/Time: Wed Jun 16 23:40:43 CDT 2021
Name: Wang Yifan
Report Type: Error Report
Opt Subject: PRI #427: Examples out of line with UCD
This feedback has been directed to the UTC and there are actions to make changes.
Some examples currently given in UTS #18 appear to be wrong or outdated. In the table showing expressions related to hiragana under Section 1.2.6:
Expression | Contents of Set
\p{sc=Hira} | [ぁ-ゖゝ-ゟ𛀁🈀]
\p{scx=Hira} | [、-〃〆〈-】〓-〟〰-〵〷〼-〿ぁ-ゖ ゙-゠・ー㆐-㆟㇀-㇣㈠-㉃㊀-㊰㋀-㋋㍘-㍰ ㍻-㍿㏠-㏾﹅﹆。-・ー゙゚𛀁🈀]
Neither line reflects the current state of the set in U13.0 or the proposed U14.0. Moreover, the table contains some unneeded spaces. The lines should look like this (I'm just writing manually; please generate from data files for accuracy):
Expression | Contents of Set
\p{sc=Hira} | [ぁ-ゖゝ-ゟ𛀁-𛄞𛅐-𛅒🈀]
\p{scx=Hira} | [、-〃〈-】〓-〟〰-〵〷〼〽ぁ-ゖ゙-゠・ー﹅﹆。-・ー゙゚𛀁-𛄞𛅐-𛅒🈀]
Also, in the second (currently first) table under Section 1.1:
Syntax | Matches
[\u{3040}-\u{309F} \u{30FC}] | Hiragana characters, plus prolonged sound sign
The description is not accurate enough and is misleading as of today. It should say "Hiragana block code points" instead of "Hiragana characters" for maximal accuracy. Although I only spotted issues around Hiragana because they caught my eye, there could be more examples needing maintenance.
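One reason the range U+3040..U+309F is better described as block code points than as characters is that it includes unassigned code points. The following is a minimal sketch using Python's standard `unicodedata` module; the function name `unassigned_in_range` is hypothetical, and the result depends on the Unicode version bundled with the interpreter.

```python
import unicodedata


def unassigned_in_range(first, last):
    """List code points in [first, last] with General_Category Cn (unassigned)."""
    return [
        cp for cp in range(first, last + 1)
        if unicodedata.category(chr(cp)) == "Cn"
    ]


# The range used in the Section 1.1 example: the whole Hiragana block.
gaps = unassigned_in_range(0x3040, 0x309F)
print([f"U+{cp:04X}" for cp in gaps])
```

With the Unicode data shipped in recent Python releases this reports U+3040, U+3097, and U+3098, none of which is a character. Python's standard library does not expose the Script or Script_Extensions properties, so regenerating the `\p{sc=Hira}` and `\p{scx=Hira}` rows would still require the UCD data files, as the report suggests.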
Date/Time: Thu Dec 16 22:29:51 CST 2021
Name: Karl Williamson
Report Type: Error Report
Opt Subject: UTS 18
U+0F33 TIBETAN DIGIT HALF ZERO has a numeric value of -0.5. (I believe the existence of this character in the wild is apocryphal, however.) There is no rule against other code points being encoded with a negative value. However, UTS #18 says the hyphen-minus sign is supposed to be ignored within \p{} constructs, leaving no way to legally specify negative values. I suspect that UTS #18 should be clarified to indicate that a hyphen-minus at the beginning of a number should not be ignored, even with loose matching. But then what to do about two in a row?
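The collision described above can be illustrated with a small sketch. Both helper functions are hypothetical simplifications written for this illustration, not the matching rules as specified in UTS #18 or UAX #44: `loose_key` drops case, whitespace, underscores, and every hyphen-minus, while `loose_key_keep_sign` is one possible reading of the suggested fix, preserving a single leading hyphen-minus.

```python
import re
import unicodedata


def loose_key(value):
    """Naive loose matching: ignore case, whitespace, underscores, hyphens."""
    return re.sub(r"[\s_\-]", "", value).lower()


def loose_key_keep_sign(value):
    """Assumed variant: keep one leading hyphen-minus as a sign."""
    stripped = value.strip()
    sign = "-" if stripped.startswith("-") else ""
    return sign + loose_key(stripped)


# The numeric value recorded for U+0F33 is negative.
print(unicodedata.numeric("\u0F33"))

# With every hyphen ignored, -1/2 and 1/2 become indistinguishable.
print(loose_key("-1/2") == loose_key("1/2"))

# Keeping a leading sign separates them again.
print(loose_key_keep_sign("-1/2") == loose_key_keep_sign("1/2"))
```

This sketch folds a doubled leading hyphen-minus into a single sign, which is only one arbitrary choice; the question of how two in a row should behave remains open.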
Date/Time: Tue Jan 18 22:34:08 CST 2022
Name: Norbert Lindenberg
Report Type: Error Report
Opt Subject: UTS 18: Unicode Regular Expressions
In a table showing "current examples of escape syntax for Unicode code points", UTS #18: Unicode Regular Expressions shows "\uD83D\uDC7D" in the row that includes JavaScript. This example is no longer current: ECMAScript 2015 introduced the \u{xxxxxx} escape syntax for Unicode code points, so that "\u{1F47D}" now works.
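The two escapes name the same character: the older form spells out the UTF-16 surrogate pair of U+1F47D. The arithmetic can be sketched as follows; the function name `to_surrogate_pair` is hypothetical and used only for illustration.

```python
def to_surrogate_pair(code_point):
    """Split a supplementary code point into its UTF-16 surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only supplementary code points have surrogate pairs")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits of the offset
    return high, low


high, low = to_surrogate_pair(0x1F47D)
print(f"\\u{high:04X}\\u{low:04X}")  # the surrogate-pair escape form
```

For U+1F47D this yields the pair D83D, DC7D, matching the escape currently shown in the table.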
Date/Time: Sat Jan 22 04:46:41 CST 2022
Name: Ivan Panchenko
Report Type: Error Report
Opt Subject: UTS #18
UTS #18 contains the following mistakes: “expressions.The” (instead of “expressions. The”), “Database[UAX44]” (instead of “Database [UAX44]”), “three character” (instead of “three characters”), “"False" In” (instead of “"False". In”), “Name values, must” (instead of “Name values must”), “the the” (instead of “the”), “see see” (instead of “see”), “does offers” (instead of “does offer”), “Equivalents .” (instead of “Equivalents.”), “(see NameAliases.txt).In” (instead of “(see NameAliases.txt). In”), “Properties .” (instead of “Properties.”). An out-of-place closing parenthesis is found here: “[Perl].)” (instead of “[Perl].”). Finally, a closing parenthesis is missing after “COMBINING DIAERESIS”.