This page is a compilation of formal public feedback received so far. See Feedback for further information on this issue, how to discuss it, and how to provide feedback.
Date/Time: Sun Jan 07 09:10:23 CST 2024
ReportID: ID20240107091023
Name: Charlotte Buff
Report Type: Public Review Issue
Opt Subject: 494
Currently it is stated in table 2 that U+16D6A KIRAT RAI VOWEL SIGN AU (together with two other characters) will be added to Grapheme_Cluster_Break=V. However, instead of AU it should be U+16D69 KIRAT RAI VOWEL SIGN O because AU decomposes into O+E, while AU itself does not appear in the decomposition of any other character.
Feedback above this line reviewed during UTC #178 in January 2024.
Date/Time: Mon Apr 22 11:41:57 CDT 2024
ReportID: ID20240422114157
Name: Jules Bertholet
Report Type: Error Report
Opt Subject: PropList.txt
UAX 29 (http://unicode.org/reports/tr29/) says the following:

> The default rules have been written so that they can be applied directly
> to non-NFD text and yield equivalent results [versus applying to NFD text].

In support of this aim, it later says the following about legacy grapheme clusters:

> The continuing characters include nonspacing marks, the Join_Controls
> (U+200C ZERO WIDTH NON-JOINER and U+200D ZERO WIDTH JOINER) used in Indic
> languages, and a few spacing combining marks to ensure canonical equivalence.

However, this property (that grapheme cluster boundaries are closed under canonical equivalence) currently does not hold. U+0CC0 KANNADA VOWEL SIGN II has `Grapheme_Cluster_Break=SpacingMark`, but it NFD decomposes to two characters (U+0CBF KANNADA VOWEL SIGN I and U+0CD5 KANNADA LENGTH MARK) which both have `Grapheme_Cluster_Break=Extend`.

To correct this error, U+0CC0 should be given the property `Other_Grapheme_Extend` in `PropList.txt`.
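The decomposition cited in this report can be checked with a short script. The following is a minimal sketch using Python's standard `unicodedata` module. That module does not expose `Grapheme_Cluster_Break`, so the sketch only confirms the canonical decomposition; the property values are taken as stated in the report, and the variable names are illustrative.

```python
import unicodedata

# U+0CC0 KANNADA VOWEL SIGN II, the character discussed in the report.
composed = "\u0CC0"

# Full canonical (NFD) decomposition of that character.
decomposed = unicodedata.normalize("NFD", composed)

# Code points of the decomposition, formatted as U+XXXX strings.
decomposed_code_points = [f"U+{ord(ch):04X}" for ch in decomposed]

# Print each resulting character with its name for inspection.
for ch in decomposed:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```

If the two printed characters both carry `Grapheme_Cluster_Break=Extend` while the precomposed character is `SpacingMark`, segmentation of the NFC and NFD forms can differ, which is the inconsistency the report describes.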
Date/Time: Mon Apr 22 12:01:50 CDT 2024
ReportID: ID20240422120150
Name: Jules Bertholet
Report Type: Error Report
Opt Subject: PropList.txt
# Amending my previous report

A few moments ago, I submitted an error report about the `Grapheme_Cluster_Break` property of U+0CC0. I would like to amend this report to note the following other characters which are also affected:

- U+0CC7
- U+0CC8
- U+0CCA
- U+0CCB
- U+1B3B
- U+1B3D
- U+1B43
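As a rough aid for reviewing the characters listed in this amendment, the sketch below prints the NFD decomposition of each one using Python's standard `unicodedata` module. It is only an illustration: the helper name `nfd_code_points` is hypothetical, and the `Grapheme_Cluster_Break` values of the resulting characters are not available from `unicodedata` and would need to be looked up in `GraphemeBreakProperty.txt`.

```python
import unicodedata

# Code points listed in the amended report.
AFFECTED = [0x0CC7, 0x0CC8, 0x0CCA, 0x0CCB, 0x1B3B, 0x1B3D, 0x1B43]


def nfd_code_points(cp):
    """Return the code points of the full canonical decomposition of cp."""
    return [ord(ch) for ch in unicodedata.normalize("NFD", chr(cp))]


# Map each listed character to its NFD decomposition.
decompositions = {cp: nfd_code_points(cp) for cp in AFFECTED}

for cp, parts in decompositions.items():
    rendered = " + ".join(f"U+{p:04X}" for p in parts)
    print(f"U+{cp:04X} {unicodedata.name(chr(cp))} -> {rendered}")
```

Each listed character should decompose into more than one code point; whether the parts are all `Extend` is the question the report raises.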
Date/Time: Thu May 09 10:36:08 CDT 2024
ReportID: ID20240509103608
Name: David Corbett
Report Type: Public Review Issue
Opt Subject: 494
Action item 179-A113 says to categorize semicolons as Sentence_Break = SContinue. https://github.com/unicode-org/unicodetools/pull/812 modifies U+1364 ETHIOPIC SEMICOLON, U+A6F6 BAMUM SEMICOLON, and U+1DA89 SIGNWRITING SEMICOLON accordingly. Those three scripts also have commas and colons, which still have Sentence_Break = Other. If those scripts’ semicolons are recategorized to match ASCII, so should their commas and colons; if there is not yet any evidence supporting changing their commas and colons, there probably isn’t any for their semicolons either, so their semicolons should not be recategorized. Lisu, Medefaidrin, Mongolian, Newa, and Vai don’t have semicolons, but they do have commas or colons. Should those be recategorized too? I wouldn’t assume so just from their character names, but maybe.
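To see which commas, colons, and semicolons exist in the three scripts named above, one option is to scan character names. The sketch below does that with Python's standard `unicodedata` module. It is a name-based heuristic only, which the report itself cautions against relying on, and it says nothing about `Sentence_Break` values, which `unicodedata` does not expose. The function name `find_candidates` and the word lists are assumptions made for illustration.

```python
import sys
import unicodedata

# Script names taken from the report; matched against the first word of a name.
SCRIPT_WORDS = {"ETHIOPIC", "BAMUM", "SIGNWRITING"}
# Punctuation kinds discussed in the report; matched against the last word.
PUNCTUATION_WORDS = {"COMMA", "COLON", "SEMICOLON"}


def find_candidates():
    """Return {code point: name} for name-based comma/colon/semicolon matches."""
    found = {}
    for cp in range(sys.maxunicode + 1):
        # Unnamed code points yield an empty string and are skipped.
        name = unicodedata.name(chr(cp), "")
        words = name.split()
        if words and words[0] in SCRIPT_WORDS and words[-1] in PUNCTUATION_WORDS:
            found[cp] = name
    return found


candidates = find_candidates()
for cp, name in sorted(candidates.items()):
    print(f"U+{cp:04X} {name}")
```

The output depends on the Unicode version bundled with the Python interpreter, so newer characters may be missing on older installations.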