The sections below contain links to permanent feedback documents for the open Public Review Issues as well as other public feedback as of April 30, 2015, since the previous cumulative document was issued prior to UTC #142 (February 2015). Grayed-out items in the Table of Contents do not have feedback here.
The links below go directly to open PRIs and to feedback documents for them, as of July 21, 2015. Gray rows have no feedback to date.
Date/Time: Fri Jul 24 17:40:06 CDT 2015
Name: Markus Scherer
Report Type: Other Question, Problem, or Feedback
Opt Subject: SignWriting collation, Ken's L2/15-202
Regarding http://www.unicode.org/L2/L2015/15202-signwriting-ducet-aux.txt I would like to note that Ken's analysis suggests that the fills and rotations should work properly if assigned "trailing primary weights".
http://www.unicode.org/reports/tr10/#DUCET_Order_Table (Table 13. DUCET Ordering)
http://www.unicode.org/reports/tr10/#Trailing_Weights (7.1.4 Trailing Weights)
I don't know whether it is feasible to assign these characters such trailing weights in the DUCET. We do use the "trailing" primary weight FFFD for U+FFFD. Trailing weights can be tailored with CLDR/ICU syntax.
http://www.unicode.org/reports/tr35/tr35-collation.html#Logical_Reset_Positions (3.11 Logical Reset Positions)
Date/Time: Mon Jul 27 19:00:39 CDT 2015
Name: Garth Wallace
Report Type: Feedback on an Encoding Proposal
Opt Subject: Hentaigana and the Kana Supplement block
The recent hentaigana proposal (L2/15-193) requests that they be encoded as Standardized Variation Sequences of hiragana. This seems like a good idea, since fallback in the absence of font support would be to the standard hiragana, so the results would still be readable. But where does that leave the Kana Supplement block? That block contains only two encoded characters, but was allocated 256 code points, presumably for the future encoding of hentaigana. With hentaigana handled by SVSes, it seems unlikely that many of those points would ever get filled. I realize there's no shortage of code points in the UCS, but still. One thing I noticed: the hentaigana proposal contains a duplicate of an existing character. MJ090014 (え variant with mother ideograph 江) looks like it's already encoded in the Kana Supplement block as U+1B001 HIRAGANA LETTER ARCHAIC YE.
(No feedback at this time in this section.)
Date/Time: Thu May 14 17:41:18 CDT 2015
Name: Ken Lunde
Report Type: Error Report
Opt Subject: kKorean versus kHangul (UAX #38)
1) The kHangul field currently covers the characters that correspond to the KS X 1001 (4,888) and KS X 1002 (2,856) standards. Only nine entries need to be adjusted, as follows, to make this alignment correct and up-to-date:
Changes:
U+6635 kHangul 닐
U+66B1 kHangul 닐
U+8D05 kHangul 췌
U+96B8 kHangul 례
Additional field value:
U+90DE kHangul 낭 랑
U+96B7 kHangul 례 예
Removals:
U+90CE kHangul 낭
Additions:
U+FA2E kHangul 낭
U+FA2F kHangul 예
2) The kHangul field currently specifies that one or more instances of U+1100 through U+11FF be used for each value. In reality, it should be two or three instances. I suggest the following regex:
[\x{1100}-\x{11FF}]{2}[\x{1100}-\x{11FF}]?
But, because these sequences normalize (via NFC) to characters in the range U+AC00 through U+D7A3, I recommend that they be changed accordingly, which will result in greater stability and greater compaction (one character instead of two or three). In addition to changing the data itself from two or three instances of U+1100 through U+11FF to one instance of U+AC00 through U+D7A3, the regex in UAX #38 needs to be changed to the following:
[\x{AC00}-\x{D7A3}]
3) I recommend that the status of the kKorean field be changed from Provisional to Deprecated, and that the use of kHangul be recommended for Korean readings.
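The conversion described in item 2 can be sketched as follows. This is only an illustration, not part of the report: it assumes Python's standard unicodedata module, writes the two regexes in Python's \uXXXX notation instead of \x{XXXX}, and uses a hypothetical helper name, to_precomposed.

```python
import re
import unicodedata

# The two regexes from the report, in Python's re syntax.
JAMO_VALUE = re.compile(r"[\u1100-\u11FF]{2}[\u1100-\u11FF]?")
PRECOMPOSED_VALUE = re.compile(r"[\uAC00-\uD7A3]")


def to_precomposed(jamo_value: str) -> str:
    """Convert a two- or three-jamo kHangul value to one precomposed syllable.

    Hypothetical helper for illustration only.
    """
    if not JAMO_VALUE.fullmatch(jamo_value):
        raise ValueError(f"not a two- or three-jamo sequence: {jamo_value!r}")
    composed = unicodedata.normalize("NFC", jamo_value)
    # Not every jamo sequence in U+1100..U+11FF composes to a single
    # syllable, so the result is checked against the second regex.
    if not PRECOMPOSED_VALUE.fullmatch(composed):
        raise ValueError(f"does not compose to one syllable: {jamo_value!r}")
    return composed


# U+1102 U+1175 U+11AF is the jamo spelling of 닐 (U+B2D0).
print(to_precomposed("\u1102\u1175\u11AF"))
```

The second check matters because the range U+1100 through U+11FF also contains archaic jamo that have no precomposed form, so a value can match the first regex and still fail to compose.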
Date/Time: Thu Jun 11 00:18:56 CDT 2015
Name: Sebastian Mayr
Report Type: Error Report
Opt Subject: Conformance Section in UTS46 is confusing
NOTE: Sent to Mark Davis and Editorial Committee already, and acknowledged receipt to user.
The Format section (8.1) under Conformance Testing in UTS46 is confusing. The explanation of toASCII and toUnicode says to use the provided processing_option for toUnicode, and to always use nontransitional for toASCII. However, the implementation section for toUnicode (4.3) says to always call the processing step with nontransitional, while it is the toASCII parameter list that provides a processing_option. It looks to me as if the descriptions of toASCII and toUnicode in the conformance testing section got mixed up. This also applies to the descriptions in the header of IdnaTest.txt.
Date/Time: Sun Jul 12 11:29:03 CDT 2015
Name: Laurentiu Iancu
Report Type: Error Report
Opt Subject: Missing copyright / terms of use statements in the security and UCA data files
Unlike the UCD files, several UTS #39 and UTS #10 data files (in Public/security/latest/ and Public/UCA/latest/) are missing copyright and terms of use statements. The affected files are the following:
Public/security/latest/
  All of the files in that directory
Public/UCA/latest/
  CollationTest.html
  All of the files inside CollationTest.zip
This issue was raised and discussed briefly during the release of Unicode 8.0 (BRS item #111). The conclusion there was that it should be a priority to fix for the next releases.
Date/Time: Sun Jul 12 11:31:17 CDT 2015
Name: Laurentiu Iancu
Report Type: Error Report
Opt Subject: Missing # EOF lines in idna, security, and UCA data files
All of the UCD files end with a # EOF line. Several UTS #46, #39, and #10 data files (in Public/idna/latest/, Public/security/latest/, and Public/UCA/latest/) do not have such lines. Specifically, the files with missing # EOF lines are the following:
Public/idna/latest/
  IdnaMappingTable.txt
  IdnaTest.txt
Public/security/latest/
  All of the files in that directory (ReadMe.txt is N/A)
Public/UCA/latest/
  allkeys.txt
  decomps.txt
  All of the files inside CollationTest.zip
This issue was also discussed briefly during the release of Unicode 8.0 (in relation to BRS item #111). However, compared to the issue of missing copyright and terms of use statements, reported separately, the absence of # EOF lines does not seem to constitute a priority. It ought to be examined by the UTC, though, to decide whether updating all of the tools that generate the data (including test) files listed above is a worthy investment, to make them consistent with the UCD files in terms of # EOF lines.
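A consistency check of the kind this report implies could look roughly like the sketch below. It is an assumption-laden illustration: the function name ends_with_eof_marker is hypothetical, and it treats the marker as a final non-blank line reading exactly "# EOF", as in the UCD files.

```python
def ends_with_eof_marker(text: str) -> bool:
    """Return True if the last non-blank line of a data file is '# EOF'.

    Hypothetical helper for illustration; 'text' is the decoded file content.
    """
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    return bool(lines) and lines[-1] == "# EOF"


# Made-up file contents, not actual data file lines.
with_marker = "0041 ; example\n# EOF\n"
without_marker = "0041 ; example\n"
print(ends_with_eof_marker(with_marker), ends_with_eof_marker(without_marker))
```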
Date/Time: Tue Jul 28 16:52:40 CDT 2015
Name: Roozbeh Pournader
Report Type: Error Report
Opt Subject: Error in Full Emoji Data chart for Android glyphs
The Full Emoji Data chart at http://www.unicode.org/emoji/charts/full-emoji-list.html includes some flag images in the Android column that actually do not exist on any version of Android software. Here is the list of flags that should be removed from that chart:
* flag for Antarctica
* flag for St. Barthélemy
* flag for Guadeloupe
* flag for Heard & McDonald Islands
* flag for St. Martin
* flag for Martinique
* flag for St. Pierre & Miquelon
* flag for Réunion
* flag for Svalbard & Jan Mayen
* flag for Wallis & Futuna
* flag for Mayotte
* flag for French Guiana
* flag for New Caledonia
* flag for Caribbean Netherlands
* flag for St. Helena
* flag for U.S. Outlying Islands
* flag for Western Sahara
* flag for Falkland Islands
* flag for South Georgia & South Sandwich Islands
* flag for French Southern Territories
* flag for Clipperton Island
* flag for Diego Garcia
* flag for Ceuta & Melilla
* flag for Canary Islands
* flag for Tristan da Cunha
Date/Time: Thu Jun 11 07:38:01 CDT 2015
Name: Ken Lunde
Report Type: Other Question, Problem, or Feedback
Opt Subject: Proposed alias or annotation for U+1F52B PISTOL
Note: This has already been sent to Emoji Subcommittee.
Because some people and organizations make a distinction between pistol and handgun, with the former being of the semi-automatic variety, and the latter being an umbrella term that covers pistols and revolvers, and because some implementations of U+1F52B PISTOL use an image of a revolver, I propose that the alias or annotation 'handgun' be added to this character, and that 'revolver' be considered as a second alias or annotation.
Date/Time: Wed Jun 17 19:43:38 CDT 2015
Name: Richard Gillam
Report Type: Error Report
Opt Subject: Word-break handling of fullwidth digits
My application's word-counting code is based on the ICU word-break iterator (UBRK_WORD), and it's getting wrong results with CJK text that includes fullwidth digits. Numbers written with fullwidth digits aren't getting counted as numbers by ICU; instead, the individual digits get treated the same as punctuation or whitespace. I looked in http://www.unicode.org/Public/UCD/latest/ucd/auxiliary/WordBreakProperty.txt, and I notice that the fullwidth digits are not mentioned in this file at all. Shouldn't they be given the "Numeric" property, like all the other digits? They do have the Nd general category, like all the other digits. Unlike the other digits, they have the ID line-break property, but this shouldn't matter. Tell me what I'm missing here...
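The general-category observation in this report can be checked with a short sketch like the one below. It assumes Python's standard unicodedata module, which exposes General_Category and decimal values but not the Word_Break or Line_Break properties, so it only confirms the Nd part of the report. The names FULLWIDTH_DIGITS and describe are hypothetical.

```python
import unicodedata

# U+FF10..U+FF19 are FULLWIDTH DIGIT ZERO through FULLWIDTH DIGIT NINE.
FULLWIDTH_DIGITS = [chr(cp) for cp in range(0xFF10, 0xFF1A)]


def describe(ch: str) -> tuple:
    """Return (general category, decimal value, East Asian width) for ch."""
    return (
        unicodedata.category(ch),
        unicodedata.decimal(ch),
        unicodedata.east_asian_width(ch),
    )


for ch in FULLWIDTH_DIGITS:
    print(f"U+{ord(ch):04X}", describe(ch))

# NFKC folds fullwidth digits to their ASCII counterparts. Applying it before
# word counting is one possible workaround, not something the report proposes.
print(unicodedata.normalize("NFKC", "\uFF11\uFF12\uFF13"))
```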
Date/Time: Tue Jun 30 05:13:23 CDT 2015
Name: Marcel Schneider
Report Type: Error Report
Opt Subject: Error in the Core specifications
IMHO the first sentence of the Core Specificationsʼ 23.2 Word Joiner is wrong in that it generalizes from the absence of line break opportunities to the absence of word boundaries. In practice, the word boundary behavior of U+FEFF and U+00A0 is the opposite: they indicate a word boundary. From this I extrapolate to U+2060, which is not part of any font shipped with Windows 7 and which I therefore canʼt test. At the end of the next paragraph, Unicode recommends ignoring the word joiner whenever the issue is not word breaking or line breaking. As far as the ZWNBSP is concerned, this character is not ignored when word boundaries are determined. E.g., when the letter apostrophe is bracketed with U+FEFFs, it behaves like a punctuation apostrophe. This makes the word joiners even more useful. Please let me know if Unicode can make sense and use of the above for the ongoing TUS overhaul without a discussion of this issue having to be launched on the Mailing List. If not, Iʼm ready to mail the topic, or to mention it in the current one (WORD JOINER vs ZWNBSP). However, I donʼt want to reinforce my probable reputation of someone who loves criticising other peopleʼs work. Best regards, Marcel Schneider PS: Dimly I would suggest that one may wish to add a plural s on the front page of the Core Specs.