Date/Time: Fri Oct 28 17:11:52 CDT 2011
Contact: markus.icu@gmail.com
Name: Markus Scherer
Report Type: Public Review Issue
Opt Subject: IdnaMappingTable.txt minor formatting issue
In IdnaMappingTable.txt, inline comments are usually formatted with a space between the # and the Unicode age value. In the 6.1 version, when a data line has long data values, the space is omitted. Slightly ugly, and gratuitous diffs. For example,
    2900..2A0B ; valid ; ; NV8 # 3.2 RIGHTWARDS TWO-HEADED ARROW WITH VERTICAL STROKE..SUMMATION WITH INTEGRAL
    2A0C ; mapped ; 222B 222B 222B 222B #3.2 QUADRUPLE INTEGRAL OPERATOR
    2A0D..2A73 ; valid ; ; NV8 # 3.2 FINITE PART INTEGRAL..EQUALS SIGN ABOVE TILDE OPERATOR
    2A74 ; disallowed_STD3_mapped ; 003A 003A 003D #3.2 DOUBLE COLON EQUAL
    2A75 ; disallowed_STD3_mapped ; 003D 003D # 3.2 TWO CONSECUTIVE EQUALS SIGNS
    2A76 ; disallowed_STD3_mapped ; 003D 003D 003D #3.2 THREE CONSECUTIVE EQUALS SIGNS
    2A77..2ADB ; valid ; ; NV8 # 3.2 EQUALS SIGN WITH TWO DOTS ABOVE AND TWO DOTS BELOW..TRANSVERSAL INTERSECTION
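A check for this kind of formatting drift could be sketched as follows. This is a minimal illustration, not part of any Unicode tooling: the function name `lines_missing_comment_space` and the `SAMPLE` list are hypothetical, and the sample lines are the ones quoted in the report with whitespace collapsed.

```python
def lines_missing_comment_space(lines):
    """Return 1-based indexes of data lines whose '#' is not followed by whitespace."""
    flagged = []
    for number, line in enumerate(lines, start=1):
        stripped = line.strip()
        # Skip blank lines and full-line comments; only data lines are checked.
        if not stripped or stripped.startswith("#"):
            continue
        _data, sep, comment = stripped.partition("#")
        # Flag an inline comment that starts immediately after the '#'.
        if sep and comment and not comment[0].isspace():
            flagged.append(number)
    return flagged


# Hypothetical sample: the data lines quoted in the report.
SAMPLE = [
    "2900..2A0B ; valid ; ; NV8 # 3.2 RIGHTWARDS TWO-HEADED ARROW WITH VERTICAL STROKE..SUMMATION WITH INTEGRAL",
    "2A0C ; mapped ; 222B 222B 222B 222B #3.2 QUADRUPLE INTEGRAL OPERATOR",
    "2A0D..2A73 ; valid ; ; NV8 # 3.2 FINITE PART INTEGRAL..EQUALS SIGN ABOVE TILDE OPERATOR",
    "2A74 ; disallowed_STD3_mapped ; 003A 003A 003D #3.2 DOUBLE COLON EQUAL",
    "2A75 ; disallowed_STD3_mapped ; 003D 003D # 3.2 TWO CONSECUTIVE EQUALS SIGNS",
    "2A76 ; disallowed_STD3_mapped ; 003D 003D 003D #3.2 THREE CONSECUTIVE EQUALS SIGNS",
    "2A77..2ADB ; valid ; ; NV8 # 3.2 EQUALS SIGN WITH TWO DOTS ABOVE AND TWO DOTS BELOW..TRANSVERSAL INTERSECTION",
]

if __name__ == "__main__":
    print(lines_missing_comment_space(SAMPLE))
```

On the quoted lines, the flagged entries are the three with the longest mapping values (2A0C, 2A74, 2A76), which matches the pattern described in the report.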
Date/Time: Mon Oct 31 18:48:20 CDT 2011
Contact: markus.icu@gmail.com
Name: Markus Scherer
Report Type: Public Review Issue
Opt Subject: Unicode 6.1 SpecialCasing.txt @missing needs another semicolon
Unicode 6.1 has this default-value line in SpecialCasing.txt:
    # @missing: 0000..10FFFF; <slc>; <stc>; <suc>
There needs to be another semicolon at the end, according to the documentation in the header:
    # The entries in this file are in the following machine-readable format:
    #
    # <code>; <lower> ; <title> ; <upper> ; (<condition_list> ;)? # <comment>
so the @missing line should be changed to
    # @missing: 0000..10FFFF; <slc>; <stc>; <suc>;
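The mismatch can be illustrated with a small check that counts field terminators. This is only a sketch under the assumption, taken from the header format quoted in the report, that each of the four mandatory fields must be followed by a semicolon; the function name `missing_line_is_terminated` is hypothetical.

```python
MISSING_PREFIX = "# @missing:"


def missing_line_is_terminated(line, required_fields=4):
    """True if every one of the required fields is followed by a ';'."""
    if not line.startswith(MISSING_PREFIX):
        raise ValueError("not an @missing line")
    body = line[len(MISSING_PREFIX):]
    # <code>; <lower>; <title>; <upper>; needs one semicolon per field.
    return body.count(";") >= required_fields


CURRENT_LINE = "# @missing: 0000..10FFFF; <slc>; <stc>; <suc>"
PROPOSED_LINE = "# @missing: 0000..10FFFF; <slc>; <stc>; <suc>;"

if __name__ == "__main__":
    print(missing_line_is_terminated(CURRENT_LINE))
    print(missing_line_is_terminated(PROPOSED_LINE))
```

A parser that relies on the trailing semicolon to close the `<upper>` field would accept the proposed line but not the current one.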
Date/Time: Tue Nov 1 15:51:33 CDT 2011
Contact: markus.icu@gmail.com
Name: Markus Scherer
Report Type: Public Review Issue
Opt Subject: UCA 6.1 bug in FractionalUCA.txt
UCA 6.1 has the same primary collation weight for all spaces. In FractionalUCA.txt, that collation weight is a single byte 04. That file also defines top-of-reordering-group primary weights, and the top-of-spaces is 04 FE:
    FDD0 0042; [04 FE, 05, 05] # Special final value for reordering token
This is wrong. The space weight of 04 is a prefix of the top-of-spaces weight, which is forbidden. It also means that no character can be tailored primary-after any space and still reorder with the normal spaces. For a fix, the range of lead bytes for spaces should be restored to 04..05, moving up every following primary weight, and the top-of-spaces weight needs to be restored to 05 FE.
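The prefix conflict described above can be checked mechanically. The following is a minimal sketch, not the tool that generates FractionalUCA.txt; the helper name `is_proper_prefix` and the constant names are hypothetical, and the byte values are the ones given in the report.

```python
def is_proper_prefix(shorter: bytes, longer: bytes) -> bool:
    """True if `shorter` is a strict prefix of `longer`."""
    return len(shorter) < len(longer) and longer.startswith(shorter)


# Primary weights as byte sequences, taken from the report.
SPACE_PRIMARY = bytes([0x04])                 # shared primary weight of all spaces
TOP_OF_SPACES_6_1 = bytes([0x04, 0xFE])       # value in the 6.1 file
TOP_OF_SPACES_PROPOSED = bytes([0x05, 0xFE])  # value proposed in the report

if __name__ == "__main__":
    print(is_proper_prefix(SPACE_PRIMARY, TOP_OF_SPACES_6_1))
    print(is_proper_prefix(SPACE_PRIMARY, TOP_OF_SPACES_PROPOSED))
```

With the 6.1 value the single-byte space weight is a prefix of the top-of-spaces weight; with 05 FE it is not. The sketch says nothing about which weights other spaces would receive once lead byte 05 is available again.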
Date/Time: Tue Nov 1 15:57:35 CDT 2011
Contact: markus.icu@gmail.com
Name: Markus Scherer
Report Type: Public Review Issue
Opt Subject: Unicode 6.1 bug in BidiTest.txt
Much of the data in BidiTest.txt has changed, and it appears that the new data is wrong. Many of the results for auto-LTR changed levels. For example, the 6.0 version of BidiTest.txt resolved the input ES under auto-LTR to level 0, but version 6.1 resolves it to level 1. See lines 93 & 108 of the file from 2011-07-25, 00:53:54 GMT. I have a hunch that the data generation tool actually uses auto-RTL on the bit that is documented as auto-LTR. I recommend reverting this file to the 6.0 version unless there is a better fix.
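Why a swapped default would produce exactly this symptom can be sketched with the paragraph-level rule of the bidirectional algorithm (first strong character decides, otherwise a default applies). This is a simplified illustration, not the BidiTest.txt generator: the function and constant names are hypothetical, explicit embeddings are ignored, and "auto-RTL" here simply means a default of level 1 when no strong character is found.

```python
# Paragraph level implied by the first strong bidi class encountered.
STRONG_LEVEL = {"L": 0, "R": 1, "AL": 1}

AUTO_LTR_DEFAULT = 0  # default when no strong character is found
AUTO_RTL_DEFAULT = 1  # hypothetical swapped default suspected in the report


def paragraph_level(bidi_classes, default_level):
    """Return the paragraph level for a sequence of bidi class names."""
    for cls in bidi_classes:
        if cls in STRONG_LEVEL:
            return STRONG_LEVEL[cls]
    return default_level


if __name__ == "__main__":
    # A lone ES has no strong character, so the default decides the level.
    print(paragraph_level(["ES"], AUTO_LTR_DEFAULT))
    print(paragraph_level(["ES"], AUTO_RTL_DEFAULT))
```

A lone ES with no surrounding strong text ends up at the paragraph level, so a default of 0 gives the 6.0 result and a default of 1 gives the 6.1 result. That is consistent with the hunch but does not confirm it.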