Public Review Issues

Accumulated Feedback on PRI #285

This page is a compilation of formal public feedback received so far. See Feedback for further information on this issue, how to discuss it, and how to provide feedback.

Date/Time: Wed Sep 24 20:35:10 CDT 2014
Name: Peter Occil
Report Type: Public Review Issue
Opt Subject: Public Review Issue 285: Unicode Collation Algorithm

Ed Note: Please see L2/14-230, a complete specification for proposed Appendix A changes.

I think this is a good time to revise Appendix A, Deterministic Sorting.  Here
are my suggested revisions. They affect only the text from the first paragraph
of Appendix A through the end of A.3.3.  I hope my revisions better highlight
the best practices that this appendix is trying to express.

--Peter

Date/Time: Fri Nov 21 01:40:50 CST 2014
Name: Sergiusz Wolicki
Report Type: Error Report
Opt Subject: Error in UTS #10

Dear Sirs,

UTS #10 contains the following text in section 8:

---
DS1a. [...]

The tailoring parameter match-boundaries specifies constraints on matching 
(see Section 5.1, Parametric Tailoring). The parameter match-boundaries=whole-character 
requires that the start and end of a match each be on a grapheme boundary. 
The value match-boundaries=whole-character further requires that the start and end 
of a match each be on a word boundary as well. [...]
---

It seems to me that the second occurrence of "match-boundaries=whole-character" 
should be "match-boundaries=whole-word".

Thanks and best regards,
Sergiusz Wolicki

Feedback above this line was accommodated in the draft of 2014-12-03.

Date/Time: Sun Jan 25 02:11:16 CST 2015
Name: SADAHIRO Tomoyuki
Report Type: Public Review Issue
Opt Subject: Public Review Issue 285: Unicode Collation Algorithm

In the revision of S2.1.2, the condition "ccc(B) >= ccc(C)" seems to
correspond to "the same canonical combining class" in the current text.

Is "ccc(B) = ccc(C)" not enough?

When "ccc(B) > ccc(C)" and there is no starter (ccc=0) between S and C,
the sequence violates canonical ordering and must not appear in any
normalized form.

If canonical ordering has not been applied to the input string,
I think a B with "ccc(B) > ccc(C)" should not block S + C.
In that case, the canonical ordering of S+B+C is S+C+B,
so B does not block C.
If A has a lower ccc than both B and C [ccc(A) < ccc(C) < ccc(B)],
the canonical ordering of S+A+B+C is S+A+C+B,
and B still does not block C.
In both cases, S+C within S+B+C or S+A+B+C can be a valid contraction.
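
The reordering argument above can be checked with a minimal Python sketch
built on the standard unicodedata module. The concrete characters are
assumptions chosen only for illustration and do not appear in the feedback:
S = "a", A = U+0327 (ccc 202), C = U+0323 (ccc 220), B = U+0301 (ccc 230).
NFD is used here as a stand-in for canonical ordering, which is safe only
because none of these characters has a decomposition.

```python
import unicodedata

# Illustrative characters (assumptions, not taken from the feedback).
S = "a"        # starter, ccc = 0
A = "\u0327"   # COMBINING CEDILLA, ccc = 202
C = "\u0323"   # COMBINING DOT BELOW, ccc = 220
B = "\u0301"   # COMBINING ACUTE ACCENT, ccc = 230


def ccc(ch):
    """Return the canonical combining class of a single character."""
    return unicodedata.combining(ch)


def in_canonical_order(s):
    """True if no adjacent pair satisfies ccc(prev) > ccc(cur) > 0."""
    return not any(ccc(prev) > ccc(cur) > 0 for prev, cur in zip(s, s[1:]))


def canonical_order(s):
    """Apply canonical ordering via NFD (no decompositions occur here)."""
    return unicodedata.normalize("NFD", s)


def code_points(s):
    """Format a string as a list of U+XXXX labels for display."""
    return " ".join("U+%04X" % ord(ch) for ch in s)


if __name__ == "__main__":
    for text in (S + B + C, S + A + B + C):
        print(code_points(text), "in canonical order:", in_canonical_order(text))
        print("  reordered:", code_points(canonical_order(text)))
```

In both inputs the reordered string places C before B, so after canonical
ordering B no longer sits between S and C.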

Feedback above this line was reviewed at the February 2015 UTC meeting.

Date/Time: Wed Feb 25 14:32:20 CST 2015
Name: Steve Slevinski
Report Type: Public Review Issue
Opt Subject: Public Review Issue 285: Unicode Collation Algorithm: Sorting SignWriting symbols

My apologies if this is the wrong place to submit feedback.

I am concerned that the document http://std.dkuug.dk/jtc1/sc2/wg2/docs/n4342.pdf
does not properly address the sorting of SignWriting symbols.  I believe a few
additions to the DUCET will solve the issue.

Specifically, here is a list of four symbols:
1) U+1D800
2) U+1D800 U+1DAA1
3) U+1D800 U+1DA9B
4) U+1D800 U+1DA9B U+1DAA1

The symbols in the above list are in the correct sort order; however, a binary string comparison
incorrectly sorts them as 1, 3, 4, 2.

I believe the sorting issue could be resolved by additions to the DUCET so that the Rotation modifiers
(U+1DAA1..U+1DAAF) are sorted before the Fill modifiers (U+1DA9B..U+1DA9F).
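
The ordering problem described above can be reproduced with a short Python
sketch. Python compares strings by code point, which stands in here for a
binary string comparison. The function proposed_key and its weight classes
are hypothetical: they only illustrate ranking Rotation modifiers ahead of
Fill modifiers and are not actual DUCET weights.

```python
# The four symbols from the list above, keyed by their list number.
symbols = {
    1: "\U0001D800",
    2: "\U0001D800\U0001DAA1",
    3: "\U0001D800\U0001DA9B",
    4: "\U0001D800\U0001DA9B\U0001DAA1",
}


def binary_order(items):
    """Sort list numbers by plain code point comparison of their strings."""
    return sorted(items, key=lambda number: items[number])


def proposed_key(text):
    """Hypothetical sort key: Rotation modifiers rank before Fill modifiers."""
    key = []
    for ch in text:
        cp = ord(ch)
        if 0x1DAA1 <= cp <= 0x1DAAF:      # Rotation modifiers
            key.append((1, cp))
        elif 0x1DA9B <= cp <= 0x1DA9F:    # Fill modifiers
            key.append((2, cp))
        else:                             # everything else, e.g. base symbols
            key.append((0, cp))
    return key


def proposed_order(items):
    """Sort list numbers using the hypothetical modifier-aware key."""
    return sorted(items, key=lambda number: proposed_key(items[number]))


if __name__ == "__main__":
    print("binary:  ", binary_order(symbols))    # [1, 3, 4, 2]
    print("proposed:", proposed_order(symbols))  # [1, 2, 3, 4]
```
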

Regards,
-Steve