Public Review Issues

Accumulated Feedback on PRI #279

This page is a compilation of formal public feedback received so far. See Feedback for further information on this issue, how to discuss it, and how to provide feedback.

Date/Time: Thu Aug 28 12:28:05 CDT 2014
Name: Matitiahu Allouche
Report Type: Public Review Issue
Opt Subject: PRI #279: Proposed Update UAX #9, Unicode Bidirectional Algorithm


In my opinion, the three proposed changes fix imprecise language or oversights in the 
modifications brought to UAX#9 by Unicode 6.3. They reset the text to what it should have 
been originally.
Since the changes affect only very exotic cases, backward compatibility problems should 
be minimal.

Date/Time: Mon Sep 8 14:38:27 CDT 2014
Name: Glenn Adams
Report Type: Error Report
Opt Subject: Inconsistency in UAX #9

Section 3.4 [1] indicates that the following order applies (leaving out some detail):

1. resolve levels
2. shaping
3. line breaking
4. reordering (per line)

In contrast Section 3.5 [2] states:

"Shaping is logically applied after the Bidirectional Algorithm is used..."

The inconsistency is that some read "Bidirectional Algorithm" to include reordering, 
and infer that reordering occurs before shaping. Having implemented this a few times, 
I believe that interpretation is incorrect, and the correct order is indicated by 
Section 3.4.

Perhaps the language in 3.5 needs qualification to prevent this misreading.

[1] http://www.unicode.org/reports/tr9/#Reordering_Resolved_Levels
[2] http://www.unicode.org/reports/tr9/#Shaping
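
To illustrate why reordering has to come last, here is a minimal sketch of per-line reordering from already-resolved levels (rule L2 of UAX #9, recalled here rather than quoted in this report). The function name and the representation of a line as a list of integer levels are assumptions made for illustration. The point is that it operates on one line at a time, so it can only run after line breaking, which in turn needs the shaped glyph widths.

```python
def reorder_line(levels):
    """Return the visual order of logical indices for one line.

    `levels` holds the resolved embedding level of each character on
    the line (hypothetical representation, one int per character).
    """
    order = list(range(len(levels)))
    if not levels:
        return order
    highest = max(levels)
    # Lowest odd level on the line; if there is none, nothing is reversed.
    lowest_odd = min((lv for lv in levels if lv % 2), default=highest + 1)
    for level in range(highest, lowest_odd - 1, -1):
        i = 0
        while i < len(order):
            if levels[order[i]] >= level:
                j = i
                while j < len(order) and levels[order[j]] >= level:
                    j += 1
                # Reverse the contiguous run at this level or higher.
                order[i:j] = reversed(order[i:j])
                i = j
            else:
                i += 1
    return order
```

Calling `reorder_line` once per line, after the paragraph has been broken into lines, matches the order given in Section 3.4.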

Date/Time: Fri Sep 19 10:40:26 CDT 2014
Name: Andrew Glass
Report Type: Error Report
Opt Subject: Typo in UAX #9

In Rule P3 of section 3.3.1 in UAX #9 there is a typo. The text repeats 
"If a character is found in P3"

http://www.unicode.org/reports/tr9/#P3

"P3. If a character is found in P3. If a character is found in P2 and it is 
of type AL or R, then set the paragraph embedding level to one; otherwise, 
set it to zero."

Should be changed to:

"If a character is found in P2 and it is of type AL or R, then set the 
paragraph embedding level to one; otherwise, set it to zero."
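
A minimal sketch of the corrected rule, assuming the input is a single paragraph and that P2 skips characters between an isolate initiator and its matching PDI (that detail comes from UAX #9 itself, not from this report). The function name is hypothetical; the bidirectional types come from Python's standard `unicodedata` module.

```python
import unicodedata

ISOLATE_INITIATORS = {"LRI", "RLI", "FSI"}

def paragraph_level(text):
    """Return the paragraph embedding level (0 or 1) for one paragraph."""
    depth = 0  # number of isolate initiators currently open
    for ch in text:
        bidi_type = unicodedata.bidirectional(ch)
        if bidi_type in ISOLATE_INITIATORS:
            depth += 1
        elif bidi_type == "PDI":
            if depth > 0:
                depth -= 1
        elif depth == 0 and bidi_type in ("L", "AL", "R"):
            # P3: AL or R gives level one; otherwise (L) level zero.
            return 1 if bidi_type in ("AL", "R") else 0
    # P3: no strong character found in P2, so the level is zero.
    return 0
```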

Date/Time: Tue Sep 30 12:55:53 CDT 2014
Name: C. E. Whitehead
Report Type: Public Review Issue
Opt Subject: Proposed Update UAX #9

Below are my comments on TR9 through section 3.1.4. I hope to get to the rest
later; I am still working on the fourth bullet under "Note that" in 3.1.2.
http://www.unicode.org/reports/tr9/tr9-32.html

3.1.3, BD16:

"BD16. A bracket pair is a pair of characters consisting of an opening paired bracket
and a closing paired bracket such that the Bidi_Paired_Bracket property value of the
former or its canonical equivalent equals the latter or its canonical equivalent and
which are algorithmically identified at specific text positions within an isolating
run sequence. The following algorithm identifies all of the bracket pairs in a given
isolating run sequence:"
{The clause "and which are" has no clear antecedent; what does "which" refer to? Also,
what does "algorithmically identified" mean? I assume it means "following the algorithm
below", but it sounds vague to me. It is acceptable, though, and if you prefer you can
leave this last part as is.}
=>
"BD16. A bracket pair is a pair of characters consisting of an opening paired bracket
and a closing paired bracket such that the Bidi_Paired_Bracket property value of the
former or its canonical equivalent equals the latter or its canonical equivalent. In
addition the pair are identified at specific text positions within an isolating run
sequence, using an algorithm like the following, which identifies all of the bracket
pairs in a given isolating run sequence:"

* * *

Algorithm:

"1. If the values match, meaning the two characters form a bracket pair, then
* Append the text position in the current stack element together with the
text position of the closing paired bracket to the list.
* Pop the stack through the current stack element inclusively.
2. Else, if the current stack element is not at the bottom of the stack,
advance it to the next element deeper in the stack and go back to step 2."

{ I believe that you have left off part of step 1 (if I understand the algorithm):
"if the current stack element is at the bottom of the stack" and if the two characters
form a bracket pair; otherwise you advance the stack to the next element back }
=>

"If the current stack element is at the bottom of the stack, and the values match, meaning
the two characters form a bracket pair, then
Append the text position in the current stack element together with the text position
of the closing paired bracket to the list.
Pop the stack through the current stack element inclusively.
Else, if the current stack element is not at the bottom of the stack, advance it to the
next element deeper in the stack and go back to step 2."
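
For reference, the stack search in the quoted original steps can be sketched as follows. This is an illustration under stated assumptions: `PAIRS` is a hypothetical stand-in for the Bidi_Paired_Bracket property covering only ASCII brackets, canonical equivalents are ignored, and the stack size limit of the full definition is omitted. In this reading, a closing bracket may match an element at any depth of the stack, not only the bottom one.

```python
# Hypothetical stand-in for Bidi_Paired_Bracket: ASCII brackets only.
PAIRS = {"(": ")", "[": "]", "{": "}"}
CLOSERS = set(PAIRS.values())

def bracket_pairs(text):
    """Return sorted (open_pos, close_pos) pairs for one run of text."""
    stack = []  # entries are (expected closing bracket, text position)
    pairs = []
    for pos, ch in enumerate(text):
        if ch in PAIRS:
            stack.append((PAIRS[ch], pos))
        elif ch in CLOSERS:
            # Start at the top of the stack and move deeper until a match.
            for i in range(len(stack) - 1, -1, -1):
                if stack[i][0] == ch:
                    pairs.append((stack[i][1], pos))
                    # Pop the stack through the matched element inclusively.
                    del stack[i:]
                    break
            # No match anywhere in the stack: the closing bracket is ignored.
    return sorted(pairs)
```

With the input `a(b[c)d]`, the closing parenthesis skips past the unmatched `[` and pairs with `(`, and the later `]` finds an empty stack and is left unpaired.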

Best,

--C. E. Whitehead

Date/Time: Tue Oct 14 19:08:49 CDT 2014
Name: Laurentiu Iancu
Report Type: Public Review Issue
Opt Subject: PRI #279 - Further clarifications in rule N0

A topic posted by Eli Zaretskii on 2014-10-14 on the Unicode mailing list
showed that the text of rule N0 is potentially ambiguous to implementers in
terms of which bidirectional types should be used during the execution of the
for loop in rule N0.  While there is a general theme of operating on the most
current bidirectional types throughout the algorithm unless stated otherwise,
it would still help to be explicit about the updated bidirectional types of
bracket characters that participate in the resolution of bracket pairs in the
loop in rule N0, to avoid problems of misinterpretation of the spec.

In Draft 1 of the Proposed Update for Unicode 8.0, a clarification was added
in the preamble of rule N0 (as well as the statement of definitions BD14 and
BD15) about the current bidirectional types being used for bracket characters.
That text (which was introduced to address an issue with overrides) can be
expanded to indicate that the strong types set during an iteration of the loop
in N0 are taken into account in the resolution of other bracket pairs in
subsequent iterations of the loop.

The editors may also consider pointing out that a paired bracket, once set a
strong bidirectional type, ceases to be treated as a neutral and becomes a
strong type, with the corresponding influence on its neighbors.  This was the
crux of the issue with directional overrides applied to brackets, addressed in
Draft 1 of the Proposed Update, and I think it similarly applies to the
iterative resolution of bracket pairs in N0.

Date/Time: Sat Oct 18 16:26:54 CDT 2014
Name: C. E. Whitehead
Report Type: Public Review Issue
Opt Subject: Proposed Update UAX #9, Unicode Bidirectional Algorithm


I've read through 3.1.2 and all sections till 3.3; also much of 3.3; also section 4.3 
which has a major typo which I hope you are planning to fix; and section 7; sorry that 
I have not read more of the report!

Here are comments on:
http://www.unicode.org/reports/tr9/tr9-32.html#BD11 and forward

3.1.2
BD11 algorithm
"
    Initialize a counter to one.
    Scan the text following the embedding initiator:
        At an isolate initiator, skip past the matching PDI, or if there is no matching PDI, to the end of the paragraph.
        At the end of a paragraph, or at a PDI that matches an isolate initiator before the embedding initiator, stop: 
        	the embedding initiator has no matching PDF.
        At an embedding initiator, increment the counter.
        At a PDF, decrement the counter. If its new value is zero, stop: this is the matching PDF.
"

{COMMENT: a nitpick: in the second bullet you say "at a PDI that matches an isolate initiator before the 
embedding initiator" -- this use of "before" is confusing to me; do you mean that you reach the PDI before 
reaching the embedding initiator? This can't be the case, as you are scanning the text following the embedding 
initiator, but the wording is not right; change to: "that matches an isolate initiator that occurred 
outside/before/prior to the embedding initiator"}

=>

"Initialize a counter to one.
    Scan the text following the embedding initiator:
        At an isolate initiator, skip past the matching PDI, or if there is no matching PDI, to
		the end of the paragraph.
        At the end of a paragraph, or at a PDI that matches an isolate initiator that occurred prior 
		to the embedding initiator, stop: the embedding initiator has no matching PDF.
        At an embedding initiator, increment the counter.
        At a PDF, decrement the counter. If its new value is zero, stop: this is the matching PDF."
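
The scan described above can be sketched as follows. The sketch assumes the input is a single paragraph, treats LRE, RLE, LRO and RLO all as embedding initiators, and matches isolate initiators to PDIs with a simple stack. The function names are hypothetical; bidirectional types come from Python's `unicodedata` module.

```python
import unicodedata

EMBEDDING_INITIATORS = {"LRE", "RLE", "LRO", "RLO"}
ISOLATE_INITIATORS = {"LRI", "RLI", "FSI"}

def isolate_matches(types):
    """Map each matched isolate initiator index to its PDI index and back."""
    stack, match = [], {}
    for i, t in enumerate(types):
        if t in ISOLATE_INITIATORS:
            stack.append(i)
        elif t == "PDI" and stack:
            j = stack.pop()
            match[j], match[i] = i, j
    return match

def matching_pdf(text, start):
    """Return the index of the PDF matching the initiator at `start`, or None."""
    types = [unicodedata.bidirectional(c) for c in text]
    match = isolate_matches(types)
    counter = 1
    i = start + 1
    while i < len(types):
        t = types[i]
        if t in ISOLATE_INITIATORS:
            if i not in match:
                return None  # skips to the end of the paragraph
            i = match[i] + 1  # skip past the matching PDI
            continue
        if t == "PDI" and i in match and match[i] < start:
            return None  # PDI closes an isolate opened prior to the initiator
        if t in EMBEDDING_INITIATORS:
            counter += 1
        elif t == "PDF":
            counter -= 1
            if counter == 0:
                return i
        i += 1
    return None  # end of paragraph reached
```

The check `match[i] < start` is one way to express "an isolate initiator that occurred prior to the embedding initiator" in terms of text positions.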

* * *

3.3.2 "Explicit Embeddings", X2, 1st par, last bullet

"With each RLE, perform the following steps:
    Otherwise, this is an overflow RLE. If the overflow isolate count is zero, increment the overflow 
embedding count by one. Leave all other variables unchanged."

{COMMENT: INSERT HERE FOR CLARITY => Otherwise this overflow RLE is within the scope of an overflow 
isolate initiator, so do nothing.}

X3, first par, last bullet

"Otherwise, this is an overflow LRE. If the overflow isolate count is zero, increment the overflow 
embedding count by one. Leave all other variables unchanged." {COMMENT: INSERT HERE FOR CLARITY =>
Otherwise this overflow LRE is within the scope of an overflow isolate initiator, so do nothing.}

{QUESTION: So the embeddings that are done in an overflow isolate are only terminated by the overflow 
isolate terminator, I gather? }
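
The three cases for an RLE, including the "do nothing" case proposed for insertion, can be sketched like this. The `State` class, its field names, and the tuple layout of stack entries are assumptions for illustration. The validity test and the value 125 for the maximum depth are recalled from UAX #9 and are not quoted in this comment.

```python
from dataclasses import dataclass, field

MAX_DEPTH = 125  # max_depth in UAX #9 (assumption: not quoted above)

@dataclass
class State:
    # Stack entries: (embedding level, override status, isolate status).
    stack: list = field(default_factory=lambda: [(0, "neutral", False)])
    overflow_isolates: int = 0
    overflow_embeddings: int = 0

def on_rle(state):
    """Apply the RLE handling of rule X2 to `state` in place."""
    level = state.stack[-1][0]
    new_level = (level + 1) | 1  # least odd level greater than `level`
    if (new_level <= MAX_DEPTH
            and state.overflow_isolates == 0
            and state.overflow_embeddings == 0):
        # Valid RLE: push a new entry.
        state.stack.append((new_level, "neutral", False))
    elif state.overflow_isolates == 0:
        # Overflow RLE outside any overflow isolate: count it.
        state.overflow_embeddings += 1
    # Otherwise the overflow RLE is within the scope of an overflow
    # isolate initiator, so nothing changes.
```

The LRE case in X3 would differ only in computing the least even level greater than the current one.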

* * *

3.3.2 "Explicit Levels and Directions", "Terminating Isolates", X6A, third bullet, then 2nd sub-bullet:
"While the directional isolate status of the last entry on the stack is false, pop the last entry from 
the directional status stack. (This terminates the scope of those valid embedding initiators within the 
scope of the matched isolate initiator whose scopes have not been terminated by a matching PDF, and which 
thus lack a matching PDF. Given that the valid isolate count is non-zero, the directional status stack must 
contain an entry with directional isolate status true before this step, and thus after this step the last 
entry on the stack will indeed have a true directional isolate status, i.e. represent the scope of the 
matched isolate initiator. This cannot be the stack's first entry, which always belongs to the paragraph 
level and has a false directional status, so there is at least one more entry before it on the stack.)"

{COMMENT: again, the use of "before" and "after" is confusing; the entry with the "directional isolate status" 
set to "true" was PLACED before this step, but I would not say that the stack contains it before this step; 
that is sort of comparing "apples and oranges" -- comparing a directional isolate status entry to a step}

=>

"While the directional isolate status of the last entry on the stack is false, pop the last entry from the 
directional status stack. (This terminates the scope of those valid embedding initiators within the scope 
of the matched isolate initiator whose scopes have not been terminated by a matching PDF, and which thus 
lack a matching PDF. Given that the valid isolate count is non-zero, the directional status stack must 
contain an entry with directional isolate status true; [this entry must have been placed prior to the PDI], 
and thus, once all false entries are popped, the last entry on the stack will indeed have a true directional 
isolate status, i.e. represent the scope of the matched isolate initiator. This cannot be the stack's first 
entry, which always belongs to the paragraph level and has a false directional status, so there is at least 
one more entry before it on the stack.)"
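
A minimal sketch of the PDI handling that this passage belongs to, under stated assumptions: the function name, the `counts` dictionary and its keys, and the tuple layout of stack entries are hypothetical, and the first two branches are recalled from rule X6A rather than quoted above. The while loop is the step under discussion.

```python
def on_pdi(stack, counts):
    """Handle a PDI. Stack entries: (level, override status, isolate status)."""
    if counts["overflow_isolate"] > 0:
        # The PDI matches an overflow isolate initiator.
        counts["overflow_isolate"] -= 1
    elif counts["valid_isolate"] == 0:
        # The PDI matches no isolate initiator: nothing to do.
        pass
    else:
        counts["overflow_embedding"] = 0
        # Pop embedding entries left open inside the isolate. Because the
        # valid isolate count is non-zero, an entry with isolate status
        # True exists above the first entry, so the loop terminates.
        while not stack[-1][2]:
            stack.pop()
        # The last entry is now the matched isolate initiator's entry.
        stack.pop()
        counts["valid_isolate"] -= 1
```

The loop condition reads the last entry each time, which is the sense of "once all false entries are popped" in the proposed wording.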

* * *

3.3.5 "Resolving Neutral and Isolate Formatting Types", N0, 2nd bullet, section c
"Otherwise, if there is a strong type it must be opposite the embedding direction. Therefore, test for an 
established context with a preceding strong type by checking backwards before the opening paired bracket 
until the first strong type (L, R, or sos) is found."

{COMMENT: would it be better to say, "by checking backwards within the isolating run in which the bracket 
pair occurs"? Is that what you mean?}

=> ?

"Otherwise, if there is a strong type it must be opposite the embedding direction. Therefore, test for an 
established context with a preceding strong type by checking backwards from the opening paired bracket until 
the first strong type (L, R, or sos) is found. 
If there is no strong type within the isolating run sequence then set the bracket pair to the embedding direction."

* * *

X6A, last bullet, last sub-bullet
"If the entry's directional override status is not neutral, reset the current character type from PDI to L 
if the override status is left-to-right, and to R if the override status is right-to-left."
{Just nitpicking; it's usually clearer to start an "if-then" clause with "if" than it is to start it with 
"then" but you can ignore this suggestion}

=>?

"If the entry's directional override status is not neutral, if the override status is left-to-right, reset 
the current character type from PDI to L; set it to R if the override status is right-to-left."

* * *

I do have more to check (if I get a chance); however a major goof:
4.3 "Higher-level Protocols"
"Certain characters that do not have the Bidi_Mirrored property can also be depicted by a mirrored glyph in 
specialized contexts. Such contexts include, but are not limited to, historic scripts and associated punctuation, 
private-use characters, and characters in mathematical expressions. (See Section 6, Mirroring.) These characters 
are those that fit at least one of the following conditions:"

{COMMENT: you mean "section 7", which is what the link goes to.}

=>

"Certain characters that do not have the Bidi_Mirrored property can also be depicted by a mirrored glyph in 
specialized contexts. Such contexts include, but are not limited to, historic scripts and associated punctuation, 
private-use characters, and characters in mathematical expressions. (See Section 7, Mirroring.) These characters 
are those that fit at least one of the following conditions:"

* * * * * *

Best,

-- C. E. Whitehead

Date/Time: Sun Oct 19 13:29:13 CDT 2014
Contact: cewcathar@hotmail.com
Name: C. E. Whitehead
Report Type: Public Review Issue
Opt Subject: Proposed Update UAX #9, Unicode Bidirectional Algorithm


Hi, once more; I made an error in my comments on 3.1.3 (posted to the comments page a while back; 
this serves as a correction), on the algorithm for BD16 -- the original text is correct:

"If the current stack element is at the bottom of the stack, and the values match, meaning the 
two characters form a bracket pair, then Append the text position in the current stack element 
together with the text position of the closing paired bracket to the list. Pop the stack through 
the current stack element inclusively. Else, if the current stack element is not at the bottom of 
the stack, advance it to the next element deeper in the stack and go back to step 2."

{COMMENT: leave as is; my error}

Best,

--C. E. Whitehead