from 4 to null (was: 3 big bidi bugs)

From: Bernard Miller (Bernard_R_Miller@bytext.org)
Date: Fri May 31 2002 - 17:07:36 EDT


Mark Davis wrote:
> One could wish for a simpler algorithm (for that matter, one could
> wish that people had uniform writing directions, or that Brits would
> drive on the right side of the road). As to ByText, you are on your
> own (in many ways).

ByText? What’s that? One could wish for a simpler algorithm and get it using
Bytext. Anyway, I didn’t ask about Bytext; I know all about it :) I did,
however, ask if you could verify that no implementation of the Unicode
bidirectional algorithm works with the Unicode 3.2 compatible “from 4 to
null” logic (below). It should be easy for you to do this at least for ICU,
but you were strangely silent on this question. Bidirectional users might
like to know that there are no bugs in their <= 3.2 implementations.

___from “3 bidi bugs” thread:
Let's say you have a line consisting of characters with all embedding level
4... How is "3" considered to be the lowest odd level on that line? It's no
more the lowest odd level than 5 or 1 is. At best, if you consider a
character with embedding level 4 to actually consist of 4 and each lower
embedding level (4, 3, 2, 1, and zero), which is not entirely unreasonable,
then 1 will always be the lowest odd embedding level on every line except a
line consisting of all zeros. But since L2 doesn't say "...to 1", it rules
out this interpretation.

A function implementing L2 might go through the following steps on each line:
1. find the highest level
2. find the lowest odd level
...
For a line consisting of all 4's as above, step 1 will return 4 and step 2
should return null since there are no odd levels on the line. A list
consisting of "from 4 to null" can only reasonably be interpreted as
consisting only of 4. Going on with this you get the "bugs" I describe.
___
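
For concreteness, here is a minimal Python sketch of the literal reading
quoted above. It is only an illustration of that reading, not the behavior of
ICU or any other implementation; the names l2_levels and reorder_line are
hypothetical, and treating a missing odd level as "reverse at the highest
level only" is the assumption under discussion.

```python
def l2_levels(levels):
    """Levels at which L2 reversal is applied, under the literal reading."""
    highest = max(levels)                           # step 1: highest level
    odd = [lv for lv in levels if lv % 2 == 1]
    lowest_odd = min(odd) if odd else None          # step 2: lowest odd level
    if lowest_odd is None:
        # Assumption: "from 4 to null" is read as consisting only of 4.
        return [highest]
    return list(range(highest, lowest_odd - 1, -1))


def reorder_line(text, levels):
    """Reverse each contiguous run at a given level or higher, per level."""
    cells = list(zip(text, levels))                 # levels move with chars
    for level in l2_levels(levels):
        i = 0
        while i < len(cells):
            if cells[i][1] >= level:
                j = i
                while j < len(cells) and cells[j][1] >= level:
                    j += 1
                cells[i:j] = cells[i:j][::-1]       # reverse the run in place
                i = j
            else:
                i += 1
    return "".join(ch for ch, _ in cells)


print(l2_levels([4, 4, 4, 4]))             # [4]
print(reorder_line("abcd", [4, 4, 4, 4]))  # dcba
```

Under this reading the all-4 line is reversed exactly once, which is the
outcome the "bugs" refer to.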

---
Bernard Rafael Miller, email: bernard_r_miller@bytext.org
Format enabling simplified 8 bit regexes of UCS characters: www.bytext.org
---
"Progress is a nice word. But change is its motivator and change has its
enemies."
--Robert F. Kennedy


