Gaspar Sinai has a valid point insofar as there is a possible
ambiguity in bidi text. However, he is absolutely wrong in
blaming the Unicode bidi algorithm for this problem.
Gaspar Sinai had written:
> change products or to change the standard and use
> a reversable bidi.
and later:
> Hold on there! You admit that unicode alrgorithm is *really*
> not reversable?
He completely failed to acknowledge the fact that the bidi rendering
process is intrinsically not reversible, in the general case. And he
did not mention John Cowan's (IIRC) simple example illustrating this
fact. In other words, he is arguing from a wrong premise, so his
conclusions are not sound.
If he will not get his basic facts right, this whole discussion is
indeed surreal, and mostly a waste of time.
Gaspar Sinai wrote:
> Just because some companies who have influence on Unicode
> Consortium use some algorithm, like backing store and re-mapping,
> it does not mean that this is the only way. [...]
> Yudit does convert the input to view order and back.
Now, this reveals the real problem.
From this description, I gather that Gaspar's editor does not
preserve the backing store, hence it has to reconstruct it from
the rendering. As the rendering process is an n->1 mapping, its
reverse is intrinsically ambiguous. So the attempt to reconstruct
the original character sequence from the visual appearance
is bound to fail, in the general case.
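
To make the n->1 point concrete, here is a minimal sketch in Python. It is
not the Unicode bidi algorithm, only a toy model of it, under these
assumptions: uppercase letters stand for Arabic (RTL) characters, lowercase
letters are LTR, everything else is neutral, the paragraph direction is taken
from the first strong character, and numbers, explicit embeddings, mirroring
and line alignment are ignored. All function names are made up for this
illustration.

```python
def char_dir(ch):
    """Toy classes: uppercase = strong RTL, lowercase = strong LTR, else neutral."""
    if ch.isalpha():
        return "R" if ch.isupper() else "L"
    return "N"


def base_level(text):
    """Paragraph level from the first strong character (0 = LTR, 1 = RTL)."""
    for ch in text:
        d = char_dir(ch)
        if d != "N":
            return 1 if d == "R" else 0
    return 0


def resolve_levels(text, base):
    """Assign an embedding level to every character of the logical text."""
    dirs = [char_dir(ch) for ch in text]
    base_dir = "R" if base == 1 else "L"
    resolved = list(dirs)
    i, n = 0, len(dirs)
    while i < n:
        if dirs[i] != "N":
            i += 1
            continue
        j = i
        while j < n and dirs[j] == "N":
            j += 1
        before = dirs[i - 1] if i > 0 else base_dir
        after = dirs[j] if j < n else base_dir
        # Neutrals follow their neighbours if those agree, else the paragraph.
        fill = before if before == after else base_dir
        for k in range(i, j):
            resolved[k] = fill
        i = j
    if base == 0:
        return [0 if d == "L" else 1 for d in resolved]
    return [1 if d == "R" else 2 for d in resolved]


def render(text):
    """Map logical order (backing store) to visual order (left to right)."""
    chars = list(text)
    if not chars:
        return ""
    levels = resolve_levels(text, base_level(text))
    highest = max(levels)
    lowest_odd = min((l for l in levels if l % 2 == 1), default=highest + 1)
    # From the highest level down to the lowest odd level, reverse every
    # contiguous run of characters at that level or higher.
    for level in range(highest, lowest_odd - 1, -1):
        i = 0
        while i < len(chars):
            if levels[i] < level:
                i += 1
                continue
            j = i
            while j < len(chars) and levels[j] >= level:
                j += 1
            chars[i:j] = reversed(chars[i:j])
            i = j
    return "".join(chars)


# Two different backing stores, one and the same visual line:
print(render("the ARABS"))  # the SBARA
print(render("ARABS the"))  # the SBARA
```

Even in this reduced model, two distinct character sequences end up as the
same row of glyphs, so no function of the display alone can tell which one
was meant.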
Now Gaspar asks everybody else to comply with his own approach,
and does not even see that this approach will not work!
> Text direction and end of line is clearly indicated. The Unicode
> values of the characters in the cluster under the cursor are
> clearly indicated.
These are good features to have in a decent editor; but they are
entirely unrelated to the perceived problem. They can easily be
implemented in an editor that keeps the backing store.
> In all cases what you view be converted back to
> the *same* bitstream - except for illegal encoded text but that
> leaves clearly visible traces in the screen, as it should.
Fine. And a lot easier to attain, if the original bitstream is
not discarded, in the first place ;-)
> If the standard wants me to confuse the user, I would rather dump the
> standard than comply.
That is certainly not the standard's aim. Rather, the bidi part of the
standard aims to describe established practice for bidi writing.
> I updated:
> http://www.yudit.org/security/
It would be honest to describe the facts, as they are in reality,
and not overstate, or even falsify, them in order to drive a point
home. E. g.:
> Unicode Bidirectional Algorithm is non-reversable.
Rather:
Bidi text may be ambiguous, if you cannot determine where to start
reading. E. g.
the arabs = SBARA EHT
(where uppercase represents the arabic equivalent, written right to left)
can be read from either side. Nested levels of RTL, and LTR, clauses may
render the interpretation of bidi text even more problematic.
The ambiguity is normally resolved in one of two ways:
- The starting direction is determined from the context, e. g. you
would start reading the preceding example from the left, as it is
embedded in an English (i. e. LTR) paragraph; you would start reading
this very same line from the other side if it were embedded in an
Arabic (i. e. RTL) paragraph.
- Embedded levels are usually delimited by quotes, or other
contextual hints.
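
How the starting direction settles the reading can be sketched in a few
lines of Python. This is only an illustration under simplifying assumptions:
uppercase stands for Arabic as above, a line consists of plain runs of
uppercase, lowercase and other characters without nesting, and the helper
name read_line is made up for this example.

```python
import re


def read_line(visual, start):
    """Read a displayed line in reading order, given where the reader starts.

    start = "ltr": begin at the left edge (English context).
    start = "rtl": begin at the right edge (Arabic context).
    """
    # Split the line into runs of uppercase (RTL), lowercase (LTR), other.
    runs = re.findall(r"[A-Z]+|[a-z]+|[^A-Za-z]+", visual)
    if start == "rtl":
        runs = runs[::-1]
    # Within a run, RTL letters are read from right to left.
    return "".join(run[::-1] if run.isupper() else run for run in runs)


print(read_line("the SBARA", "ltr"))  # the ARABS
print(read_line("the SBARA", "rtl"))  # ARABS the
```

The displayed line is identical in both calls; only the context (the
starting direction) decides which of the two readings results.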
> That means that if text converted back from display order we can not get
> back the same text.
Rather:
... we will not get back the same text, in every single case.
> Imagine somone signing a digital unicode document. He is looking at
> his viewer but what he signs is the bitstream.
He is probably signing a document that he has entered himself.
Where could the ambiguity come from, if he has not deliberately
introduced it himself?
> At yudit.org we advise you: please never sign digitally a Unicode
> document -
> or sign it knowing your own risk.
Rather:
Make sure you know what you sign, in particular regarding bidi documents.
If you want to sign the clauses you entered, in logical order, then
sign your e-mail (or other Unicode text); if you want to sign the
rendering, then apply your signature to an image, or pdf, file. In
both cases, try to express your points (particularly the nesting of
clauses written in opposite directions) as unambiguously as possible.
Btw.: Decent software should make clear and obvious to the user
what he is really signing.
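
A small Python sketch of what "signing the logical order" amounts to. It
assumes that the signature is computed over the UTF-8 bytes of the backing
store, and it uses a plain SHA-256 digest as a stand-in for a real signature
scheme; the function name digest is made up for this example, and uppercase
again stands for Arabic.

```python
import hashlib


def digest(logical_text):
    """Hash the backing store (logical order), not its rendering."""
    return hashlib.sha256(logical_text.encode("utf-8")).hexdigest()


# Two logical sequences that can look alike on screen in a bidi context
# are still different byte streams, so their digests differ.
a = "the ARABS"
b = "ARABS the"
print(digest(a) == digest(b))  # False
```

So the signature always pins down exactly one character sequence; what
remains is for the software to show the signer which sequence that is.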
Best wishes,
Otto Stolz
This archive was generated by hypermail 2.1.2 : Tue Feb 05 2002 - 11:32:22 EST