From: Doug Ewell (dewell@adelphia.net)
Date: Wed Feb 12 2003 - 02:16:55 EST
Chris Jacobs <c dot t dot m dot jacobs at hccnet dot nl> wrote:
> I did not mean to ask to display the ASCII sequence "\u05e0" as
> "05e0u\"
> I meant something else.
I THINK what you meant is that the ASCII sequence "\u05e0", surrounded
on both sides by real RTL characters, should appear as one continuous
RTL string, instead of breaking the real RTL characters into two
separate strings. The cursor would move LTR through the ASCII
characters \ u 0 5 e 0 but RTL overall through the string.
This isn't how the bidirectional algorithm works with ASCII characters,
though. Each Unicode character has a directionality property. Some are
strong LTR or RTL, some are weak (digits, for example), and some are
neutral -- their directionality is completely determined by the
characters around them. (Sort of like the politics of some people I
know.)
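
As a rough illustration, the directionality property can be inspected with
Python's standard unicodedata module. This is only a lookup sketch, not part
of the original exchange; the helper name bidi_classes is made up for the
example. It prints the class of each ASCII character in the escape notation
and, for comparison, the class of the Hebrew letter the notation stands for.

```python
import unicodedata

def bidi_classes(text):
    """Return (character, bidi class) pairs for each character in text."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text]

# The six ASCII characters of the escape notation, not the Hebrew letter.
escape = "\\u05e0"
for ch, cls in bidi_classes(escape):
    print(repr(ch), cls)

# The actual character U+05E0 (HEBREW LETTER NUN), for comparison.
print(unicodedata.bidirectional("\u05e0"))
```

In the output, L and R are the strong classes, EN (European number) is a
weak class, and ON (other neutral) is a neutral class. The letters "u" and
"e" come out as L, while the real Hebrew letter comes out as R.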
ASCII letters are strong LTR, which means they will break up an RTL
sequence in the manner you are seeing. This is true even if the ASCII
characters combine to form a commonly understood notation representing
some other Unicode character. The bidirectional algorithm doesn't do
any form of semantic analysis on the text. To do so would constitute a
customized, or I guess the word now is "tailored," version of the
bidirectional algorithm, which might be great for some purposes such as
the one you describe, but which wouldn't be Unicode-conformant. UniPad
is simply being Unicode-conformant in this regard.
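
The break described above can be seen in a deliberately simplified sketch.
It is NOT an implementation of the bidirectional algorithm: it only looks at
strong characters, skips weak and neutral ones, and reports the sequence of
strong directions. The function name strong_direction_runs and the sample
Hebrew letters are assumptions made for the example.

```python
import unicodedata
from itertools import groupby

# Strong bidi classes mapped to a direction label.
STRONG = {"L": "LTR", "R": "RTL", "AL": "RTL"}

def strong_direction_runs(text):
    """Collapse text into the sequence of strong directions it contains.

    Weak and neutral characters are skipped. Illustration only; the real
    algorithm resolves those characters by a longer set of rules.
    """
    classes = map(unicodedata.bidirectional, text)
    strong = [STRONG[c] for c in classes if c in STRONG]
    return [direction for direction, _ in groupby(strong)]

hebrew_only = "\u05d0\u05d1\u05d2\u05d3"
with_escape = "\u05d0\u05d1" + "\\u05e0" + "\u05d2\u05d3"

print(strong_direction_runs(hebrew_only))  # ['RTL']
print(strong_direction_runs(with_escape))  # ['RTL', 'LTR', 'RTL']
```

The ASCII letters in the escape notation introduce an LTR run between the
two groups of Hebrew letters, with no regard for what the notation means.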
-Doug Ewell
Fullerton, California