From: Peter R. Mueller-Roemer (pmr@cs.uni-frankfurt.de)
Date: Fri Aug 05 2005 - 08:49:11 CDT
Over a year ago there was only mild interest in my problems with
editing multilingual LINES (bidi, not one-directional paragraphs). Now
I'm overwhelmed and cannot find the time even to read all your reactions.
Gregg Reynolds was the first to understand and support me. Then the
discussion branched out to
1. Digits. Enclosed by RTL text, it is very simple to enter
ASCII numbers in the usual LTR manner, and I would find it confusing to
have to enter R-digits in 'reverse' order. I guess there might be some
problem with the automatic wrapping?
But RTL text embedded in LTR text presently wraps correctly at the
right. Good!
I had no problem cutting and pasting such text. In such contexts the
mathematical meaning of a sequence of digits is usually clear
(I have seen only LTR meaning; occasionally I have seen .1, .2
... as verse numbers at the right of the Hebrew text, and pasting such
text caused some problems in the resulting layout.)
2. Parentheses within RTL text that surround RTL text are most easily
typed as a pair (so you won't forget to close it), with the RTL text
entered afterwards.
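Both points can be checked against the bidirectional properties that Unicode assigns to these characters. The following is only a minimal Python sketch using the standard unicodedata module; the helper name bidi_profile is made up for illustration and is not part of any editor or keyboard driver.

```python
import unicodedata

def bidi_profile(text):
    """Return (character, bidi class, mirrored flag) for each character."""
    return [(ch, unicodedata.bidirectional(ch), bool(unicodedata.mirrored(ch)))
            for ch in text]

# ASCII digits are European Numbers (EN), Hebrew letters are strong RTL (R),
# and parentheses are neutral (ON) with the mirrored property set.
sample = "1" + "\u05D0" + "(" + ")"
for ch, bidi_class, mirrored in bidi_profile(sample):
    print(f"U+{ord(ch):04X} {bidi_class} mirrored={mirrored}")
```

The digit class EN is why a run of ASCII digits keeps its left-to-right order inside RTL text, and the mirrored flag is why a pair of parentheses typed first can display correctly around RTL text entered later.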
Now Richard T. Gillam has convinced me that my suggestion of an R-SPace
is not the simplest solution, and he joins me in suggesting a rule or
recommendation from Unicode to the makers of text-processing software,
GUIs and keyboard layouts on how to deal with typed input and its
graphical representation. I would like to add the storage of
the resulting text (automatic elimination of pairs of opposite direction
markers).
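What such an automatic elimination might look like is sketched below in Python. It rests on an assumption about what counts as a removable pair: an embedding or override opener (LRE, RLE, LRO, RLO) followed immediately by PDF, which encloses no text. The function name strip_empty_embeddings is hypothetical, and a real editor would probably need further rules.

```python
import re

LRE, RLE, PDF, LRO, RLO = "\u202A", "\u202B", "\u202C", "\u202D", "\u202E"

# An opener followed immediately by PDF encloses nothing.
_EMPTY_PAIR = re.compile("[\u202A\u202B\u202D\u202E]\u202C")

def strip_empty_embeddings(text):
    """Delete opener+PDF pairs repeatedly until none remain."""
    previous = None
    while previous != text:
        previous = text
        text = _EMPTY_PAIR.sub("", text)
    return text

# A pair left behind after its content was deleted is removed;
# an embedding that still encloses text is left untouched.
cleaned = strip_empty_embeddings("a" + RLE + PDF + "b")
```

The loop is there because removing an inner empty pair can leave an outer pair empty as well.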
So let us concentrate on how best to support the typing of mixed RTL and
LTR text and its graphical and storage representation. It is indeed not
a character-encoding problem, but one of coding and decoding keyboard
output and the resulting input to the editor software, so that the
cursor/next-character position does not jump back and forth when
a Space is entered into RTL text within an LTR paragraph.
I have designed my own trilingual keyboard with
MS-KeyboardLayoutCreator, but I am frustrated at not seeing a way to
generate the 3-character R-Space (ending with the PDF character) and at
not being able to re-code the Arrow, BS and Del keys. I also miss the
opportunity to add a documenting comment per character.
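For reference, the 3-character sequence can be written out as follows. This is a sketch under one assumption: the opener is RLE (U+202B), which the text above does not name; RLO (U+202E) would be the other candidate. The names r_wrap and R_SPACE are illustrative only.

```python
import unicodedata

RLE = "\u202B"  # RIGHT-TO-LEFT EMBEDDING (assumed opener)
PDF = "\u202C"  # POP DIRECTIONAL FORMATTING

def r_wrap(ch):
    """Enclose a single character in an RTL embedding."""
    return RLE + ch + PDF

R_SPACE = r_wrap(" ")

# Bidi classes of the three code points, in order.
print([unicodedata.bidirectional(c) for c in R_SPACE])
```

The same wrapper would serve for TAB, brackets or punctuation that should take the other directionality.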
See more inserted below.
Peter R. Mueller-Roemer
Richard T. Gillam wrote:
>>You missed a strong point of the 'jumping cursor' problem:
>>SAME LINE typing of text of different directionality should be supported
>>even better. The typing of such texts is pretty well supported in
>>Unicode, so that most Editors and Textprograms can do it even without
>>providing R2L-PARAGRAPHS. Switching in the middle of the line the
>>directionality of the paragraph has very undesirable effects.
>>I should be able to just switch the keyboard-layout and not enter extra
>>directionality characters and later PDF.
>
>Sure, but this isn't a text-encoding issue. It's a keyboard-layout and
>editing-UI issue. It might be worth it for Unicode to publish a
>Technical Note or something recommending best practices for dealing with
>bidirectional text, but that would be the extent of Unicode's
>involvement. Nothing you complain about above presents a strong case
>that things are amiss at the _encoding_ level. Furthermore, even if
>they did, they don't present a case that the cure you're recommending
>would be better than the disease.
>
>
How can we best promote such a Technical Note? I am tired of devising
special workarounds for different editors, ...
>To take another example from Hebrew, I think pretty much everybody
>agrees that the fixed-position combining class assignments for the
>points were a bad idea, and that they make properly handling Biblical
>Hebrew a big pain in the butt, but it was also widely judged that the
>obvious cure (changing the combining classes) would be worse than the
>disease. They did fix the problem, but it wasn't nearly as simple a
>fix.
>
>
Do we now have a solution for the need to combine vowel points AND
cantillation marks under the same base character?
Why do these combining marks refuse to combine under the precomposed
consonants with dagesh?
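The encoding side of these questions can at least be inspected with Python's standard unicodedata module. The sketch below is not an answer to the rendering problem, which depends on font and renderer. It shows the fixed combining classes of the points, the reordering they cause under normalization, and two behaviours stated here from the Unicode data as I understand it: a COMBINING GRAPHEME JOINER (U+034F, class 0) between two marks blocks reordering, and the precomposed BET WITH DAGESH (U+FB31) is decomposed by normalization.

```python
import unicodedata

BET, DAGESH, PATAH, HIRIQ = "\u05D1", "\u05BC", "\u05B7", "\u05B4"
ETNAHTA = "\u0591"  # cantillation mark placed below the letter
CGJ = "\u034F"      # COMBINING GRAPHEME JOINER, combining class 0

def nfc(text):
    """Normalize to NFC, which also sorts marks by combining class."""
    return unicodedata.normalize("NFC", text)

# Every point has its own fixed class, so typed order is not preserved.
classes = {name: unicodedata.combining(ch) for name, ch in
           [("hiriq", HIRIQ), ("patah", PATAH), ("dagesh", DAGESH),
            ("etnahta", ETNAHTA)]}

reordered = nfc(BET + DAGESH + PATAH)        # dagesh moves behind patah
blocked = nfc(BET + PATAH + CGJ + HIRIQ)     # CGJ keeps patah before hiriq
vowel_and_accent = nfc(BET + PATAH + ETNAHTA)  # both marks stay on BET
decomposed = nfc("\uFB31")                   # precomposed form is split up
```

A vowel point and a cantillation mark on one base letter are thus storable as a plain sequence; whether they are drawn acceptably under the letter is outside what this sketch can show.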
Also, in Greek I don't like it when I can't combine a spiritus lenis
with an acute accent side by side (they are rendered in outdated
overtype mode). This seems an opportunity for Unicode to suggest some
reasonable rules for the graphical representation and micro-editing of
combining sequences with diacritics side by side or on top (e.g. as
done by Doulos SIL).
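For the Greek case there is one workaround at the encoding level, sketched here in Python: normalizing the combining sequence to NFC yields a single precomposed character, which fonts normally draw with the two marks side by side. The helper name compose is illustrative, and this only helps for combinations that have a precomposed form.

```python
import unicodedata

ALPHA = "\u03B1"
PSILI = "\u0313"  # COMBINING COMMA ABOVE, used as spiritus lenis
ACUTE = "\u0301"  # COMBINING ACUTE ACCENT

def compose(text):
    """Return the NFC form, composing marks where Unicode allows it."""
    return unicodedata.normalize("NFC", text)

combined = compose(ALPHA + PSILI + ACUTE)
print(f"U+{ord(combined):04X}", unicodedata.name(combined))
```

Sequences without a precomposed equivalent still depend on the renderer stacking or placing the marks properly.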
>The big difference between that case and this one is that there with the
>vowel points, there was a problem at the encoding level-- there were
>things that occurred in real text that simply couldn't be represented.
>This problem was fixed, but in a suboptimal way. Here, we're not
>talking about things that can't be represented at all; we're talking
>about editing UI.
>
>
And I plead for some design and guidance so that the UIs don't continue
to diverge.
>
>
>>I have thought of how to change editors to behave as I need it for
>>MULTILINGUAL LINES and came to the conclusion that the easiest, would be
>>to change some keyboard layouts to provide a SPace, TAB and selected
>>brackets, diacritics, punctuation with the other directionality. Extra
>>code-points for these would be much preferable to keystrokes sending the
>>old characters enclosed in a pair of directionality characters.
>
>Not necessarily. There are lots of reasons why the sequence of three
>characters is preferable. From the end-user perspective, all that
>matters is that the software "work right." Whether the keyboard driver
>is generating one character code or three should be completely
>immaterial to the end user. (And the fact that the three-character
>solution is available is precisely why this isn't an encoding problem.)
>
>I'm not arguing that the issue you describe isn't a problem; I'm simply
>arguing that it's a problem with your software, not with Unicode.
>
>--Rich Gillam
> LAS
>
>
>