Re: Jumping Cursor. Was: Right-to-Left Punctuation Problem

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Tue Aug 02 2005 - 15:27:40 CDT

    From: "John Hudson" <tiro@tiro.com>
    > Now, when it comes to things like parentheses, the mirrored stuff does my
    > head in and I really don't see the point of it. I'm guessing that it
    > confuses application developers also, since it is implemented with so
    > little consistency.

    Just remember that what is really encoded for parentheses is their semantics
    (open/start or close/end), not the look of their glyphs. The "mirrored"
    property in fact only affects the glyph orientation; parentheses remain
    weak with regard to their directionality (i.e. the direction of cursor
    movement when the character is appended to the logically encoded string).
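
    As a quick check of these two properties, here is a minimal sketch using
    Python's standard unicodedata module. The helper name describe() is only
    illustrative. Note that the Unicode data files actually list parentheses
    with bidi class ON (Other Neutral), which is what I loosely call "weak"
    here:

```python
import unicodedata

def describe(ch):
    """Return (character name, bidi class, mirrored flag) for one character."""
    return (
        unicodedata.name(ch),
        unicodedata.bidirectional(ch),   # e.g. 'L', 'R', 'ON'
        bool(unicodedata.mirrored(ch)),  # True if the glyph is mirrored in RTL runs
    )

# Parentheses and brackets, one Latin letter, one Hebrew letter (U+05D0 ALEF).
for ch in "()[]a\u05d0":
    print(describe(ch))
```

    The parentheses and brackets report a neutral bidi class with the
    mirrored flag set, while the letters report a strong class ('L' or 'R')
    and no mirroring.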

    This can make things complicated. Suppose that the lowercase characters
    below are strong LTR (Latin/Greek/Cyrillic...) letters, and the
    uppercase characters are strong RTL (Arabic/Hebrew) letters. Then how will
    you interpret the following rendered string, if you only look at the
    rendered document without knowing its encoding:

    Encoded as: "latinlatin (ARABICARABIC (latinlatin CIBARA) latin))".
    Rendered as: "latinlatin () CIBARACIBARAlatinlatin (CIBARA latin))".

    It will be difficult to say, when looking only at the rendered document,
    which parenthesis is opening and which is closing. The mirrored property
    works well only if the direction context is the same before the beginning
    and at the end of the surrounded sequence. The tricky cases occur when
    parentheses are nested and also surround sections of text with different
    directionality.
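
    To illustrate what "mirrored" means at display time, here is a rough
    sketch. It assumes the bidi algorithm has already resolved the direction
    of the run containing the character (the hard part, not shown). The
    MIRROR_GLYPH table and display_glyph() are hypothetical names, and the
    table is a tiny hand-written subset, not the full Unicode mirroring data:

```python
import unicodedata

# Hypothetical, hand-written subset of mirrored glyph pairs (illustration only).
MIRROR_GLYPH = {"(": ")", ")": "(", "[": "]", "]": "["}

def display_glyph(ch, rtl):
    """Pick the glyph to draw for `ch` in a run whose resolved direction is
    right-to-left (rtl=True) or left-to-right (rtl=False).

    The encoded character, and thus its open/close semantics, never changes;
    only the glyph that gets drawn does.
    """
    if rtl and unicodedata.mirrored(ch):
        return MIRROR_GLYPH.get(ch, ch)
    return ch

print(display_glyph("(", rtl=False))  # opening parenthesis in an LTR run
print(display_glyph("(", rtl=True))   # same character, drawn mirrored in an RTL run
```

    The same encoded opening parenthesis is drawn with either orientation
    depending on the resolved direction, which is why the rendered form alone
    does not tell you which one was encoded.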

    Now suppose that the characters had not been mirrored. One would have
    needed to encode the opening parenthesis as ')' in an RTL context, so the
    semantics of the character would have been lost in favor of an invariable
    glyph direction. In any case, you would then have encoded this for the
    same logical text:

    Encoded as: "latinlatin (ARABICARABIC )latinlatin CIBARA( latin))".
    Rendered as: "latinlatin () CIBARACIBARAlatinlatin (CIBARA latin))".

    to finally get the same result... It would just have been more complicated
    to enter ordinary Arabic-only text.

    Now suppose you wanted to use distinct characters with RTL directionality
    (suppose below that the Arabic parentheses are denoted by ']' for
    start/opening and '[' for end/closing). You would have encoded the
    following for the same logical text. Note that because the new punctuation
    would be explicitly RTL, it would not need to be mirrored:

    Encoded as: "latinlatin (ARABICARABIC ]latinlatin CIBARA[ latin))".
    Rendered as: "latinlatin (] CIBARACIBARAlatinlatin [CIBARA latin))".

    Would that really be any more satisfactory for interpretation? No.

    In conclusion, it's hard to determine the effective semantics of mirrorable
    characters outside of the simplest cases where they are used: to surround
    text with consistent directionality between its start and end. Having to
    duplicate characters to avoid mirroring or a swap of directionality does
    not help simplify the problem. It's therefore best not to needlessly
    duplicate the mirrorable characters with weak directionality.

    Users do perceive these RTL parenthesis characters as identical, with the
    same semantics, and there's no need to add to the confusion. Duplicating
    these codes won't help, notably because they would not have effectively
    distinct glyphs. Things are different when the representative glyphs are
    arguably distinguishable, so that it makes no sense to treat them as
    mirrored forms of each other. See for example the encoded differences
    between the () and [] pairs. If a script has its own distinguishable glyphs
    for parenthesis pairs, there's no need to mirror them. Otherwise it's best
    to keep the mirrored property, and thus keep the characters with weak
    directionality.


