Re: Some control characters test cases

From: Asmus Freytag (asmusf@ix.netcom.com)
Date: Sat Sep 08 2007 - 14:44:54 CDT

    On 9/8/2007 6:49 AM, Itai Bar-Haim wrote:
    > Hi everyone.
    > I'm new to this mailing list, and to unicode.
    > I develop a Bidi/Unicode library for the .Net environment called NBidi
    > (http://nbidi.sf.net).
    > While running test cases, I found problems regarding control
    > characters. I'll only ask about one scenario in this post.
    > The problematic test case is as follows:
    > Given text: <RLO>abc<PDF>
    > What should I expect as a result? My expectation would be (visual,
    > control characters removed) 'cba'.
    > If I leave the control characters in place I would expect: <PDF>cba<RLO>
    > The actual result I get is <PDF>abc<RLO>. This is because of:
    > Rule X4 sets the embedding levels to: 11111
    > Then rule I2 sets the embedding levels to: 12221
    > When performing reordering we get: <RLO>cba<PDF> ==> <PDF>abc<RLO>
    >
    > Am I missing something here?
    Yes, you missed the fact that X4 not only sets the levels, but also the
    "directional override status". In X6, this causes the characters to have
    directionality R, which means that they no longer have class L and rule
    I2 does nothing.
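
    To make the sequence concrete, here is a rough sketch in Python of just
    the rules involved. It is not the reference code and not NBidi; the
    function names and the honor_override flag are made up for illustration.
    It assumes paragraph level 0, handles only RLO and PDF, skips the W and N
    rules entirely, and drops the control characters from the output (X9).

```python
import unicodedata

RLO, PDF = "\u202E", "\u202C"

def resolve_explicit(text, honor_override=True, base_level=0):
    """Toy X4/X6/X7: return [char, bidi_type, level] for each non-control char."""
    stack = []                              # saved (level, override) pairs
    level, override = base_level, None
    out = []
    for ch in text:
        if ch == RLO:                       # X4: next odd level, override = R
            stack.append((level, override))
            level = level + 1 if level % 2 == 0 else level + 2
            override = "R"
        elif ch == PDF:                     # X7: restore the previous state
            if stack:
                level, override = stack.pop()
        else:                               # X6: level, plus type reset if overridden
            bidi_type = unicodedata.bidirectional(ch)
            if honor_override and override:
                bidi_type = override
            out.append([ch, bidi_type, level])
    return out

def apply_implicit(items):
    """I1/I2: raise levels based on the (possibly overridden) types."""
    for item in items:
        _, bidi_type, level = item
        if level % 2 == 0:                  # I1: even embedding level
            if bidi_type == "R":
                item[2] = level + 1
            elif bidi_type in ("AN", "EN"):
                item[2] = level + 2
        elif bidi_type in ("L", "EN", "AN"):  # I2: odd embedding level
            item[2] = level + 1
    return items

def reorder(items):
    """L2: from the highest level down to the lowest odd level, reverse runs."""
    items = list(items)
    odd_levels = [lvl for _, _, lvl in items if lvl % 2]
    if not odd_levels:
        return "".join(ch for ch, _, _ in items)
    highest = max(lvl for _, _, lvl in items)
    for lvl in range(highest, min(odd_levels) - 1, -1):
        i = 0
        while i < len(items):
            if items[i][2] >= lvl:
                j = i
                while j < len(items) and items[j][2] >= lvl:
                    j += 1
                items[i:j] = items[i:j][::-1]   # reverse one contiguous run
                i = j
            else:
                i += 1
    return "".join(ch for ch, _, _ in items)

def visual_order(text, honor_override=True):
    return reorder(apply_implicit(resolve_explicit(text, honor_override)))

print(visual_order(RLO + "abc" + PDF))                        # cba
print(visual_order(RLO + "abc" + PDF, honor_override=False))  # abc
```

    With the override applied in X6, the letters are type R at level 1, I2
    leaves them alone, and the single reversal gives 'cba'. With the override
    ignored, they stay type L, I2 raises them to level 2, and the two
    reversals cancel, which is the 'abc' result described in the question.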

    A./

    PS: the definition of the 'directional override status' in the UAX is a
    little less clear than it could be. It currently reads:

    "BD7. The /directional override status/ determines whether the
    bidirectional type of characters is to be reset with explicit
    directional controls...".

    which gives the impression that the override status modifies the
    action of the controls. What was probably intended is something more like:

    "BD7. The /directional override status/ determines whether the
    bidirectional type of characters is to be reset. The override status is
    set by using explicit directional controls. ...."

    In other words, the action of the explicit controls (in X4 and X5)
    determines the override status, and the override status then gets
    applied in X6 to produce a final directional type for the characters,
    which is then used in the following rules (including I2 in this example)
    to further modify the assignment of levels.

    A./

    PS: the unibook tool at http://www.unicode.org/unibook/ runs on Windows
    and in its "tool" menu there is a bidi demo. The code driving that demo
    is based on the published bidi reference code. It gives you a quick and
    easy way to check whether your interpretation of a rule is accurate.
    >
    > Thank you in advance,
    > Itai.


