Behaviour of ZWJ that PR-37 has not considered

From: Shriramana Sharma (samjnaa@gmail.com)
Date: Tue Aug 25 2009 - 10:44:08 CDT


    This is with regard to the behaviour of ZWJ as outlined in the paper
    http://unicode.org/review/pr-37.pdf. I have summarised it in the
    attached text file as I understood it.

    Please see the question marks in the table. PR-37 does not say how
    rendering should behave when ZWJ + Virama is used with C1-conjoining
    consonants, or when Virama + ZWJ is used with C2-conjoining consonants.
    Does anyone have any suggestions for this?
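
    For readers who want to see the two orderings as code points, here is
    a minimal sketch. The choice of Devanagari KA and SSA is only for
    illustration, and the helper names (zwj_virama, virama_zwj, describe)
    are hypothetical; nothing here comes from PR-37 itself.

```python
import unicodedata

ZWJ = "\u200D"     # ZERO WIDTH JOINER
VIRAMA = "\u094D"  # DEVANAGARI SIGN VIRAMA (script chosen only as an example)
KA = "\u0915"      # DEVANAGARI LETTER KA
SSA = "\u0937"     # DEVANAGARI LETTER SSA


def zwj_virama(first, second):
    """Build the sequence: first consonant + ZWJ + Virama + second consonant."""
    return first + ZWJ + VIRAMA + second


def virama_zwj(first, second):
    """Build the sequence: first consonant + Virama + ZWJ + second consonant."""
    return first + VIRAMA + ZWJ + second


def describe(seq):
    """Return the Unicode character names of a sequence, in order."""
    return [unicodedata.name(ch) for ch in seq]


if __name__ == "__main__":
    for build in (zwj_virama, virama_zwj):
        print(build.__name__, describe(build(KA, SSA)))
```

    The two strings differ only in the order of the two middle code
    points, which is exactly the distinction under discussion.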

    As I see it, there are two options, the relaxed one and the rigorous one.

    The relaxed option:

    A lay user will understand the combination of ZWJ and Virama, in
    whatever order, as something that prevents ligature formation (level
    1). Therefore, the ZWJ in these sequences (marked with ? in my table)
    should not be ignored: the ligature must be prevented by restricting
    rendering to level 2 or less.

    When a C1-conjoining consonant joins with a C2-conjoining consonant, the
    prescribed method is unchanged. ZWJ + Virama will cause C2-conjoining
    behaviour and Virama + ZWJ will cause C1-conjoining behaviour. It is
    only suggested that ZWJ + Virama cause C1-conjoining behaviour (or less)
    where there is no C2-conjoining consonant and that Virama + ZWJ cause
    C2-conjoining behaviour (or less) where there is no C1-conjoining consonant.
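
    The relaxed rule can be sketched as a small decision function. This
    is only my reading of the paragraph above, not anything PR-37
    specifies: the function name, the order labels and the two boolean
    flags are hypothetical, and treating "or less" as a fall-back to an
    explicit virama is an assumption.

```python
def relaxed_form(order, first_has_c1_form, second_has_c2_form):
    """Pick the form to render under the relaxed option.

    order              -- "ZWJ+VIRAMA" or "VIRAMA+ZWJ" (hypothetical labels)
    first_has_c1_form  -- the first consonant has a C1-conjoining form
    second_has_c2_form -- the second consonant has a C2-conjoining form
    """
    if order == "ZWJ+VIRAMA":
        preferred, fallback = "C2", "C1"
    elif order == "VIRAMA+ZWJ":
        preferred, fallback = "C1", "C2"
    else:
        raise ValueError("unknown order: %r" % (order,))

    available = {"C1": first_has_c1_form, "C2": second_has_c2_form}
    # Try the form the sequence normally asks for, then the other one.
    # The ligature (level 1) is never a candidate.
    for form in (preferred, fallback):
        if available[form]:
            return form + "-conjoining"
    # Assumption: "or less" ends at the explicit virama.
    return "explicit virama"


if __name__ == "__main__":
    print(relaxed_form("ZWJ+VIRAMA", True, True))
    print(relaxed_form("ZWJ+VIRAMA", True, False))
    print(relaxed_form("VIRAMA+ZWJ", False, True))
```

    Whatever the flags are, the result is never the ligature, which is
    the point of the relaxed option.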

    The advantage of this scheme is that the user need not go into details
    like C1-conjoining or C2-conjoining behaviour. All the user wants is to
    use a conjoining form instead of a ligature, and allowing either
    sequence for that purpose is more liberal. Cases of ambiguous conjoining
    behaviour are rare enough that most users need not learn the
    distinction between the two sequences, which matters only in that
    limited context.

    The rigorous stand:

    ZWJ + Virama should not be detached from its C2-conjoining enforcing
    behaviour. Likewise, Virama + ZWJ should not be detached from its
    C1-conjoining enforcing behaviour. Detaching them will only create
    confusion as to the proper behaviour of these two sequences, and leave
    the user ignorant of how to get the intended form where the conjoining
    behaviour is ambiguous. The best example of such ambiguity - RA +
    Virama + YA - is not so uncommon.

    Therefore in the sequences:

    Cx + Virama + ZWJ + C2
    C1 + ZWJ + Virama + Cx

    the ZWJ should be ignored and the highest level possible (even level 1)
    should be allowed.
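
    The rigorous rule can be sketched the same way. Again this is only an
    illustration of the two cases above: the function name, the order
    labels and the parameters are hypothetical, and unmarked_form stands
    for whatever the renderer would produce for the plain sequence
    without ZWJ (assumed to be supplied by the caller).

```python
def rigorous_form(order, first_has_c1_form, second_has_c2_form, unmarked_form):
    """Pick the form to render under the rigorous option.

    order              -- "ZWJ+VIRAMA" or "VIRAMA+ZWJ" (hypothetical labels)
    first_has_c1_form  -- the first consonant has a C1-conjoining form
    second_has_c2_form -- the second consonant has a C2-conjoining form
    unmarked_form      -- result for the same pair without ZWJ, for
                          example "ligature" (an assumption: supplied by
                          the caller, not computed here)
    """
    if order == "VIRAMA+ZWJ":
        # Virama + ZWJ stays tied to C1-conjoining behaviour.
        if first_has_c1_form:
            return "C1-conjoining"
    elif order == "ZWJ+VIRAMA":
        # ZWJ + Virama stays tied to C2-conjoining behaviour.
        if second_has_c2_form:
            return "C2-conjoining"
    else:
        raise ValueError("unknown order: %r" % (order,))
    # The requested form does not exist, so the ZWJ is ignored and the
    # highest level possible (even level 1) is allowed.
    return unmarked_form


if __name__ == "__main__":
    # Cx + Virama + ZWJ + C2: no C1-conjoining form, so ZWJ is ignored.
    print(rigorous_form("VIRAMA+ZWJ", False, True, "ligature"))
    # C1 + ZWJ + Virama + Cx: no C2-conjoining form, so ZWJ is ignored.
    print(rigorous_form("ZWJ+VIRAMA", True, False, "ligature"))
```

    Unlike the relaxed option, a sequence here never borrows the other
    sequence's form; it either gets its own form or behaves as if the
    ZWJ were absent.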

    Comments?

    -- 
    Shriramana Sharma
    
    



