Error in context specification for Casing

From: Aleksander Morgado (aleksander@es.gnu.org)
Date: Sat Apr 12 2008 - 16:46:37 CDT

    Hi all,

    Some months ago I filed an error report on the Unicode website related
    to the 'More_Above' context specification for casing. The error was in
    Unicode 5.0, and the report read as follows:
    "I believe there is an error in the description of the 'More_Above'
    context specification for casing (Table 3-14, page 124). According to
    the regular expression provided, the wording of the description should
    say "C is followed by a character of combining class 230 (Above) with no
    intervening character of combining class 0". The last part of the
    sentence provided in the standard ("or 230 (Above)") should be removed."

    I guess people at Unicode analyzed the error report, and this is now
    what appears in the errata list (http://www.unicode.org/errata/):
    "On p. 124 of The Unicode Standard, Version 5.0, there is an error in
    the Regular Expressions column for "More_Above", in the third row of
    Table 3-14, Context Specification for Casing. The corrected regular
    expression should be: [^\p{ccc=230}\p{ccc=0}]* [\p{ccc=230}] "

    IMHO the problem remains in the wording of the 'Description' column in
    Table 3-14, and not in the regular expression (the old one was:
    [^\p{ccc=0}]* [\p{ccc=230}] ). The idea is that since the character C
    only needs to be followed by some character of combining class 230, we
    shouldn't have to check for intervening characters of combining class
    230.
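
    To make the comparison concrete, here is a rough sketch in Python that
    runs both expressions over a few sample contexts. Python's re module
    has no \p{ccc=...} syntax, so as an assumption of this sketch each
    character following C is first mapped to a one-letter token using
    unicodedata.combining(): "Z" for ccc=0, "A" for ccc=230, "o" for any
    other class. The function names and the token alphabet are mine and do
    not come from the standard.

```python
import re
import unicodedata


def classes(following: str) -> str:
    # Map each character to a token for its canonical combining class:
    # "Z" = ccc 0, "A" = ccc 230 (Above), "o" = any other class.
    out = []
    for ch in following:
        ccc = unicodedata.combining(ch)
        out.append("Z" if ccc == 0 else "A" if ccc == 230 else "o")
    return "".join(out)


# Old expression from Table 3-14:  [^\p{ccc=0}]* [\p{ccc=230}]
OLD = re.compile(r"[^Z]*A")
# Expression from the erratum:     [^\p{ccc=230}\p{ccc=0}]* [\p{ccc=230}]
NEW = re.compile(r"[^AZ]*A")


def more_above_old(following: str) -> bool:
    # True if the text right after C satisfies the old expression.
    return OLD.match(classes(following)) is not None


def more_above_new(following: str) -> bool:
    # True if the text right after C satisfies the erratum's expression.
    return NEW.match(classes(following)) is not None


# Sample contexts (the text that follows C), chosen for illustration only.
SAMPLES = [
    "\u0307",        # dot above (230)
    "\u0323\u0307",  # dot below (220), then dot above (230)
    "\u0307\u0301",  # two marks of class 230
    "a\u0307",       # base letter (0) before the mark
    "\u0323",        # only a class-220 mark
    "",              # nothing follows C
]

for s in SAMPLES:
    print([f"U+{ord(c):04X}" for c in s], more_above_old(s), more_above_new(s))
```

    On these samples the two expressions agree on whether the context
    matches; they differ only in which class-230 character the final term
    ends up consuming. That is consistent with my reading that the
    expression itself was not the part that needed correcting, although a
    handful of samples is of course not a proof.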

    What do you think?

    -Aleksander Morgado


