RE: numeric properties of Nl characters in the UCD

From: Arcane Jill (arcanejill@ramonsky.com)
Date: Thu Nov 27 2003 - 05:52:29 EST


    Gotcha. It's all starting to make sense now. Including the opposition to
    hex.

    Maybe one could make "circled 92" in two stages: (1) create a glyph
    representing 92, then (2) apply an enclosing circle modifier to it.
    Except of course, that wouldn't work! Because a modifier only affects a
    single base character. Basically, you'd need to do:

        encircle( "9" + "2")

    instead of

        "9" + encircle("2")

    But there doesn't seem to be any way of specifying operator precedence
    in Unicode text (by which I mean the precedence of ZWJ compared with the
    precedence of any modifier). I can see a case for "invisible brackets"
    here to control such precedence.
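
    The scoping problem can be made concrete with U+20DD COMBINING ENCLOSING
    CIRCLE. The sketch below assumes Python's standard unicodedata module;
    the encircle() helper is hypothetical and only mirrors the notation used
    above.

```python
import unicodedata

ENCLOSING_CIRCLE = "\u20DD"  # COMBINING ENCLOSING CIRCLE


def encircle(base: str) -> str:
    """Append the enclosing mark; in plain text it follows the last base character."""
    return base + ENCLOSING_CIRCLE


# Both spellings yield the same code point sequence <9, 2, U+20DD>,
# so plain text has no way to express "circle both digits".
whole = encircle("9" + "2")
last_only = "9" + encircle("2")
same_sequence = whole == last_only

mark_name = unicodedata.name(ENCLOSING_CIRCLE)
mark_category = unicodedata.category(ENCLOSING_CIRCLE)  # "Me" = enclosing mark
```

    Since string concatenation is all there is, the grouping in
    encircle("9" + "2") is lost as soon as the text is stored.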

    The review on Ethiopic and Tamil non-decimal digits is interesting, but
    I can't help but feel it was a culturally biased decision (read:
    mistake) to EVER have had a "radix ten" property without any similar
    property for any other radix, thereby forcing non-decimal digits to end
    up being classified as No (Other_Number) instead of Nd (Decimal_Number).
    It's a mistake because, even in /my/ culture, digit one followed by
    digit two is not always interpreted as the number twelve. Phone numbers
    and PINs are one exception. Version numbers such as "version 12.12.12"
    are another exception. Octal is another.
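
    For anyone who wants to see how these characters are actually classified,
    here is a minimal sketch, assuming Python's standard unicodedata module
    (the results depend on the UCD version bundled with the interpreter).
    The describe() helper is hypothetical.

```python
import unicodedata


def describe(ch: str):
    """Return (general category, decimal, digit, numeric); None where undefined."""
    return (
        unicodedata.category(ch),
        unicodedata.decimal(ch, None),
        unicodedata.digit(ch, None),
        unicodedata.numeric(ch, None),
    )


ascii_two = describe("2")          # DIGIT TWO: Nd, has a decimal value
ethiopic_one = describe("\u1369")  # ETHIOPIC DIGIT ONE: No, no decimal value
circled_two = describe("\u2461")   # CIRCLED DIGIT TWO: No, no decimal value
roman_twelve = describe("\u216B")  # ROMAN NUMERAL TWELVE: Nl, numeric value only
```

    Only the first of these carries a decimal value, which is the property
    that implies positional, radix-ten use.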

    One implication is that hexadecimal numbers cannot be expressed in
    Unicode without violating this property. For instance, is the string
    "U+0012" valid Unicode, given that "the sequence of the ONE character
    followed by the TWO character is [NOT] interpreted as having the value
    of twelve"?
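
    A small illustration of the point, using nothing beyond Python's
    built-in int(): the same two digit characters denote different values
    depending on a radix that the characters themselves do not carry.

```python
digits = "12"

as_decimal = int(digits, 10)  # twelve
as_octal = int(digits, 8)     # ten
as_hex = int(digits, 16)      # eighteen

# The "0012" in "U+0012" is hexadecimal notation for a code point.
code_point = int("U+0012"[2:], 16)

# Used as an identifier (PIN, phone number), the string has no value at all:
# converting it to a number and back loses the leading zeros.
pin = "0012"
round_trips = str(int(pin)) == pin
```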

    Perhaps it would have made sense to simply have different properties all
    round, such as: "number positional" for digits in any radix; "number
    integer" for integer types such as circled 2 which can't be used
    positionally; "number fraction" for fractions, and "number other" for
    everything else. Or maybe some other similar scheme. Is it too late to
    change things now?
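
    To make the proposal concrete, here is a rough sketch of how the four
    classes might be approximated from existing UCD data, assuming Python's
    standard unicodedata module. NumberKind and classify() are hypothetical
    names, and the mapping is only an approximation: the UCD records
    positional use for radix ten alone, so "number positional" here can only
    mean "has a decimal value".

```python
import unicodedata
from enum import Enum


class NumberKind(Enum):
    # Hypothetical class names, taken from the proposal above
    POSITIONAL = "number positional"
    INTEGER = "number integer"
    FRACTION = "number fraction"
    OTHER = "number other"


def classify(ch: str):
    """Approximate the proposed classes; returns None for non-number characters."""
    if not unicodedata.category(ch).startswith("N"):
        return None
    if unicodedata.decimal(ch, None) is not None:
        return NumberKind.POSITIONAL  # only radix ten is recorded in the UCD
    value = unicodedata.numeric(ch, None)
    if value is None:
        return NumberKind.OTHER
    return NumberKind.INTEGER if value == int(value) else NumberKind.FRACTION


kinds = {
    "7": classify("7"),            # DIGIT SEVEN
    "circled 2": classify("\u2461"),  # CIRCLED DIGIT TWO
    "one half": classify("\u00BD"),   # VULGAR FRACTION ONE HALF
    "letter A": classify("A"),
}
```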

    Jill

    -----Original Message-----
    From: Philippe Verdy [mailto:verdy_p@wanadoo.fr]
    Sent: Thursday, November 27, 2003 10:24 AM
    To: Arcane Jill
    Cc: Unicode@Unicode.Org
    Subject: RE: numeric properties of Nl characters in the UCD

        26 Update properties for Ethiopic and Tamil non-decimal digits
        2003.01.27 Decimal numbers are those used in decimal-radix
        number systems. In particular, the sequence of the ONE character
        followed by the TWO character is interpreted as having the value
        of twelve. We have gotten feedback that this is not the case
        for Ethiopic or Tamil. Details are on the public issues page.

    This is related to the definition of decimal digits...
    So a decimal digit property should mean that the character is usable in a
    positional system to compose numbers in radix 10.

    Circled digits don't have a "decimal digit" property because they are not
    composable to create numbers in a positional system (either left-to-right
    or right-to-left). But maybe they could if one used ligatures like
    <CIRCLED DIGIT NINE, ZWJ, CIRCLED DIGIT TWO> to create an abstract
    character <CIRCLED NUMBER NINETY TWO>. How it would be rendered is another
    problem, but the need for a format control character under such a private
    encoding convention would still prevent the character from being assigned
    a general-purpose decimal property.
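
    A minimal sketch of that sequence, assuming Python's standard
    unicodedata module: properties are defined per code point, so nothing in
    the data gives the joined sequence a value of ninety-two.

```python
import unicodedata

ZWJ = "\u200D"  # ZERO WIDTH JOINER
sequence = "\u2468" + ZWJ + "\u2461"  # CIRCLED DIGIT NINE, ZWJ, CIRCLED DIGIT TWO

names = [unicodedata.name(ch) for ch in sequence]
categories = [unicodedata.category(ch) for ch in sequence]

# Each code point keeps its own numeric value; the ZWJ has none.
values = [unicodedata.numeric(ch, None) for ch in sequence]
```

    Whether a font renders the three code points as a single circled "92"
    is a separate matter from these per-character properties.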


