From: Arcane Jill (arcanejill@ramonsky.com)
Date: Thu Nov 27 2003 - 05:52:29 EST
Gotcha. It's all starting to make sense now. Including the opposition to
hex.
Maybe one could make "circled 92" in two stages: (1) create a glyph
representing 92, then (2) apply an enclosing circle modifier to it.
Except of course, that wouldn't work! Because a modifier only affects a
single base character. Basically, you'd need to do:
encircle( "9" + "2")
instead of
"9" + encircle("2")
But there doesn't seem to be any way of specifying operator precedence
in Unicode text (by which I mean the precedence of ZWJ compared with the
precedence of any modifier). I can see a case for "invisible brackets"
here to control such precedence.
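
The attachment problem can be seen with U+20DD COMBINING ENCLOSING
CIRCLE. What follows is only a rough Python sketch under a simplifying
assumption: every mark (general category M*) attaches to the nearest
preceding non-mark character. The helper name `clusters` is made up for
illustration, and this is not full grapheme segmentation.

```python
import unicodedata

ENCLOSING_CIRCLE = "\u20DD"  # COMBINING ENCLOSING CIRCLE

def clusters(text):
    """Group each base character with the combining marks that follow it.

    Simplified assumption: any character whose general category starts
    with 'M' attaches to the nearest preceding non-mark character.
    """
    out = []
    for ch in text:
        if out and unicodedata.category(ch).startswith("M"):
            out[-1] += ch   # mark joins the previous cluster
        else:
            out.append(ch)  # new base character starts a new cluster
    return out

# "9" + "2" + circle: the circle ends up grouped with "2" only.
groups = clusters("9" + "2" + ENCLOSING_CIRCLE)
print([[f"U+{ord(c):04X}" for c in g] for g in groups])
```

The "9" stays in a cluster of its own, which is the
"9" + encircle("2") reading rather than encircle("9" + "2").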
The review on Ethiopic and Tamil non-decimal digits is interesting, but
I can't help but feel it was a culturally biased decision (read:
mistake) to EVER have had a "radix ten" property without any similar
property for any other radix, thereby forcing non-decimal digits to end
up being classified as No (Other_Number) instead of Nd (Decimal_Number).
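
For anyone who wants to inspect these properties directly, here is a
small sketch using Python's standard `unicodedata` module. The choice of
sample characters is mine, and the exact output depends on the version
of the Unicode Character Database bundled with the interpreter.

```python
import unicodedata

# Sample characters chosen for illustration only.
samples = ["2", "\u136A", "\u2461"]  # ASCII two, Ethiopic two, circled two

def describe(ch):
    """Return (general category, decimal, digit, numeric) for one character.

    Each numeric lookup falls back to None when the property is absent.
    """
    return (
        unicodedata.category(ch),
        unicodedata.decimal(ch, None),
        unicodedata.digit(ch, None),
        unicodedata.numeric(ch, None),
    )

for ch in samples:
    print(unicodedata.name(ch), describe(ch))
```

A character with a numeric value but no decimal value is one that the
database treats as a number without treating it as a radix-ten
positional digit.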
It's a mistake because, even in /my/ culture, digit one followed by
digit two is not always interpreted as the number twelve. Phone numbers
and PINs are one exception. Version numbers such as "version 12.12.12"
are another exception. Octal is another.
One implication is that hexadecimal numbers cannot be expressed in
Unicode without violating this property. For instance, is the string
"U+0012" valid Unicode, given that "the sequence of the ONE character
followed by the TWO character is [NOT] interpreted as having the value
of twelve"?
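
To make the radix point concrete, a minimal Python sketch: the same two
characters, ONE followed by TWO, yield different values depending on
the radix the reader assumes. The variable names are illustrative only.

```python
text = "12"  # DIGIT ONE followed by DIGIT TWO

as_decimal = int(text, 10)  # read as radix ten
as_octal = int(text, 8)     # read as radix eight
as_hex = int(text, 16)      # read as radix sixteen

# The digits in "U+0012" are hexadecimal, so "12" there is not twelve.
code_point = int("U+0012"[2:], 16)

print(as_decimal, as_octal, as_hex, code_point)
```

Only the first reading gives twelve; the value comes from the convention
applied to the string, not from the characters alone.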
Perhaps it would have made sense to simply have different properties all
round, such as: "number positional" for digits in any radix; "number
integer" for integer types such as circled 2 which can't be used
positionally; "number fraction" for fractions, and "number other" for
everything else. Or maybe some other similar scheme. Is it too late to
change things now?
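
As a thought experiment only, the four-way scheme could be approximated
in Python like this. The function name `classify` and the mapping rules
are my own assumptions, not anything defined by Unicode. In particular,
the standard library only exposes the radix-ten decimal property, so
"number positional" here can only mean "has a decimal value", which is
exactly the limitation being complained about.

```python
import unicodedata

def classify(ch):
    """Map one character onto the hypothetical four-way scheme.

    Returns None for characters that are not numeric at all.
    """
    if unicodedata.decimal(ch, None) is not None:
        return "number positional"  # approximation: radix ten only
    value = unicodedata.numeric(ch, None)
    if value is None:
        # A number category without a numeric value, if any exist.
        if unicodedata.category(ch).startswith("N"):
            return "number other"
        return None
    if value.is_integer():
        return "number integer"
    return "number fraction"

for ch in ["7", "\u2461", "\u00BD", "x"]:
    print(f"U+{ord(ch):04X}", classify(ch))
```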
Jill
-----Original Message-----
From: Philippe Verdy [mailto:verdy_p@wanadoo.fr]
Sent: Thursday, November 27, 2003 10:24 AM
To: Arcane Jill
Cc: Unicode@Unicode.Org
Subject: RE: numeric properties of Nl characters in the UCD
26 Update properties for Ethiopic and Tamil non-decimal digits
2003.01.27 Decimal numbers are those used in decimal-radix
number systems. In particular, the sequence of the ONE character
followed by the TWO character is interpreted as having the value
of twelve. We have gotten feedback that this is not the case
for Ethiopic or Tamil. Details are on the public issues page.
This is related to the definition of decimal digits...
So a decimal digit property should mean that the character is usable in a
positional system to compose numbers with a radix 10.
Circled digits don't have a "decimal digit" property because they are not
composable to create numbers in a positional system (either left-to-right
or right-to-left). But maybe they could if one used ligatures like:
<CIRCLED DIGIT NINE, ZWJ, CIRCLED DIGIT TWO> to create an abstract
character <CIRCLED NUMBER NINETY TWO>. How it will be rendered is another
problem, but the need for a format control character under such a
private encoding convention would still prevent the character from
being assigned a general-purpose decimal property.
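
A short Python sketch of that sequence, for illustration only. It just
builds the three code points and checks what the standard `unicodedata`
module and `int()` make of them. Nothing here renders or defines a
ligature.

```python
import unicodedata

# <CIRCLED DIGIT NINE, ZWJ, CIRCLED DIGIT TWO>
sequence = "\u2468\u200D\u2461"

names = [unicodedata.name(ch) for ch in sequence]
decimals = [unicodedata.decimal(ch, None) for ch in sequence]

# int() only accepts characters that carry a decimal digit value.
try:
    parsed = int(sequence)
except ValueError:
    parsed = None

print(names)
print(decimals, parsed)
```

None of the three code points has a decimal value, so a generic parser
has no basis for reading the sequence as ninety-two.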