From: Jim Allan (jallan@smrtytrek.com)
Date: Fri Nov 28 2003 - 08:02:49 EST
Arcane Jill posted:
> Understand that this is not MY definition. It came from Unicode public
> review issue #26 (http://www.unicode.org/review/pr-26.html). From my
> (purely logical) point of view, a definition is not something you can
> agree with or disagree with. It is simply an axiom from which
> conclusions may be logically derived.
Anyone can disagree with a particular definition.
The definition in this case was:
<< Decimal numbers are those used in decimal-radix number systems. In
particular, the sequence of the ONE character followed by the TWO
character is interpreted as having the value of twelve. >>
As you pointed out, the interpretation mandated by the second sentence
is often broken in practice in strings of decimal digits.
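The positional interpretation in that second sentence can be sketched in Python, since the standard unicodedata module exposes each character's decimal digit value (a minimal illustration, not anything mandated by the definition itself):

```python
import unicodedata

def decimal_value(s):
    """Interpret a run of decimal digits as a decimal-radix number:
    each successive digit multiplies the running total by ten."""
    value = 0
    for ch in s:
        value = value * 10 + unicodedata.decimal(ch)
    return value

# Per the definition, ONE followed by TWO has the value twelve.
print(decimal_value("12"))            # 12
# The same rule covers script-specific digits, e.g. Devanagari one, two.
print(decimal_value("\u0967\u0968"))  # 12
```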
As Doug Ewell pointed out:
> Those aren't numbers. Ha ha! Surprised? They are *character strings*
> that happen to consist (mostly) of digits.
I accept this as a correction of my interpretation of that definition.
Runs of decimal digits are not always to be interpreted as decimal numbers.
They may have octal or hexadecimal interpretations, or other
interpretations, or be best interpreted as an unevaluated string of
digits. Retention of leading zeros may be essential, though this can be
considered a formatting issue for a particular presentation of a number.
Sometimes such strings are partly resolved by numeric value, e.g. a
phone number 124-088-2250 might be spoken aloud as one-twenty-four,
zero-eight-eight, twenty-two-fifty.
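A short Python illustration of these alternative readings, and of how naive numeric evaluation loses leading zeros:

```python
# The same digit string admits more than one numeric interpretation.
s = "12"
print(int(s, 10))  # 12 (decimal)
print(int(s, 8))   # 10 (octal)
print(int(s, 16))  # 18 (hexadecimal)

# Evaluating a digit string as a number discards leading zeros,
# which matter in a phone-number segment such as "088".
print(int("088"))       # 88
print(str(int("088")))  # 88 -- the leading zero is gone
```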
But when decimal digits are used as part of a *decimal number*, the
second sentence in the definition is true. Whether digits in text are
being so used is outside of the scope of Unicode.
This definition defines decimal numbers, not decimal digits. Whether a
particular string or substring of decimal digits should be interpreted
as a decimal number is outside the scope of this definition.
> If it isn't, then it is my
> /original/ question (how does the Unicode consortium define the "decimal
> digit" property) which remains unanswered.
The Unicode glossary at
http://www.unicode.org/versions/Unicode4.0.0/b1.pdf reads:
<< _Decimal Digits._ Digits that can be used to form decimal-radix
numbers. >>
The use of "can" here implies that these characters might have
other uses.
From http://www.unicode.org/Public/UNIDATA/UCD.html:
<< If the character has the _decimal digit_ property, as specified in
Chapter 4 of the Unicode Standard, then the value of that digit is
represented with an integer value in fields 6, 7, and 8. >>
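Python's unicodedata module happens to reflect those three fields directly (decimal() for field 6, digit() for field 7, numeric() for field 8); for an ordinary digit all three agree:

```python
import unicodedata

# Fields 6, 7, and 8 of UnicodeData.txt for U+0035 DIGIT FIVE:
print(unicodedata.decimal('5'))  # 5   (field 6: decimal digit value)
print(unicodedata.digit('5'))    # 5   (field 7: digit value)
print(unicodedata.numeric('5'))  # 5.0 (field 8: numeric value)
```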
Unicode 4.0 section 4.6 reads:
<< _Decimal digits_ form a large subcategory of numbers consisting of
those digits that can be used to form decimal-radix numbers. They
include script-specific digits, but not characters such as Roman
numerals (<1, 5> = 15 = fifteen, but <I, V> = IV = four), subscripts, or
superscripts. Numbers other than decimal digits can be used in numerical
expressions, but it is up to the user to determine the specialized uses. >>
See also 5.5 for further discussion.
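The distinction drawn in section 4.6 is easy to demonstrate with Python's unicodedata module: script-specific digits carry the decimal digit property, while Roman numerals and superscripts have numeric (or digit) values but are not decimal digits:

```python
import unicodedata

# ARABIC-INDIC DIGIT FIVE is a decimal digit:
print(unicodedata.decimal('\u0665'))  # 5
# ROMAN NUMERAL FOUR has a numeric value but no decimal digit value:
print(unicodedata.numeric('\u2163'))  # 4.0
try:
    unicodedata.decimal('\u2163')
except ValueError:
    print("U+2163 is not a decimal digit")
# Likewise SUPERSCRIPT TWO has a digit value but no decimal value:
print(unicodedata.digit('\u00b2'))    # 2
```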
Jim Allan
This archive was generated by hypermail 2.1.5 : Fri Nov 28 2003 - 11:09:59 EST