Viranga Ratnaike wrote:
> The simplest thing for me to do is just check whether the General
> Category is Nd (Number, Decimal Digit).
Yes. The purpose of Nd is to allow you to do arithmetic manipulations:
assuming base-10 numbers, you can assume that (say) three consecutive
Nd characters with digit values a, b, c mean 100 * a + 10 * b + c.
Other N characters cannot be assumed to work that way.
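
A minimal sketch of that positional rule, assuming Python and its
standard unicodedata module (neither is mentioned in this thread; the
function name parse_decimal_digits is hypothetical):

```python
import unicodedata


def parse_decimal_digits(text):
    """Read a run of Nd characters as a base-10 integer.

    Raises ValueError if the string is empty or contains any character
    whose General Category is not Nd.
    """
    if not text:
        raise ValueError("empty string")
    value = 0
    for ch in text:
        if unicodedata.category(ch) != "Nd":
            raise ValueError(f"U+{ord(ch):04X} is not a decimal digit (Nd)")
        # Each Nd character carries a digit value 0..9, so the usual
        # positional accumulation applies: value = value * 10 + digit.
        value = value * 10 + unicodedata.decimal(ch)
    return value


# ASCII digits and Devanagari digits both spell 256 under this rule.
print(parse_decimal_digits("256"))
print(parse_decimal_digits("\u0968\u096b\u096c"))
```

A character from Nl or No, such as a Roman numeral, is rejected rather
than folded into the sum.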
>
> What's concerning me is that the codes for the Chinese characters
> (yi, er, san, si, wu, liu, qi, ba, jiu, ling...) are in a character
> range and the General Category for this range is Lo (Letter, Other).
That's because they are the symbols for the *linguistic* forms
(words, zi4) "one", "two", "three" etc.
> The circled ideographs, 3280 to 3289, are classified as numbers "No".
> It seems that they are considered numbers unless they are decomposed.
They are used as symbolic dingbats, not to form numeric strings
like "256" or "3.141592653".
> I noticed that the Hangzhou style numerals (3021 to 3029) were
> classified as Nl (Number, Letter) even tho' they seem to be decimal
> (without the zero but I assume they use DIGIT ZERO). I don't want
> to include Category "Nl" because that would erroneously return true
> for roman numerals.
Hangzhou numerals, IIRC, are typically used in the form
"2 hundred 5 ten 6" for "256", which means they cannot be Nd.
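
The categories discussed so far can be checked directly. The following
is only a sketch, again assuming Python's unicodedata module; the
sample code points are picked from the ranges named above, and the
helper names are hypothetical:

```python
import unicodedata

# One sample code point from each group mentioned in the thread.
SAMPLES = {
    "DIGIT TWO": "\u0032",
    "CJK ideograph er (two)": "\u4e8c",
    "CIRCLED IDEOGRAPH TWO": "\u3281",
    "HANGZHOU NUMERAL TWO": "\u3022",
    "ROMAN NUMERAL TWO": "\u2161",
}


def is_decimal_digit(ch):
    """True only for General Category Nd (Number, Decimal Digit)."""
    return unicodedata.category(ch) == "Nd"


def describe(ch):
    """Return (General Category, numeric value or None) for a character."""
    return unicodedata.category(ch), unicodedata.numeric(ch, None)


for label, ch in SAMPLES.items():
    category, value = describe(ch)
    print(f"U+{ord(ch):04X} {label}: category={category}, "
          f"numeric={value}, Nd={is_decimal_digit(ch)}")
```

Testing for Nd alone excludes the Roman numerals and the Hangzhou
numerals together, since both are Nl; a character can have a numeric
value without being usable as a positional digit.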
> - what are the Tibetan half digits used for?
"2 5 half-6" means 255.5 (or is it 256.5, I forget).
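
The Unicode Character Database records a numeric value for each
Tibetan half digit, so the value can be looked up rather than
recalled. A small sketch, assuming Python's unicodedata module (the
function name half_digit_value is hypothetical):

```python
import unicodedata
from fractions import Fraction


def half_digit_value(ch):
    """Return the database's numeric value for a character as an exact fraction."""
    # unicodedata.numeric returns a float; halves are exactly
    # representable, so converting to Fraction loses nothing here.
    return Fraction(unicodedata.numeric(ch))


half_six = "\u0f2f"  # TIBETAN DIGIT HALF SIX
print(unicodedata.name(half_six))
print(unicodedata.category(half_six))
print(half_digit_value(half_six))
```

The category printed here is not Nd, so these characters fall outside
the positional rule as well.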
--
John Cowan  http://www.ccil.org/~cowan  cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty
anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute....
(Finnegans Wake 16.5)