digits and numbers

From: Viranga Ratnaike (viranga@mds.rmit.edu.au)
Date: Fri Jun 11 1999 - 03:28:04 EDT


Hello,

        Apologies if these questions are too trivial for this list. I've
        looked at section 4.6 (The Unicode Standard, Version 2.0). I guess
        I'm just trying to get a clearer notion of what numbers and digits are.
        The names of some abstract characters contain the word "DIGIT" even if
        they are not classified as digits (Nd).

        I'm writing an isDigit() function for a C++ string class and I'm
        preprocessing the UnicodeData.txt file to provide the information.

        The version I'm using is UNICODE 2.1 CHARACTER DATABASE (update 2.1.9)

        I need to determine if a code corresponds to a digit.

        The simplest thing for me to do is just check whether the General
        Category is Nd (Number, Decimal Digit).
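
        A minimal sketch of that check, assuming each record is one
        semicolon-separated line of UnicodeData.txt with the General
        Category as the third field. The names generalCategory and
        isDigitRecord are hypothetical, not part of any library.

```cpp
#include <cstddef>
#include <string>

// Return the General Category (third semicolon-separated field) of one
// UnicodeData.txt record, or an empty string if the record has fewer
// than three fields.
std::string generalCategory(const std::string& record) {
    std::size_t first = record.find(';');
    if (first == std::string::npos) return "";
    std::size_t second = record.find(';', first + 1);
    if (second == std::string::npos) return "";
    std::size_t third = record.find(';', second + 1);
    if (third == std::string::npos) third = record.size();
    return record.substr(second + 1, third - second - 1);
}

// Hypothetical helper: true only when the record's category is Nd.
bool isDigitRecord(const std::string& record) {
    return generalCategory(record) == "Nd";
}
```

        During preprocessing, each line read from the file could be
        passed to isDigitRecord and the code points that pass stored
        in whatever table isDigit() consults.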

        What's concerning me is that the codes for the Chinese characters
        (yi, er, san, si, wu, liu, qi, ba, jiu, ling...) are in a character
        range and the General Category for this range is Lo (Letter, Other).
        So my function would return false for these codes. I guess I can
        sort of live with that even tho' to my mind they seem to be digits.

        *** Are these Chinese characters not digits or possibly not numbers?

        The circled ideographs, 3280 to 3289, are classified as numbers "No".
        It seems that they are considered numbers unless they are decomposed.
 

        *** Can characters belong to more than one general category?

        *** Can characters change their General Category, upon composition,
                even if they are homogeneous to one category when decomposed?
                

        I noticed that the Hangzhou style numerals (3021 to 3029) were
        classified as Nl (Number, Letter) even tho' they seem to be decimal
        (without the zero but I assume they use DIGIT ZERO). I don't want
        to include Category "Nl" because that would erroneously return true
        for roman numerals.
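
        One way to keep the two notions apart is a pair of predicates
        over the General Category string. This is only a sketch: the
        names NumberKind, numberKind, isDigitCategory and
        isNumberCategory are hypothetical, and it assumes the category
        has already been extracted from UnicodeData.txt.

```cpp
#include <string>

// Hypothetical classification of the three "N" General Categories.
enum class NumberKind { NotNumber, DecimalDigit, LetterNumber, OtherNumber };

NumberKind numberKind(const std::string& category) {
    if (category == "Nd") return NumberKind::DecimalDigit;  // e.g. DIGIT ZERO
    if (category == "Nl") return NumberKind::LetterNumber;  // e.g. roman numerals
    if (category == "No") return NumberKind::OtherNumber;   // e.g. circled ideographs
    return NumberKind::NotNumber;                           // e.g. Lo
}

// Strict test: only decimal digits count.
bool isDigitCategory(const std::string& category) {
    return numberKind(category) == NumberKind::DecimalDigit;
}

// Looser test: any of Nd, Nl, No counts as a number.
bool isNumberCategory(const std::string& category) {
    return numberKind(category) != NumberKind::NotNumber;
}
```

        With this split, Nl is accepted by the looser test but still
        rejected by the strict one, so roman numerals never pass as
        digits.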

        
        *** Is there a reason for this special treatment of Hangzhou numerals?

        *** And, ummm... this is just curiosity

                - what are the Tibetan half digits used for?

Regards,

        Viranga (viranga@mds.rmit.edu.au)


