RE: Character properties

From: Marco.Cimarosti@icl.com
Date: Mon Oct 23 2000 - 04:48:52 EDT


Marcin Kowalczyk wrote:
> isDigit: Nd
> isHexDigit: '0'..'9', 'A'..'F', 'a'..'f'
> isDecDigit: '0'..'9'
> isOctDigit: '0'..'7'

The definition "Nd" is what I would have proposed for isDecDigit. In
general, I would accept any script's digits in decimal and octal numbers.
Not so for hex numbers, which are probably strictly bound to computer
programming languages and, hence, to the Latin script.

What is the meaning of isDigit? The intuitive meaning would be "Any kind of
digit, as defined by the three specific functions below".

To isHexDigit I would also add the fullwidth letters and digits in block
U+FFxx (which are compatibility clones of ASCII characters used in a CJK
environment).

So, I would say:

        isDigit: isDecDigit OR isHexDigit
        isHexDigit: '0'..'9', 'A'..'F', 'a'..'f', U+FF10..U+FF19,
                    U+FF21..U+FF26, U+FF41..U+FF46
        isDecDigit: Nd
        isOctDigit: Nd where digit's numerical value < 8
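
As a rough illustration, the four definitions above could be sketched in
Python as follows. This is only a sketch under assumptions: the snake_case
function names and the _ASCII_HEX / _FULLWIDTH_HEX helpers are made up for
the example, each function expects a single character, and the general
category and digit value are taken from Python's standard unicodedata
module rather than from any particular library under discussion.

```python
import unicodedata

# ASCII hex digits: '0'..'9', 'A'..'F', 'a'..'f'
_ASCII_HEX = frozenset("0123456789ABCDEFabcdef")

# Fullwidth clones of the ASCII hex digits (hypothetical helper table)
_FULLWIDTH_HEX = (
    range(0xFF10, 0xFF1A),  # U+FF10..U+FF19, fullwidth 0-9
    range(0xFF21, 0xFF27),  # U+FF21..U+FF26, fullwidth A-F
    range(0xFF41, 0xFF47),  # U+FF41..U+FF46, fullwidth a-f
)


def is_dec_digit(ch):
    """True for any character whose general category is Nd."""
    return unicodedata.category(ch) == "Nd"


def is_oct_digit(ch):
    """True for an Nd character whose numerical value is below 8."""
    return is_dec_digit(ch) and unicodedata.decimal(ch) < 8


def is_hex_digit(ch):
    """True for ASCII hex digits and their fullwidth clones only."""
    if ch in _ASCII_HEX:
        return True
    return any(ord(ch) in block for block in _FULLWIDTH_HEX)


def is_digit(ch):
    """True for any kind of digit: isDecDigit OR isHexDigit."""
    return is_dec_digit(ch) or is_hex_digit(ch)
```

Note that isOctDigit needs no separate term in isDigit, because every octal
digit is already a decimal digit under these definitions.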

> isUpper: Lu, Lt
> isLower: Ll

I would say that "Lt" letters are *both* uppercase and lowercase. So, my
suggestion is:

        isUpper: Lu, Lt
        isLower: Ll, Lt

Or alternatively, if you can (and wish to) add a new API entry:

        isUpper: Lu
        isLower: Ll
        isTitle: Lt
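
Both alternatives can be sketched in a few lines of Python. Again, this is
just an illustration under assumptions: the function names and the
title_is_both flag are invented for the example, each function expects a
single character, and the general category comes from Python's standard
unicodedata module. With title_is_both=True the functions follow the first
alternative (Lt counts as both cases); with False they follow the second,
where Lt is reported only by is_title.

```python
import unicodedata


def is_title(ch):
    """True for titlecase letters (general category Lt)."""
    return unicodedata.category(ch) == "Lt"


def is_upper(ch, title_is_both=True):
    """True for Lu, and also for Lt when title_is_both is set."""
    cat = unicodedata.category(ch)
    return cat == "Lu" or (title_is_both and cat == "Lt")


def is_lower(ch, title_is_both=True):
    """True for Ll, and also for Lt when title_is_both is set."""
    cat = unicodedata.category(ch)
    return cat == "Ll" or (title_is_both and cat == "Lt")
```

For example, U+01C5 (LATIN CAPITAL LETTER D WITH SMALL LETTER Z WITH CARON)
is an Lt character: under the first alternative it is both upper and lower,
under the second it is neither, only title.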

_ Marco


