RE: Character properties

From: Marco.Cimarosti@icl.com
Date: Fri Sep 22 2000 - 08:54:14 EDT


Marcin Kowalczyk wrote:
> Thu, 21 Sep 2000 23:55:24 +0330 (IRT), Roozbeh Pournader
> <roozbeh@sina.sharif.ac.ir> writes:
> > I disagree with the isDigit case, simply because my main language,
> > Persian, uses alternate digits when written.

I agree with Roozbeh in disagreeing (sorry for the pun), even if my language
uses the good old ASCII "Arabic" digits.

> Do they form numbers in the same way as ASCII digits?

The syntax is 100% identical; of course the characters for digits, decimal
separators, etc. are different.

Note, however, the exception of Tamil "digits" which, despite the name,
actually have nothing to do with decimal digits (there is no zero but, in
return, there are characters for "tens", "hundreds", etc.).

> Does Unicode character database provide a way to tell which digits
> form numbers in this way (decimal, "big Endian")?

Yes. You have 3 boolean properties for numerical characters:

- *numeric* is "Y" for characters that are used to represent numbers
(includes, e.g., Roman numerals);

- *digit* is "Y" for any sort of "digits" (includes, I think, Tamil digits);

- *decimal digit* is the property you want: it is "Y" for positional decimal
digits.
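
As a rough illustration of how the three properties nest, here is a minimal
sketch that assumes Python's standard unicodedata module as the way to query
the Unicode character database. That module returns the stored value (or a
default) rather than a "Y"/"N" flag, so a property counts as "Y" when the
result is not None. The helper name numeric_properties and the sample
characters are only illustrative choices.

```python
import unicodedata

def numeric_properties(ch):
    """Return (decimal digit, digit, numeric) values for ch.

    A slot is None when the character lacks that property.
    """
    return (
        unicodedata.decimal(ch, None),
        unicodedata.digit(ch, None),
        unicodedata.numeric(ch, None),
    )

# Illustrative samples: an ASCII digit, a Persian digit,
# a superscript digit and a Roman numeral.
samples = {
    "DIGIT FIVE": "5",
    "EXTENDED ARABIC-INDIC DIGIT FIVE": "\u06F5",
    "SUPERSCRIPT TWO": "\u00B2",
    "ROMAN NUMERAL FIVE": "\u2164",
}

for name, ch in samples.items():
    print(name, numeric_properties(ch))
```

In this sketch the Persian digit carries all three properties, just like the
ASCII one, the superscript is a digit but not a decimal digit, and the Roman
numeral is numeric only.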

> Do you think that they (and digits from other languages) should
> be recognized as numbers in sources for programming languages that
> generally accept foreign letters in identifiers? (I don't know what
> Haskell gurus would say for that.)

I say yes, if you ask me.
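
To make that concrete, here is a hedged sketch of how a lexer might turn a run
of decimal digits from any script into an integer. It again assumes Python's
unicodedata module. The function name parse_decimal is hypothetical, and a
real language would also have to decide whether digits from different scripts
may be mixed in one literal, which this sketch does not check.

```python
import unicodedata

def parse_decimal(text):
    """Parse positional decimal digits, most significant digit first."""
    if not text:
        raise ValueError("empty literal")
    value = 0
    for ch in text:
        # decimal() yields 0..9 only for characters with the
        # "decimal digit" property; anything else gives None here.
        d = unicodedata.decimal(ch, None)
        if d is None:
            raise ValueError(f"not a decimal digit: {ch!r}")
        value = value * 10 + d
    return value

print(parse_decimal("2000"))
print(parse_decimal("\u06F1\u06F3\u06F7\u06F9"))  # Persian digits 1, 3, 7, 9
```

Characters that are merely "numeric", such as Roman numerals, are rejected
because they do not take part in positional notation.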

(BTW, *all* the letters used on your awful planet are "foreign". At least,
that's how we see it here on Klingon. :-)

_ Marco


