From: Mount, Rob (Robert F) (rfmount@ingr.com)
Date: Wed Jun 04 2003 - 19:11:48 EDT
All,
I am investigating differing behavior, across various environments, of
the wide-character version of the C function isalpha (iswalpha) with
respect to character U+30FC KATAKANA-HIRAGANA PROLONGED SOUND MARK.
Some implementations indicate that it is alphabetic; some do not. I
suspect that other characters might be subject to the same confusion.
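For reference, the check in question can be sketched roughly as follows. This is a minimal sketch, not the exact test I ran: the helper names is_wide_alpha and report_prolonged_sound_mark are hypothetical, and it assumes that wchar_t holds Unicode code points on the platform, which the C standard does not guarantee.

```c
#include <locale.h>
#include <stdio.h>
#include <wchar.h>
#include <wctype.h>

/* Returns 1 if the C library classifies wc as alphabetic, else 0. */
static int is_wide_alpha(wint_t wc) {
    return iswalpha(wc) != 0;
}

/* Prints the classification of U+30FC under the environment's locale. */
static void report_prolonged_sound_mark(void) {
    /* Classification is locale-dependent, so pick up the user's locale. */
    if (setlocale(LC_ALL, "") == NULL) {
        fputs("warning: could not set locale from environment\n", stderr);
    }
    /* Assumption: wchar_t holds Unicode code points on this platform. */
    printf("iswalpha(U+30FC) = %d\n", is_wide_alpha((wint_t)0x30FC));
}
```

Calling report_prolonged_sound_mark() from main prints 1 or 0 depending on the C library and the active locale, which is exactly the discrepancy described above.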
The Unicode documents seem ambiguous on this point: the General
Category is "Lm", which, although informative rather than normative,
would seem to imply that it is alphabetic; likewise,
DerivedCoreProperties-4.0.0.txt indicates that it is alphabetic; but
PropList-4.0.0.txt contains two records, one indicating that it is
a diacritic and one indicating that it is an extender.
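To collect every property record that covers a given code point in these data files, something like the following could be used. It is only a sketch under stated assumptions: the function name ucd_line_covers is hypothetical, and it assumes each record has the shape "XXXX ; Property # comment" or "XXXX..YYYY ; Property # comment", with comment lines starting with "#".

```c
#include <stdio.h>
#include <string.h>

/*
 * Checks whether one property-file line covers code point cp.
 * On a match, copies the property name into prop (at least 64 bytes)
 * and returns 1; otherwise returns 0.
 */
static int ucd_line_covers(const char *line, unsigned long cp, char prop[64]) {
    unsigned long lo, hi;
    char name[64];

    if (sscanf(line, "%lx..%lx ; %63[A-Za-z_]", &lo, &hi, name) == 3) {
        /* Range record: lo and hi are both set. */
    } else if (sscanf(line, "%lx ; %63[A-Za-z_]", &lo, name) == 2) {
        hi = lo; /* Single code point record. */
    } else {
        return 0; /* Comment, blank, or unrecognized line. */
    }
    if (cp < lo || cp > hi) {
        return 0;
    }
    strcpy(prop, name);
    return 1;
}
```

Running this over each line of a file and printing prop on every match lists all properties recorded for the code point, so a character that appears under more than one property shows up more than once.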
On to my questions:
Q1: Can a character be both alphabetic and diacritic?
Q2: Is there a definitive answer as to whether this is an alphabetic
character?
Thanks in advance for answers to these questions and/or any
additional insight you can provide.
Regards,
Rob Mount