Classification of U+30FC KATAKANA-HIRAGANA PROLONGED SOUND MARK

From: Mount, Rob (Robert F) (rfmount@ingr.com)
Date: Wed Jun 04 2003 - 19:11:48 EDT

    All,
    I am investigating why the wide-character version of the C function
    isalpha (that is, iswalpha) behaves differently across environments
    for the character U+30FC KATAKANA-HIRAGANA PROLONGED SOUND MARK. Some
    implementations report it as alphabetic and some do not. I
    suspect that other characters might be subject to the same confusion.
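
    A minimal sketch of the kind of check in question. It assumes that
    wchar_t values are Unicode code points on the platform (not
    guaranteed by the C standard) and that the locale picked up from the
    environment is one the library can classify Japanese text in. The
    helper names is_wide_alpha and report_prolonged_sound_mark are made
    up for illustration.

```c
#include <locale.h>
#include <stdio.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: returns 1 if the C library classifies the value
   as alphabetic under the current LC_CTYPE locale, 0 otherwise.
   Assumption: wchar_t holds Unicode code points on this platform. */
int is_wide_alpha(unsigned long code_point)
{
    return iswalpha((wint_t)code_point) != 0;
}

/* Hypothetical helper: prints how this library classifies U+30FC.
   The result depends on the C library and on the active locale. */
void report_prolonged_sound_mark(void)
{
    /* Adopt the locale from the environment instead of the default
       "C" locale, which is only required to classify the basic set. */
    setlocale(LC_ALL, "");
    printf("iswalpha(U+30FC) = %d\n", is_wide_alpha(0x30FCUL));
}
```

    Calling report_prolonged_sound_mark() in each environment under the
    same locale setting gives a direct side-by-side comparison. No
    expected output is shown because the result is implementation
    dependent.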

    The Unicode documents seem ambiguous on this point: the General
    Category is "Lm" which, although informative rather than normative,
    would seem to imply that it is alphabetic; likewise
    DerivedCoreProperties-4.0.0.txt indicates that it is alphabetic; but
    PropList-4.0.0.txt contains two records, one indicating that it is
    a diacritic and one indicating that it is an extender.
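
    To make the comparison of those records concrete, here is a rough
    sketch of testing whether one property-file line assigns a given
    property to a code point. It assumes the "code point or range ;
    property name" line layout used by the Unicode data files. The
    helper name line_assigns is hypothetical, and the lines in the usage
    note are simplified stand-ins, not verbatim copies from the 4.0.0
    files.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 if `line`, in the assumed form
   "XXXX ; Property" or "XXXX..YYYY ; Property", assigns `property`
   to `code_point`; returns 0 otherwise. */
int line_assigns(const char *line, const char *property,
                 unsigned long code_point)
{
    unsigned long first, last;
    char name[64];

    /* Try the range form first, then fall back to a single code point. */
    if (sscanf(line, "%lx..%lx ; %63s", &first, &last, name) != 3) {
        if (sscanf(line, "%lx ; %63s", &first, name) != 2)
            return 0;   /* comment, blank, or malformed line */
        last = first;   /* single code point, not a range */
    }

    return strcmp(name, property) == 0
        && first <= code_point && code_point <= last;
}
```

    With stand-in lines such as "30FC ; Diacritic" and "30FC ; Extender",
    line_assigns reports a match for each property independently. That
    only shows that the data files list the properties as separate
    records; it does not settle whether they are meant to be mutually
    exclusive with Alphabetic.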

    On to my questions:

    Q1: Can a character be both alphabetic and diacritic?

    Q2: Is there a definitive answer as to whether this is an alphabetic
    character?

    Thanks in advance for answers to these questions and/or any
    additional insight you can provide.

    Regards,
    Rob Mount


