Re: Nicest UTF

From: John Cowan (jcowan@reutershealth.com)
Date: Wed Dec 08 2004 - 16:51:55 CST

    Marcin 'Qrczak' Kowalczyk scripsit:

    > String equality in a programming language should not treat composed
    > and decomposed forms as equal. Not this level of abstraction.

    Well, that assumes that there's a special "string equality" predicate, as
    distinct from just having various predicates that DWIM (do what I mean). In a
    Unicode Lisp implementation, e.g., equal might be char-by-char equality and
    equalp might not be.
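
A minimal sketch of the two kinds of predicate, in Python rather than Lisp. The names code_point_equal and canonically_equal are hypothetical, and NFC is only one possible choice for the looser comparison:

```python
import unicodedata

def code_point_equal(a: str, b: str) -> bool:
    # Strict predicate: compares the strings code point by code point.
    return a == b

def canonically_equal(a: str, b: str) -> bool:
    # Looser predicate (an assumption about what a DWIM predicate could do):
    # compare after normalizing both sides to NFC.
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

composed = "\u00e9"     # LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"  # 'e' followed by COMBINING ACUTE ACCENT

print(code_point_equal(composed, decomposed))
print(canonically_equal(composed, decomposed))
```

The strict predicate tells the composed and decomposed forms apart, while the normalizing one treats them as the same string.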

    > They are supposed to be equivalent when they are actual characters.
    > What if they are numeric character references? Should "<&#824;"
    > (7 characters) represent a valid plain-text character or be a broken
    > opening tag?

    It's a broken opening tag.
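
This can be checked against one real parser. The sketch below uses Python's xml.etree.ElementTree; the wrapper element a and the helper is_well_formed are assumptions made for illustration:

```python
import xml.etree.ElementTree as ET

def is_well_formed(doc: str) -> bool:
    # Hypothetical helper: True if the parser accepts the document.
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False

# A literal '<' followed by a character reference: the parser sees the
# start of a tag whose name is missing.
broken = "<a><&#824;</a>"

# Escaping the '<' as well gives the two code points as character data.
escaped = "<a>&lt;&#824;</a>"

print(is_well_formed(broken))
print(is_well_formed(escaped))
print([hex(ord(c)) for c in ET.fromstring(escaped).text])
```

With both characters written as references, the parser hands back U+003C followed by U+0338 as ordinary text.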

    > Note that if it's a valid plain-text character, it's impossible
    > to represent isolated combining code points in XML,

    It's problematic to represent the *specific* combining code point
    when it appears immediately after a tag.
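
A small sketch of why that position is the awkward one, assuming some tool in the pipeline applies NFC to the serialized document (the element name a is made up for the example). U+0338 composes with a preceding '>' into U+226F:

```python
import unicodedata
import xml.etree.ElementTree as ET

# Written as a character reference, the combining code point survives parsing.
with_reference = "<a>&#824;</a>"
text = ET.fromstring(with_reference).text

# Written literally, it sits directly after the '>' that closes the tag.
literal = "<a>\u0338</a>"

# NFC composes '>' + U+0338 into U+226F (NOT GREATER-THAN), so the tag
# delimiter disappears from the normalized document.
normalized = unicodedata.normalize("NFC", literal)

print(hex(ord(text)))
print([hex(ord(c)) for c in normalized])
```

The character reference form keeps the '>' and the combining code point apart in the source text, which is why it is the safer way to write this case.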

    -- 
    Don't be so humble.  You're not that great.             John Cowan
            --Golda Meir                                    jcowan@reutershealth.com
    

