Re: Origin of the U+nnnn notation

From: Antoine Leca (Antoine10646@leca-marti.org)
Date: Tue Nov 08 2005 - 12:25:24 CST

    On Tuesday, November 8, 2005, 14:04Z, Philippe Verdy wrote:
    >
    > U-nnnn already exists (or I should say, it has existed). It
    > referred to 16-bit code units, not really to characters, and was a
    > fixed-width notation (with four hexadecimal digits). The "U" meant
    > "Unicode" (1.0 and before).
    >
    > U+[n...n]nnnn was created to avoid confusion with the earlier,
    > 16-bit-only Unicode 1.0 standard (which was not fully compatible
    > with ISO/IEC 10646 code points). It is a variable-width notation
    > that refers to ISO/IEC 10646 code points. The "U" means "UCS" or
    > "Universal Character Set". At that time, the UCS code point range
    > was up to 31 bits wide.

    Well, I do recollect there was a time, probably later than Philippe's
    description, perhaps around 1997, when the notation U+xxxx was intended
    to designate 16-bit units (whether characters or code [values] I cannot
    say), while U-xxxxxxxx was intended to designate 32-bit units.
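
    (For concreteness, a small sketch of the two conventions as I
    understand them; the helper names and the range checks are mine, not
    anything taken from either standard.)

        def u_plus(cp: int) -> str:
            # Modern U+[n...n]nnnn form: at least four uppercase
            # hexadecimal digits, with no padding beyond four.
            if not 0 <= cp <= 0x10FFFF:  # assumes today's code point range
                raise ValueError("not a Unicode code point")
            return f"U+{cp:04X}"

        def u_minus(cp: int) -> str:
            # Fixed-width U-xxxxxxxx form: exactly eight hexadecimal
            # digits, covering the original 31-bit UCS range.
            if not 0 <= cp <= 0x7FFFFFFF:
                raise ValueError("outside the 31-bit UCS range")
            return f"U-{cp:08X}"

        # u_plus(0x41)     -> 'U+0041'
        # u_plus(0x1D11E)  -> 'U+1D11E'
        # u_minus(0x41)    -> 'U-00000041'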

    I even found that ISO/IEC 10646-1:2000 might say so in subclause 6.5,
    according to someone writing as "Ken Whistler" in
    http://www.unicode.org/mail-arch/unicode-ml/Archives-Old/UML021/0842.html,
    at the beginning of the post. It is worth reading, since it mentions other
    notations that were standard (then) but, it seems, went largely unused.

    I also remember asking about the introduction of the U+xxxxx and U+10xxxx
    notations, perhaps in the year 2000, and having this confirmed by
    Dr. Whistler; unfortunately my file archives are in poor shape, and I
    cannot find the post right now (well, the interesting one here is Ken's
    answer, not mine); I do not even remember if it was on this list, silly me.

    Antoine


