From: Antoine Leca (Antoine10646@leca-marti.org)
Date: Tue Nov 08 2005 - 12:25:24 CST
On Tuesday, November 8th, 2005 14:04Z, Philippe Verdy wrote:
>
> U-nnnn already exists (or I should say, it has existed). It was
> referring to 16-bit code units, not really to characters and was a
> fixed-width notation (with 4 hexadecimal digits). The "U" meant
> "Unicode" (1.0 and before).
>
> U+[n...n]nnnn was created to avoid the confusion with the past 16-bit
> only Unicode 1.0 standard (which was not fully compatible with
> ISO/IEC 10646 code points). It is a variable-width notation that
> refers to ISO/IEC 10646 code points. The "U" means "UCS" or
> "Universal Character Set". At that time, the UCS code point range was
> up to 31 bits wide.
Well, I recall a time, probably later than the period Philippe
describes, perhaps around 1997, when the notation U+xxxx was intended to
designate 16-bit units (whether characters or code [values] I cannot
say), while U-xxxxxxxx was intended to designate 32-bit units.
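
To make the two forms recalled above concrete, here is a minimal Python sketch. It is only an illustration of the notation as I remember it, not text from any standard; the function name `ucs_notation` and its `width_bits` parameter are hypothetical, and the range checks simply assume "fits in 16 bits" and "fits in 32 bits".

```python
def ucs_notation(value: int, width_bits: int) -> str:
    """Format a code value in the short or long form recalled above.

    Assumption: width_bits == 16 gives U+ with 4 hex digits,
    width_bits == 32 gives U- with 8 hex digits.
    """
    if width_bits == 16:
        if not 0 <= value <= 0xFFFF:
            raise ValueError("value does not fit in a 16-bit unit")
        return f"U+{value:04X}"
    if width_bits == 32:
        if not 0 <= value <= 0xFFFFFFFF:
            raise ValueError("value does not fit in a 32-bit unit")
        return f"U-{value:08X}"
    raise ValueError("width_bits must be 16 or 32")


if __name__ == "__main__":
    print(ucs_notation(0x0041, 16))    # short form, 4 hex digits
    print(ucs_notation(0x0041, 32))    # long form, 8 hex digits
    print(ucs_notation(0x10FFFF, 32))  # a value beyond 16 bits
```

With these assumptions the same value 0x41 is written U+0041 in the short form and U-00000041 in the long form.
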
I even found that ISO/IEC 10646-1:2000 might say so in subclause 6.5,
according to someone writing as "Ken Whistler" in
http://www.unicode.org/mail-arch/unicode-ml/Archives-Old/UML021/0842.html,
at the beginning of the post. It is worth reading, since it mentions other
notations that might have been standard (then) but were, it seems, hardly used.
I also remember asking about the introduction of the U+xxxxx and U+10xxxx
notations, perhaps in 2000, and having this confirmed by Dr. Whistler;
unfortunately my file archives are in poor shape, and I cannot find the post
right now (well, the interesting one here is Ken's answer, not mine); I do
not even remember whether it was on this list, silly me.
Antoine