RE: Astral planes (was: RE: Plane One use, was Re: HTML Validation)

From: Kenneth Whistler (kenw@sybase.com)
Date: Tue Dec 18 2001 - 20:37:09 EST


Rick continued:

> OK, so it is there in 3.0. But in the section on Surrogates? And on
> Transformations? A little obscure.

But you need to keep in mind that Chapter 3 is the Conformance chapter,
the key part of the formal definition of the standard.
 
>
> I expected to find it in section 2.3, for example, where the major encoding
> forms are being described; or even earlier - say in 1.1 Coverage. Surely the
> range of valid scalar values is an important aspect of coverage!

It will be. Here are some sneak peeks at the current draft for the
new Section 2.5 Encoding Forms, for Unicode 4.0:

"In the Unicode Standard, the codespace consists of the integers from
0 to 10FFFF₁₆, comprising 1,114,112 code points available for assigning
the repertoire of abstract characters...."
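
The 1,114,112 figure in that excerpt follows directly from the range: it is 10FFFF (hex) plus one, or equivalently 17 planes of 65,536 code points each. A quick arithmetic check, as a sketch only (the constant names are illustrative and do not come from the draft):

```python
# Illustrative names; not taken from the draft text.
MAX_CODE_POINT = 0x10FFFF   # last code point in the codespace
PLANE_SIZE = 0x10000        # 65,536 code points per plane

# The codespace runs from 0 to MAX_CODE_POINT inclusive.
codespace_size = MAX_CODE_POINT + 1
plane_count = codespace_size // PLANE_SIZE

print(codespace_size)  # 1114112
print(plane_count)     # 17
```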

"As for all of the Unicode encoding forms, UTF-32 is restricted to
representation of code points in the range 0..10FFFF₁₆,
that is, the Unicode codespace...."

"In the UTF-16 encoding form, code points in the range U+0000..U+FFFF
are represented as a single 16-bit code unit; code points in the
supplementary planes, in the range U+10000..U+10FFFF, are instead
represented as pairs of 16-bit code units...."
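
The split described in that excerpt can be sketched in a few lines of Python. This is a minimal illustration, not text from the draft: the function name is hypothetical, and rejecting the surrogate range D800..DFFF is an assumption on my part, based on those code points not being scalar values.

```python
def utf16_code_units(cp):
    """Return the UTF-16 code units for one code point (hypothetical helper)."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("outside the Unicode codespace")
    # Assumption: surrogate code points are rejected, since they are
    # not scalar values and cannot be encoded on their own.
    if 0xD800 <= cp <= 0xDFFF:
        raise ValueError("surrogate code point")
    if cp <= 0xFFFF:
        # BMP: a single 16-bit code unit.
        return [cp]
    # Supplementary planes: subtract 0x10000, leaving a 20-bit value,
    # then split it into two 10-bit halves.
    v = cp - 0x10000
    high = 0xD800 + (v >> 10)    # high (leading) surrogate
    low = 0xDC00 + (v & 0x3FF)   # low (trailing) surrogate
    return [high, low]


print([hex(u) for u in utf16_code_units(0x0041)])    # ['0x41']
print([hex(u) for u in utf16_code_units(0x1D11E)])   # ['0xd834', '0xdd1e']
print([hex(u) for u in utf16_code_units(0x10FFFF)])  # ['0xdbff', '0xdfff']
```

Note that the largest pair, DBFF DFFF, lands exactly on U+10FFFF, which is why the codespace ends there.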

> I hope this aspect of the standard will be front and centre in 4.0.

Is that front and centre enough for you?

--Ken

P.S. The 1.1 Coverage section is intended to deal (briefly) with what scripts
and types of characters are covered by the standard, and what other
standards are covered by the standard -- not codespace structure
or encoding forms.


