Re: XML and ISO 10646 planes beyond the BMP

From: Misha Wolf (misha.wolf@reuters.com)
Date: Sat Aug 16 1997 - 19:01:51 EDT


Though SGML's 8-digit limit may be under review, I don't think we can wait
for that process to run to completion, and so believe we must treat it as
an absolute constraint. In that case, we have a choice of two numbers for
the slot after the first "160" in the last DESCSET row below:

   CHARSET
            BASESET "ISO Registration Number 177//CHARSET
                     ISO/IEC 10646-1:1993 UCS-4 with
                     implementation level 3//ESC 2/5 2/15 4/6"
            DESCSET 0 9 UNUSED
                     9 2 9
                     11 2 UNUSED
                     13 1 13
                     14 18 UNUSED
                     32 95 32
                     127 1 UNUSED
                     128 32 UNUSED
                     160          160
                         ^^^^^^^^
They are:

   99999999, the highest integer that may be expressed using eight
             decimal digits, and

   1113952, which allows us to utilise the entire available range
             of code points defined by The Unicode Standard. This
             number is derived as follows:

                (256 * 256 * 17) - 160 = 1113952

             where 17 is the number of 64K planes defined by Unicode.
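
As a quick check of that arithmetic, here is a minimal Python sketch. The
names (PLANES, PLANE_SIZE, FIRST) are only illustrative and are not part of
the SGML declaration. It assumes the middle number of a DESCSET row is a
count of characters, so the highest character number described is the
starting number plus the count, minus one.

```python
# Illustrative names only; none of these appear in the SGML declaration.
PLANES = 17               # number of 64K planes defined by Unicode
PLANE_SIZE = 256 * 256    # code points per plane
FIRST = 160               # first character number of the final DESCSET row

# Number of characters the final row has to cover.
count = PLANES * PLANE_SIZE - FIRST

# Highest character number described, assuming the row covers
# FIRST .. FIRST + count - 1.
last = FIRST + count - 1

print(count)      # 1113952
print(hex(last))  # 0x10ffff
```

On that reading, 1113952 takes the row exactly up to the last code point
of the seventeenth plane.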

I don't have any strong preferences one way or the other. Comments?

BTW, the value 99999999 may not be absolutely accurate. It may be that
it has to be reduced to:

   99999999 - 160 = 99999839

so that the highest numeric character reference (NCR) does not exceed
99999999. Please could one of the SGML experts advise on this.
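
To make the question concrete, here is a small Python sketch of the check
I have in mind. It is only a sketch under two assumptions: that a DESCSET
row starting at character number `start` with `count` characters describes
numbers `start` through `start + count - 1`, and that the 8-digit limit
applies both to the count and to the highest character number. The helper
names are hypothetical.

```python
MAX_8_DIGITS = 99_999_999  # largest integer with eight decimal digits


def highest_char_number(start, count):
    """Highest character number in a row, assuming it covers
    start .. start + count - 1."""
    return start + count - 1


def fits_eight_digits(start, count):
    """True if neither the count nor the highest character number
    needs more than eight decimal digits (assumed reading of the limit)."""
    return (count <= MAX_8_DIGITS
            and highest_char_number(start, count) <= MAX_8_DIGITS)


for count in (99_999_999, 99_999_839, 1_113_952):
    print(count, highest_char_number(160, count),
          fits_eight_digits(160, count))
```

Under those assumptions, 99999999 would push the highest character number
to 100000158, while 99999839 would leave it at 99999998, so the exact
bound may differ by one from the figure above. That is precisely the sort
of detail on which I would like confirmation.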

------------------------------------------------------------------------
Misha Wolf         Email: misha.wolf@reuters.com  85 Fleet Street
Standards Manager  Voice: +44 171 542 6722        London EC4P 4AJ
Reuters Limited    Fax  : +44 171 542 8314        UK
------------------------------------------------------------------------
Eleventh International Unicode Conference, Sep 2-5 1997, www.unicode.org

------------------------------------------------------------------------
Any views expressed in this message are those of the individual sender,
except where the sender specifically states them to be the views of
Reuters Ltd.


