Since a fully correct DESCSET declaration for the codepoints of
ISO 10646 as amended would require in excess of 65536 declarations,
I propose the following compromise for XML and HTML. Members of
those mailing lists are urged to pass this suggestion on.
-- This DESCSET for UCS-4 does not record that FFFE and FFFF on
the planes beyond the reach of UTF-16 (above plane 10 hex) are
also non-characters. To do so would require 65502 more lines. --
DESCSET
0 9 UNUSED -- C0 space --
9 2 9 -- TAB, LF --
11 2 UNUSED -- C0 space --
13 1 13 -- CR --
14 18 UNUSED -- C0 space --
32 95 32 -- ASCII --
127 33 UNUSED -- DEL, C1 space --
160 55136 160 -- BMP --
55296 2048 UNUSED -- Surrogates --
57344 8190 57344 -- BMP --
65534 2 UNUSED -- FFFE, FFFF --
65536 65534 65536 -- Plane 1 --
131070 2 UNUSED -- 1FFFE, 1FFFF --
131072 65534 131072 -- Plane 2 --
196606 2 UNUSED -- 2FFFE, 2FFFF --
196608 65534 196608 -- Plane 3 --
262142 2 UNUSED -- 3FFFE, 3FFFF --
262144 65534 262144 -- Plane 4 --
327678 2 UNUSED -- 4FFFE, 4FFFF --
327680 65534 327680 -- Plane 5 --
393214 2 UNUSED -- 5FFFE, 5FFFF --
393216 65534 393216 -- Plane 6 --
458750 2 UNUSED -- 6FFFE, 6FFFF --
458752 65534 458752 -- Plane 7 --
524286 2 UNUSED -- 7FFFE, 7FFFF --
524288 65534 524288 -- Plane 8 --
589822 2 UNUSED -- 8FFFE, 8FFFF --
589824 65534 589824 -- Plane 9 --
655358 2 UNUSED -- 9FFFE, 9FFFF --
655360 65534 655360 -- Plane A --
720894 2 UNUSED -- AFFFE, AFFFF --
720896 65534 720896 -- Plane B --
786430 2 UNUSED -- BFFFE, BFFFF --
786432 65534 786432 -- Plane C --
851966 2 UNUSED -- CFFFE, CFFFF --
851968 65534 851968 -- Plane D --
917502 2 UNUSED -- DFFFE, DFFFF --
917504 65534 917504 -- Plane E --
983038 2 UNUSED -- EFFFE, EFFFF --
983040 65534 983040 -- Plane F --
1048574 2 UNUSED -- FFFFE, FFFFF --
1048576 65534 1048576 -- Plane 10 --
1114110 2 UNUSED -- 10FFFE, 10FFFF --
1114112 2146369534 1114112 -- All other planes to 7FFF FFFD --
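
A table like this is easy to get wrong by one entry, so here is a minimal
Python sketch that rebuilds the ranges and checks that they tile the code
space from 0 through 7FFF FFFD with no gaps or overlaps. It is an
illustration only, not part of the proposal: the names build_descset,
check_contiguous and UNUSED are made up for this example. It also
reproduces the 65502 figure, under the assumption that each plane above
10 (hex) would cost two lines, one mapped range and one UNUSED pair.

```python
# Sketch only: helper names are hypothetical, not from any SGML tool.
UNUSED = None  # stands in for the SGML keyword UNUSED


def build_descset():
    """Return (start, count, target) rows mirroring the DESCSET above."""
    rows = [
        (0, 9, UNUSED),                      # C0 controls
        (9, 2, 9),                           # TAB, LF
        (11, 2, UNUSED),                     # C0 controls
        (13, 1, 13),                         # CR
        (14, 18, UNUSED),                    # C0 controls
        (32, 95, 32),                        # ASCII
        (127, 33, UNUSED),                   # DEL, C1 controls
        (160, 0xD800 - 160, 160),            # BMP below the surrogates
        (0xD800, 0x800, UNUSED),             # surrogates
        (0xE000, 0xFFFE - 0xE000, 0xE000),   # rest of the BMP
        (0xFFFE, 2, UNUSED),                 # FFFE, FFFF
    ]
    for plane in range(1, 0x11):             # planes 1 through 10 (hex)
        base = plane << 16
        rows.append((base, 0xFFFE, base))            # mapped to itself
        rows.append((base + 0xFFFE, 2, UNUSED))      # xxFFFE, xxFFFF
    rest = 0x110000
    rows.append((rest, 0x7FFFFFFE - rest, rest))     # up to 7FFF FFFD
    return rows


def check_contiguous(rows):
    """Assert each row starts where the previous one ended; return the end."""
    expected = 0
    for start, count, target in rows:
        assert start == expected, (start, expected)
        assert target is UNUSED or target == start, (start, target)
        expected = start + count
    return expected


rows = build_descset()
end = check_contiguous(rows)

# UCS-4 spans 0x8000 planes of 65536 code points; 0x11 are listed above.
total_planes = 0x80000000 >> 16
extra_lines = 2 * (total_planes - 0x11)

for start, count, target in rows:
    print(start, count, "UNUSED" if target is UNUSED else target)
```

The identity check inside check_contiguous (a mapped range must map to
its own start) is the kind of test that catches a mistyped base number
in one of the plane rows.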
-- John Cowan http://www.ccil.org/~cowan cowan@ccil.org e'osai ko sarji la lojban