From: Arne Götje (高盛華) (arne@linux.org.tw)
Date: Tue Jun 13 2006 - 20:14:37 CDT
On Wednesday 14 June 2006 00:47, John H. Jenkins wrote:
> On Jun 13, 2006, at 8:21 AM, Werner LEMBERG wrote:
> >> CNS11643 defined a 16-plane space in 1986 but only defined the
> >> first two planes.  Planes 14-16 were used during the late '80s and
> >> early '90s in the EUC-TW system as experimental or "private usage"
> >> in some government systems.  The 1992 version defines 7 planes.
> >
> > Until now everybody has said that CNS 11643 from 1992 has only
> > seven planes.  So the question is still unanswered what plane 15 in
> > Unihan.txt actually refers to.
>
> Note that plane 15 occurs in the kIRG_TSource.  The official IRG
> sources on occasion differ from their printed counterparts, typically
> to allow inclusion in Unicode of characters which have not yet been
> officially standardized in their home country.
>
> What "plane 15" means in this case is that the Taipei Computer
> Association, which owns CNS 11643, handed the IRG a set of mappings
> which contained this mysterious plane.  That's as much an answer as
> you can get from the Unicode end of things.  For more information,
> you'll have to contact TCA.
>
> Note that the kCNS1992 field does *not* contain plane 15 mappings.
From the CNS11643 website: 
http://61.60.106.73/eng/word.jsp#cns11643
-------------------- snip -------------------------
(2)  	User-defined Areas
        To cater for different types of Chinese information processing, 
CNS11643 has reserved character plane 12 to 15 for user-defined 
characters. Chinese characters or symbols that have yet to be 
classified as national standard characters are coded in this area based 
on user requirements.
Up to 48,027 Chinese characters are encoded in the amended and extended 
version of CNS11643. The code has covered characters as defined in the 
four "Table of Standard Chinese Characters" namely in the categories of 
frequently used, less frequently used, rarely used and Chinese 
character variants. However, since the implementation of the on-line 
computerized Residency Information System, the characters used to 
construct the national population database have exceeded the national 
standard characters by some 30,000 characters used for names. To enable 
data transmission and interchange for this type of character codes, the 
EDPC, Executive Yuan temporarily defined the interchange codes in 
user-defined areas: Character Plane 15: Coding interval from 2121 to 
6D39 is encoded with 6,831 Chinese characters. Ideographs are sourced 
from the 15th character plane of the Residency Information System. EUC 
codes are used in the Residency Information System and the encoding 
principles of EUC codes are identical to those of CNS11643. For easier 
understanding, existing ideographs and definitions are used. However, 
amongst the 7,167 characters defined in character plane 15 of the 
Residency Information System, there are 2 self-repeating characters and 
336 repeated characters that were already included in the first 7 CNS 
character planes. To avoid the situation of "one word, two codes", 
repeated parts are deleted to save the Household Registration and 
Military Service departments from having to repetitively convert codes; 
the spaces originally occupied by repeated characters are left blank 
after deletion.
-------------------- snip -------------------------
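As a rough illustration of the numbers quoted above, here is a minimal Python sketch. It assumes the usual 94x94 plane layout (row and cell bytes each running from 0x21 to 0x7E) and the commonly documented four-byte EUC-TW form (0x8E, then 0xA0 + plane, then row and cell with the high bit set). Neither detail is spelled out on the quoted page, and the function names are made up for this example. The interval 2121 to 6D39 turns out to span 7,169 code positions, while the page speaks of 7,167 defined characters and 6,831 remaining ones (7,167 - 336 = 6,831); the sketch does not try to reconcile those figures.

```python
# Sketch only: assumes a 94x94 plane (bytes 0x21..0x7E) and the commonly
# documented 4-byte EUC-TW layout. Function names are hypothetical.

LOW, HIGH = 0x21, 0x7E          # valid range for a row or cell byte
CELLS_PER_ROW = HIGH - LOW + 1  # 94


def _split(code):
    """Split a 16-bit plane-local code such as 0x6D39 into (row, cell)."""
    row, cell = code >> 8, code & 0xFF
    if not (LOW <= row <= HIGH and LOW <= cell <= HIGH):
        raise ValueError(f"code {code:#06x} is outside the 94x94 grid")
    return row, cell


def positions_in_interval(start, end):
    """Count code positions from start to end (inclusive) within one plane."""
    s_row, s_cell = _split(start)
    e_row, e_cell = _split(end)
    s_index = (s_row - LOW) * CELLS_PER_ROW + (s_cell - LOW)
    e_index = (e_row - LOW) * CELLS_PER_ROW + (e_cell - LOW)
    if e_index < s_index:
        raise ValueError("end precedes start")
    return e_index - s_index + 1


def euc_tw_bytes(plane, code):
    """Return the assumed 4-byte EUC-TW form for a plane number and code."""
    if not 1 <= plane <= 16:
        raise ValueError("plane must be between 1 and 16")
    row, cell = _split(code)
    return bytes([0x8E, 0xA0 + plane, row | 0x80, cell | 0x80])


if __name__ == "__main__":
    print(positions_in_interval(0x2121, 0x6D39))
    print(euc_tw_bytes(15, 0x2121).hex())
    print(euc_tw_bytes(15, 0x6D39).hex())
```

Under these assumptions, the first and last code points of the quoted plane 15 interval come out as the byte sequences 8E AF A1 A1 and 8E AF ED B9.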
BTW: CJK Extension C will contain characters from Planes 12-15 as well as 
some missing ones from Plane 3, AFAIR.
Maybe after the release of CJK Extension C there will be a 
new "official" version of CNS 11643...
Cheers
Arne
-- 
Arne Götje (高盛華) <arne@linux.org.tw>
PGP/GnuPG key: 1024D/685D1E8C
Fingerprint: 2056 F6B7 DEA8 B478 311F 1C34 6E9F D06E 685D 1E8C
Key available at wwwkeys.pgp.net. Encrypted e-mail preferred.
This archive was generated by hypermail 2.1.5 : Wed Jun 14 2006 - 02:32:02 CDT