[unicode] Re: removing compromises from unicode ("WCode")

From: James E. Agenbroad (jage@loc.gov)
Date: Mon Mar 26 2001 - 11:17:11 EST


On Fri, 23 Mar 2001, Jonathan Coxhead wrote:

>
> It would be very entertaining to do the same job with the ideographs (down
> to the radical level) and count the number of atoms. I suspect the resulting
> "character set" would contain less than 2000 atoms altogether.
>
> Please do feel free to share any thoughts on the "Atomic Theory" with me!
>
> /|
> o o o (_|/
> /|
> (_/
>
>
                                            Monday, March 26, 2001
Mr. Coxhead,
     I am far from an expert on Chinese characters, but I suspect that
decomposing ideographs down to their radicals would sometimes require some
means of indicating the relative position of the component radicals.
(The 'Ideographic Description Characters', U+2FF0 to U+2FFB, described on
pages 268-271 of The Unicode Standard, Version 3.0, are one such means.)
The 'code the strokes' approach has the same difficulty, but with greater
frequency. Both also assume some means to indicate the end of a character.
These approaches, or variants of them, have been used as means of character
input where, after a person resolves ambiguous cases, a unique code for the
whole character is stored and transmitted.
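
     To illustrate the two requirements above (relative position and the
end of a character), here is a minimal sketch in Python, not a definitive
implementation. It assumes the description characters are used as prefix
operators, with U+2FF2 and U+2FF3 taking three operands and the rest
taking two. The names arity and parse_ids are my own, chosen for
illustration only.

```python
# Ideographic Description Characters, U+2FF0..U+2FFB, treated as prefix
# operators that state how the following components are arranged.
IDC_FIRST, IDC_LAST = 0x2FF0, 0x2FFB
TERNARY = {0x2FF2, 0x2FF3}  # assumed: these two take three operands


def arity(ch):
    """Return how many operands ch takes (0 for an ordinary component)."""
    cp = ord(ch)
    if IDC_FIRST <= cp <= IDC_LAST:
        return 3 if cp in TERNARY else 2
    return 0


def parse_ids(text, pos=0):
    """Parse one description starting at pos; return (tree, next_pos).

    Because each operator announces its operand count, the end of a
    character is known without a separate terminator.
    """
    if pos >= len(text):
        raise ValueError("sequence ended before all operands were supplied")
    ch = text[pos]
    pos += 1
    n = arity(ch)
    if n == 0:
        return ch, pos  # a plain component (radical or whole ideograph)
    operands = []
    for _ in range(n):
        node, pos = parse_ids(text, pos)
        operands.append(node)
    return (ch, *operands), pos


# U+2FF0 = left-to-right, U+2FF1 = above-to-below.
# "woman" beside "child", immediately followed by "sun" beside "moon":
stream = "⿰女子⿰日月"
first, end = parse_ids(stream)        # stops after three code points
second, _ = parse_ids(stream, end)    # the next character starts here
```

In this sketch the prefix form plays the part of the end-of-character
signal. A stroke-level scheme without such operators would need an explicit
terminator instead.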
     Regards,
          Jim Agenbroad ( jage@LOC.gov )
     The above are purely personal opinions, not necessarily the official
views of any government or any agency of any.
Phone: 202 707-9612; Fax: 202 707-0955; US mail: I.T.S. Dev.Gp.4, Library
of Congress, 101 Independence Ave. SE, Washington, D.C. 20540-9334 U.S.A.


