From: James E. Agenbroad (jage@loc.gov)
Date: Fri Nov 08 2002 - 16:03:38 EST
On Fri, 8 Nov 2002, Magda Danish (Unicode) wrote:
>
>
> > -----Original Message-----
> >
> > Date/Time: Fri Nov 8 09:05:40 EST 2002
> > Contact: mrmagnusrosenberg@hotmail.com
> > Report Type: Other Question, Problem, or Feedback
> >
> > Hello
> >
> > I just wanted to know how much space in bytes the Latin-1
> > characters such as the German umlaut characters take up in
> > UTF-8 encoding. Is it still just one byte or does it now
> > require 2 bytes?
> >
> > Regards,
> >
> > Magnus Rosenberg
> >
> > -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
> > (End of Report)
> >
> >
>
>
Friday, November 8, 2002
Mr. Rosenberg,
Without delving into the issues of separately encoded combining
characters vs. precomposed combinations, I think the short answer is that
in UTF-8 all Unicode characters except those with ASCII codes 00 to 7F
are two or more bytes long. The precomposed Latin-1 characters (hex 80 to
FF), including the German umlauts, take two bytes each. If I'm wrong,
others will correct me.
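
The byte counts can be checked with a minimal Python sketch, assuming
Python 3. The helper name utf8_len is hypothetical, chosen only for this
illustration. It encodes single characters to UTF-8 and counts the
resulting bytes, and it also shows the separately encoded (combining)
form of a-umlaut for comparison.

```python
def utf8_len(text: str) -> int:
    """Return the number of bytes needed to encode text in UTF-8."""
    return len(text.encode("utf-8"))


# ASCII letter, then precomposed Latin-1 characters:
# a-umlaut, o-umlaut, u-umlaut, sharp s.
for ch in ["a", "\u00e4", "\u00f6", "\u00fc", "\u00df"]:
    encoded = ch.encode("utf-8")
    print(f"U+{ord(ch):04X}: {utf8_len(ch)} byte(s), hex {encoded.hex()}")

# Decomposed form: "a" followed by U+0308 COMBINING DIAERESIS.
# The base letter takes one byte and the combining mark takes two.
decomposed = "a\u0308"
print(f"decomposed a-umlaut: {utf8_len(decomposed)} byte(s)")
```

In this sketch the ASCII letter comes out as one byte, each precomposed
Latin-1 character as two bytes, and the decomposed a-umlaut as three.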
Regards,
Jim Agenbroad ( jage@LOC.gov )
"It is not true that people stop pursuing their dreams because they
grow old, they grow old because they stop pursuing their dreams." Adapted
from a letter by Gabriel Garcia Marquez.
The above are purely personal opinions, not necessarily the official
views of any government or any agency of any.
Addresses: Office: Phone: 202 707-9612; Fax: 202 707-0955; US
mail: I.T.S. Sys.Dev.Gp.4, Library of Congress, 101 Independence Ave. SE,
Washington, D.C. 20540-9334 U.S.A.
Home: Phone: 301 946-7326; US mail: Box 291, Garrett Park, MD 20896.