Re: Largest character

From: Gary Bonham (Gary@BonhamDesigns.com)
Date: Fri Mar 31 2000 - 17:18:19 EST

Next message: Tony Harminc: "&#61623 ?"
Previous message: laksri@us.ibm.com: "Tamil Glyphs - Clarified"
Maybe in reply to: Sarasvati: "Largest character"
Next in thread: Kenneth Whistler: "Re: Largest character"

Characters <128 take one byte.
Characters <2048 take two bytes.
All others in the 64K normal range take three bytes each.
There are provisions for characters above 64K (65,536) to use four bytes, but
normally unicode is only up to 64K. When that additional space is used, each
character takes four bytes, and the four-byte form runs up to the 2,097,152
point (Unicode itself stops at hex 10FFFF).
Using Character Agent from Bjondi, we find that 2048 (hex 0800) falls above
the arabic characters (hex 0600 to 06FF). Below this point are things like
hebrew, arabic, armenian, cyrillic, greek, and some other misc stuff. The
Indic and East Asian sets are all above this point, so those are the ones
that need three bytes per character.
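
For anyone who wants to check the byte counts, here is a minimal Python
sketch. The helper name utf8_length and the sample characters are my own
picks for illustration, not taken from any tool mentioned here; the loop
compares the threshold rule against Python's built-in UTF-8 encoder.

```python
def utf8_length(code_point: int) -> int:
    """Bytes UTF-8 needs for one code point (hypothetical helper)."""
    if code_point < 0x80:       # below 128
        return 1
    if code_point < 0x800:      # below 2048
        return 2
    if code_point < 0x10000:    # rest of the 64K range
        return 3
    return 4                    # above 64K, up to hex 10FFFF


# Sample code points chosen for illustration only.
samples = {
    "Latin capital A": 0x0041,
    "Greek small alpha": 0x03B1,
    "Cyrillic small a": 0x0430,
    "Hebrew alef": 0x05D0,
    "Arabic alef": 0x0627,
    "Devanagari a": 0x0905,
    "Hiragana a": 0x3042,
    "CJK ideograph": 0x4E00,
    "first code point above 64K": 0x10000,
}

for name, cp in samples.items():
    encoded = chr(cp).encode("utf-8")
    # The threshold rule should agree with the real encoder.
    assert len(encoded) == utf8_length(cp)
    print(f"U+{cp:04X} {name}: {len(encoded)} byte(s)")
```

The first five samples sit below hex 0800 and come out at one or two bytes;
the Devanagari, Hiragana, and CJK samples take three; the last one takes four.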

----- Original Message -----
From: "Sarasvati" <root@unicode.org>
To: "Unicode List" <unicode@unicode.org>
Sent: Friday, March 31, 2000 9:57 AM
Subject: Largest character

> Forwarding for Samir...
>
> > Subject: Largest character
> > Date: Fri, 31 Mar 2000 10:16:33 +0530
> >
> > Hi,
> > Which are those languages whose characters requires maximum number of
> > bytes to store using UTF 8?
> >
> > - Samir Mehrotra,
>
>
>

This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:21:00 EDT