Re: Longest Names (was: Re: Unicode trivia)

From: A. Vine (avine@eng.sun.com)
Date: Tue May 09 2000 - 15:29:47 EDT


Kenneth Whistler wrote:
>
> John asked:
>
> >
> > On Sat, 6 May 2000 08:09:49 -0800 (GMT-0800), Doug Ewell wrote:
> > > Recently while writing a C program that reads UnicodeData.txt, I needed
> > > to determine the longest character name. The winner (83 characters):
> > >
> > > U+FBF9 ARABIC LIGATURE UIGHUR KIRGHIZ YEH WITH HAMZA ABOVE WITH ALEF
> > > MAKSURA ISOLATED FORM
> >
> > Out of curiosity, and perhaps of more importance than the current
> > longest name, is there a specified length limit which names are
> > guaranteed not to exceed?
>
> My own rule of thumb for processing UnicodeData.txt is to use 128 bytes
> for transient buffers for names -- which gives me a 99.999% confidence
> feeling that future versions of the data file will never break it.

And that would mean 64 "characters" in UCS-2/UTF-16 ...
So, I presume you are referring to UTF-8?

(Couldn't resist that one, Ken...)
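
For anyone following along, here is a minimal sketch in C of the arithmetic behind the exchange above. It is not Doug's or Ken's actual code. The helper names name_field_length and name_buffer_bytes are made up for illustration, and the sketch assumes the semicolon-separated layout of UnicodeData.txt with the name in the second field, and that names are ASCII only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: length of the name field (the second field) of one
   UnicodeData.txt line, e.g. "FBF9;ARABIC LIGATURE ...;Lo;...".
   Returns 0 if the line has no ';' separator. */
size_t name_field_length(const char *line)
{
    assert(line != NULL);
    const char *start = strchr(line, ';');
    if (start == NULL)
        return 0;
    start++;                       /* skip past the first ';' */
    return strcspn(start, ";\n");  /* stop at the next ';' or end of line */
}

/* Hypothetical helper: bytes needed to hold an ASCII-only name of n
   characters plus a terminator, for a given code unit size
   (1 for UTF-8, 2 for UCS-2/UTF-16). */
size_t name_buffer_bytes(size_t n, size_t code_unit_size)
{
    return (n + 1) * code_unit_size;
}
```

With the 83-character name quoted above, name_buffer_bytes(83, 1) is 84, which fits in a 128-byte buffer, while name_buffer_bytes(83, 2) is 168, which does not.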

-- 
Andrea Vine, avine@eng.sun.com, iPlanet i18n architect
"The complementarity of priority information actions will reinforce 
individual projects and, in particular, those relating to the Euro."
--From the "Information Programme for the European Citizen"


