RE: LC_CTYPE locale category and character sets.

From: Addison Phillips (AddisonP@simultrans.com)
Date: Thu Jul 16 1998 - 15:50:54 EDT


French, because in some French locales the usage is to drop the accent
from a letter on toupper (e.g., LATIN SMALL LETTER A WITH GRAVE
becomes LATIN CAPITAL LETTER A).
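
A rough sketch of the difference, in Python, may help. The function
names here are made up for illustration, and stripping every combining
mark is a cruder rule than any real French locale applies (it would also
drop a cedilla). The last two lines show the sharp s case from the quoted
thread: it uppercases to "SS" and does not come back.

```python
import unicodedata

def upper_default(text: str) -> str:
    """Default Unicode uppercasing, including full mappings such as sharp s -> SS."""
    return text.upper()

def upper_strip_accents(text: str) -> str:
    """Hypothetical locale tailoring: uppercase, then drop combining marks."""
    # Decompose so that accents become separate combining characters.
    decomposed = unicodedata.normalize("NFD", text.upper())
    # Keep only characters whose canonical combining class is 0 (base characters).
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)

print(upper_default("à"))                # À
print(upper_strip_accents("à"))          # A
print(upper_default("straße"))           # STRASSE
print(upper_default("straße").lower())   # strasse
```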

----------------------------------------------
Addison P. Phillips
Director, Technology
SimulTrans, LLC

+1 (650) 526-4652 (telephone)
+1 (650) 969-9959 (telefax)
AddisonP@simultrans.com <mailto:AddisonP@simultrans.com>
http://www.simultrans.com <http://www.simultrans.com>

"22 languages. One release date."
----------------------------------------------

                -----Original Message-----
                From: Michael Everson [mailto:everson@indigo.ie]
                Sent: Thursday, July 16, 1998 11:38 AM
                To: Unicode List
                Subject: Re: LC_CTYPE locale category and character sets.

                At 09:13 -0700 1998-07-16, Kenneth Whistler wrote:

>Case-mappings between characters have a few well-known, culturally-specific
>preferences that must be accounted for. But case-mappings are *relations*
>between pairs (or triplets) of characters, and not character properties
>per se.

>> Does anyone have a good example of how to correctly handle the German
>> LATIN SMALL LETTER SHARP S (00DF)
>> 'to uppercase' conversion, which should give two letters: "SS"?
>
>Mark Davis pointed at the Unicode Standard for the full answer.
>
>The short answer is that the Unicode Character Database (and you
>should be using Version 2.1.2 now) gives all the default one-to-one
>case mappings. Some case mappings (e.g., for French and for Turkish)
>differ from the defaults.

                French?

>And U+00DF for German has the uppercase "SS",
>but "SS" does not generally lowercase to U+00DF (unless you do
>context analysis on the data).

                Which is especially unreliable now that the German High Court has approved
                the spelling reform.

                How do you do reversible conversions from lowercase to uppercase and back,
                though? Or is that "outside the scope" of coding in your view?

                --
                Michael Everson, Everson Gunn Teoranta ** http://www.indigo.ie/egt
                15 Port Chaeimhghein Íochtarach; Baile Átha Cliath 2; Éire/Ireland
                Guthán: +353 1 478-2597 ** Facsa: +353 1 478-2597 (by arrangement)
                27 Páirc an Fhéithlinn; Baile an Bhóthair; Co. Átha Cliath; Éire
                



This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:20:40 EDT