Re: LC_CTYPE locale category and character sets.

From: Michael Everson (everson@indigo.ie)
Date: Thu Jul 16 1998 - 14:38:26 EDT


At 09:13 -0700 1998-07-16, Kenneth Whistler wrote:

>Case-mappings between characters have a few well-known, culturally-specific
>preferences that must be accounted for. But case-mappings are *relations*
>between pairs (or triplets) of characters, and not character properties
>per se.

>> Does anyone have a good example of how to correctly handle the German
>> LATIN SMALL LETTER SHARP S (00DF)
>> 'to uppercase' conversion, which should give two letters: "SS"?
>
>Mark Davis pointed at the Unicode Standard for the full answer.
>
>The short answer is that the Unicode Character Database (and you
>should be using Version 2.1.2 now) gives all the default one-to-one
>case mappings. Some case mappings (e.g., for French and for Turkish)
>differ from the defaults.

French?

>And U+00DF for German has the uppercase "SS",
>but "SS" does not generally lowercase to U+00DF (unless you do
>context analysis on the data).

Which is especially unreliable now that the German High Court has approved
the spelling reform.

How do you do reversible conversions from lowercase to uppercase and back,
though? Or is that "outside the scope" of coding in your view?
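
To make the round-trip problem concrete, here is a minimal sketch in Python. It assumes the language's built-in default case mappings (not any locale-specific tailoring), and the helper names upper_with_record and lower_with_record are hypothetical, invented for this illustration. It shows that the uppercased string alone does not carry enough information to get back, and that one possible workaround is to keep a side record of the characters whose mapping is not reversible.

```python
def upper_with_record(text):
    """Uppercase text; record every character that a plain
    lowercase of its uppercase form would NOT restore."""
    pieces = []
    record = []  # (start in uppercased string, length there, original char)
    pos = 0
    for ch in text:
        up = ch.upper()
        if up.lower() != ch:          # e.g. U+00DF -> "SS" -> "ss"
            record.append((pos, len(up), ch))
        pieces.append(up)
        pos += len(up)
    return "".join(pieces), record


def lower_with_record(upper_text, record):
    """Undo upper_with_record using the side record."""
    pieces = []
    pos = 0
    for start, length, original in record:
        # Lowercase the untouched stretch one character at a time,
        # so no context-dependent rule is applied on the way back.
        pieces.append("".join(c.lower() for c in upper_text[pos:start]))
        pieces.append(original)
        pos = start + length
    pieces.append("".join(c.lower() for c in upper_text[pos:]))
    return "".join(pieces)


word = "Stra\u00dfe"                  # "Straße"
shouted, record = upper_with_record(word)
naive = shouted.lower()              # loses the sharp s
restored = lower_with_record(shouted, record)
```

Here the naive lowercase of the uppercased word differs from the original, while the version that consults the record restores it. The record is out-of-band data, so this does not make the case mapping itself reversible; it only shows what extra information would have to be carried along.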

--
Michael Everson, Everson Gunn Teoranta ** http://www.indigo.ie/egt
15 Port Chaeimhghein Íochtarach; Baile Átha Cliath 2; Éire/Ireland
Phone: +353 1 478-2597 ** Fax: +353 1 478-2597 (by arrangement)
27 Páirc an Fhéithlinn;  Baile an Bhóthair;  Co. Átha Cliath; Éire


