From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Fri Dec 19 2003 - 06:12:19 EST
Hallvard B Furuseth wrote:
> I need a function which converts Latin Unicode characters to the closest
> equivalent ASCII characters, e.g. "é" -> "e".
>
> Before I reinvent the wheel, does any public domain or GPL code for this
> already exist?
Please don't use character names for that conversion.
Instead, use the NFKD decompositions from the UCD, then check whether the first
character of each decomposition is an ASCII character. If so, remove the
diacritics in the 03xx block (those that have a "Mn" general category and a
non-zero combining class). If any non-ASCII characters remain, use a default
replacement like '?'. You will still need some other custom rules, though:
look at sharp s, which has no decomposition at all. It's best to
map it to "ss" rather than "?", which can be done by looking at the
case foldings of "Ll" lowercase letters.
This will be less tricky than working from character names, as there's no
guarantee that the names will be consistent.
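
To make the steps above concrete, here is a rough sketch in Python, using
the standard unicodedata module as a stand-in for reading the UCD files
directly. The function name to_ascii, the "replacement" parameter, and the
initial NFC pass (so that already-decomposed input is handled the same way
as precomposed input) are my own assumptions, not part of any existing
library. Take it as an illustration of the idea rather than a finished
implementation.

```python
import unicodedata


def is_ascii(s):
    """True if every character of s is in the ASCII range."""
    return all(ord(c) < 0x80 for c in s)


def is_diacritic(c):
    """Combining mark in U+0300..U+036F with gc=Mn and non-zero ccc."""
    return ("\u0300" <= c <= "\u036f"
            and unicodedata.category(c) == "Mn"
            and unicodedata.combining(c) != 0)


def to_ascii(text, replacement="?"):
    """Hypothetical helper: map Latin text to its closest ASCII form."""
    out = []
    # Assumption: compose first, so "e" + U+0301 is treated like U+00E9.
    for ch in unicodedata.normalize("NFC", text):
        if is_ascii(ch):
            out.append(ch)
            continue
        decomposed = unicodedata.normalize("NFKD", ch)
        if is_ascii(decomposed[0]):
            # Base is ASCII: drop the diacritics, replace anything else
            # that is still non-ASCII.
            for c in decomposed:
                if is_diacritic(c):
                    continue
                out.append(c if is_ascii(c) else replacement)
        elif unicodedata.category(ch) == "Ll" and is_ascii(ch.casefold()):
            # Custom rule: full case folding maps sharp s to "ss".
            out.append(ch.casefold())
        else:
            out.append(replacement)
    return "".join(out)


print(to_ascii("\u00e9"))       # e-acute
print(to_ascii("Stra\u00dfe"))  # contains sharp s
print(to_ascii("\u00f8"))       # o with stroke, no decomposition
```

Characters such as o with stroke (U+00F8) have no decomposition and no
ASCII case folding, so they fall through to the default replacement. Those
are the cases where further custom rules would be needed.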
This archive was generated by hypermail 2.1.5 : Fri Dec 19 2003 - 06:55:03 EST