RE: converting ISO 8859-1 character set text to ASCII (128) character set

From: Yves Arrouye (yves@realnames.com)
Date: Wed Jun 20 2001 - 20:28:35 EDT


> We have a specific requirement of converting Latin-1 character set
> (ISO 8859-1) text to the ASCII character set (a set of only 128
> characters). Are there any special utilities available, or service
> providers who can do that type of job?

[I am assuming that your "ascii" table is the ASCII everybody uses, not some
variation of it.]

If you do not care about the loss of information at all, just truncate the
data to 7 bits. You can write a trivially simple program for that, or use
your platform's conversion tools or routines (cf. iconv(1) and iconv(3) on
UNIX 98 platforms, uconv from ICU's contributed applications at
http://oss.software.ibm.com/icu/, or the WIN32 conversion APIs whose name I
forgot).
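
As a rough illustration of the "trivially simple program" route, here is a
minimal Python sketch. The function names are made up for this example. It
shows literal 7-bit truncation next to a variant that substitutes a
placeholder byte, since clearing the high bit turns each accented letter into
an unrelated ASCII character.

```python
def truncate_to_7_bits(data: bytes) -> bytes:
    """Clear the high bit of every byte (lossy: 0xE9 'é' becomes 0x69 'i')."""
    return bytes(b & 0x7F for b in data)


def replace_non_ascii(data: bytes, substitute: int = ord("?")) -> bytes:
    """Keep bytes below 0x80 and swap everything else for one ASCII byte."""
    return bytes(b if b < 0x80 else substitute for b in data)


latin1 = "l'été".encode("latin-1")
print(truncate_to_7_bits(latin1))  # b"l'iti"
print(replace_non_ascii(latin1))   # b"l'?t?"
```

Either way the accented characters are gone; the second form at least makes
the loss visible in the output.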

If you want to minimize the loss, you may want to use fallbacks so that, for
example, you lose the diacritics on letters but retain the base letter,
giving you things like "mon bebe a tete tout l'ete" for French. I am sure
the WIN32 APIs will let you do that, iconv doesn't support it, and I am not
sure whether the ICU ASCII converter has fallbacks (some of their
converters do, some don't; but this may be outdated info).
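
One possible way to get that kind of fallback, sketched in Python under the
assumption that Unicode compatibility decomposition is an acceptable stand-in
for a converter's fallback table (it is not what the WIN32 or ICU converters
do internally, and the function name is hypothetical): decompose each
character, drop the combining marks, and replace whatever still is not ASCII.

```python
import unicodedata


def latin1_to_ascii_with_fallback(data: bytes) -> bytes:
    """Convert ISO 8859-1 bytes to ASCII, keeping base letters where possible."""
    text = data.decode("latin-1")
    # NFKD splits 'é' into 'e' followed by a combining acute accent.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks so only the base characters remain.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Characters with no ASCII base (for example 'ß' or 'æ') become '?'.
    return stripped.encode("ascii", errors="replace")


sample = "mon bébé a tété tout l'été".encode("latin-1")
print(latin1_to_ascii_with_fallback(sample))  # b"mon bebe a tete tout l'ete"
```

Letters without a decomposition still come out as "?", so a real fallback
table would need explicit entries for them.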

Hope this helps,
YA


