Re: converting ISO 8859-1 character set text to ASCII (128) character set

From: Kenneth Whistler (kenw@sybase.com)
Date: Wed Jun 20 2001 - 20:35:15 EDT


> We have a specific requirement of converting Latin-1 character set (ISO
> 8859-1) text to the ASCII character set (a set of only 128 characters). Are
> there any special utilities available, or service providers who can do
> that type of job?

Well, if you only want exact character conversion, then try:

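const unsigned char *s;
char *d;

/* Read the input as unsigned char: where plain char is signed, the
   Latin-1 bytes 0x80..0xFF are negative and would pass a <= 0x7F test. */
for ( s = (const unsigned char *)input, d = output; *s != '\0'; )
{
    if ( *s <= 0x7F )
    {
        *d++ = (char)*s++;      /* already ASCII, copy as is */
    }
    else
    {
        *d++ = '?';             /* no exact ASCII equivalent */
        s++;
    }
}
*d = '\0';                      /* terminate once, after the loop */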

If you want fallback conversion of the other Latin-1 characters to
something comparable in ASCII (as for example, by removing their
accents), then for the else branch, just make yourself a little
128 element table (actually just 96 elements, since the only values
you would care about are 0xA0..0xFF), and convert the accented
a's to 'a', the accented e's to 'e', etc. Should take an hour or
so to program and debug.
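
A minimal sketch of that table approach follows. The function name
latin1_to_ascii, the table name latin1_fallback, and every individual
mapping choice are assumptions made for illustration; in particular,
mapping the C1 controls (0x80..0x9F) and symbols with no obvious ASCII
lookalike to '?' is just one possible policy.

```c
/* One ASCII fallback per Latin-1 code 0xA0..0xFF, 16 entries per row.
   All mapping choices here are illustrative assumptions. */
static const char latin1_fallback[96] = {
    /* A0 */ ' ','!','c','?','?','Y','|','?','"','c','a','<','?','-','r','-',
    /* B0 */ '?','?','2','3','\'','u','?','.',',','1','o','>','?','?','?','?',
    /* C0 */ 'A','A','A','A','A','A','A','C','E','E','E','E','I','I','I','I',
    /* D0 */ 'D','N','O','O','O','O','O','x','O','U','U','U','U','Y','?','s',
    /* E0 */ 'a','a','a','a','a','a','a','c','e','e','e','e','i','i','i','i',
    /* F0 */ 'd','n','o','o','o','o','o','/','o','u','u','u','u','y','?','y'
};

/* Hypothetical helper: converts a NUL-terminated Latin-1 string to ASCII.
   output must have room for as many bytes as input, including the NUL. */
void latin1_to_ascii(const char *input, char *output)
{
    const unsigned char *s = (const unsigned char *)input;
    char *d = output;

    for (; *s != '\0'; s++)
    {
        if (*s <= 0x7F)
        {
            *d++ = (char)*s;                    /* already ASCII */
        }
        else if (*s >= 0xA0)
        {
            *d++ = latin1_fallback[*s - 0xA0];  /* table lookup */
        }
        else
        {
            *d++ = '?';                         /* C1 controls 0x80..0x9F */
        }
    }
    *d = '\0';
}
```

Because the table maps one byte to one byte, the output is never longer
than the input, but ligatures and the like lose information (AE ligature
becomes 'A', sharp s becomes 's'). A table of strings instead of single
characters would allow "AE" or "ss" at the cost of a larger output buffer.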

Or, if you want a big, fat, generic, all the bells and whistles
conversion package with hundreds of conversions and lots of
fallback options, then go look at the ICU link in the useful
resources area on the Unicode website.

But it will take you longer to download and install ICU, let alone
build and link it, than it will to program your answer directly.

--Ken

>
> It is kind of critical for my current project, so I would appreciate
> some quick help with this.
>
> Thanks
>


