Re: converting ISO 8859-1 character set text to ASCII (128) character set

From: Markus Scherer (markus.scherer@jtcsv.com)
Date: Wed Jun 20 2001 - 20:49:26 EDT


cls raj wrote:
> We have a specific requirement of converting Latin-1 character set (ISO
> 8859-1) text to ASCII character set (a set of only 128 characters).

ISO 8859-1 is a superset of ASCII (of US-ASCII, to be precise, but that seems to be what you mean).
US-ASCII uses byte values 0..127 (7 bits), while ISO 8859-1 uses byte values 0..255 (8 bits).
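
To make the superset relationship concrete, here is a small Python sketch (Python is my choice for illustration; the original question did not name a language). It checks that every code point 0..127 encodes to the same byte under both charsets, and that a Latin-1-only character such as e-acute lands in the upper half. The variable names are made up for this example.

```python
# Every code point 0..127 encodes to the same single byte in both charsets.
ascii_chars = "".join(chr(i) for i in range(128))
same_bytes = ascii_chars.encode("ascii") == ascii_chars.encode("iso-8859-1")

# A Latin-1-only character occupies a byte in the upper half, 128..255.
e_acute = "\u00e9".encode("iso-8859-1")  # a single byte with value 0xE9 (233)

print(same_bytes, e_acute[0])
```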

This means that what you need to do depends on your input data and on your error handling requirements.
If your input data only contains ASCII characters, then you do not need to do anything.
If you can safely ignore non-ASCII characters (that is, nothing breaks when byte values >= 128 pass through unchanged), then you do not need to do anything either.
If your input data may contain non-ASCII characters and you cannot safely ignore them, then you need to check for them (byte value >= 128) and do something interesting (set an error, throw an exception, replace with some safe character, remove from the string, ...).
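
As a rough sketch of that last case, again in Python and only under my own assumptions: the function name `latin1_to_ascii`, the `mode` switch, and the default `?` replacement are all hypothetical, not part of any library. It scans the bytes, passes values below 128 through, and applies one of the strategies listed above to the rest.

```python
def latin1_to_ascii(data: bytes, mode: str = "strict",
                    replacement: bytes = b"?") -> bytes:
    """Return data restricted to byte values 0..127.

    mode is a hypothetical switch for this sketch:
      "strict"  - raise ValueError at the first byte >= 128
      "replace" - substitute `replacement` for each byte >= 128
      "remove"  - drop each byte >= 128
    """
    if mode not in ("strict", "replace", "remove"):
        raise ValueError(f"unknown mode: {mode!r}")
    out = bytearray()
    for offset, byte in enumerate(data):
        if byte < 128:
            out.append(byte)            # plain ASCII, keep as-is
        elif mode == "strict":
            raise ValueError(f"non-ASCII byte 0x{byte:02X} at offset {offset}")
        elif mode == "replace":
            out += replacement          # safe stand-in character
        # mode == "remove": append nothing
    return bytes(out)


print(latin1_to_ascii(b"caf\xe9", "replace"))  # b'caf?'
print(latin1_to_ascii(b"caf\xe9", "remove"))   # b'caf'
```

Which mode is right depends entirely on your error handling requirements; "strict" is the default here only so that data loss is never silent.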

markus


