Re: Unicode conversion to normal asci file

From: Frank da Cruz (fdc@columbia.edu)
Date: Wed Jun 06 2007 - 13:26:30 CDT


    Dan Saltel wrote:
    > Hi,
    > I want a program that will read in a unicode file, convert it to asci so
    > that we can upload it into our database.
    > Example:
    > If the input file looks like:
    > Último año de carrera
    > It would convert it to
    > ultimo ano de carrera
    >
    > Do I have to write a program to do this? Or are there utilities or
    > procedures that will do this for me?
    >
    In this case, you are converting ISO 8859-1 to ASCII by "removing accents".
    I'm sure there are many options, as well as much opinion against doing
    it at all. But one option, often overlooked, that does exactly what you
    are asking is Kermit software:

      http://www.columbia.edu/kermit/

    which can convert character sets as part of the file transfer process
    (upload or download), or also convert files locally without transferring
    them. Example with file transfer:

      On the sending side:
        set file character-set latin1
        set transfer character-set latin1
        send name-of-file

      On the receiving side:
        set file character-set ascii
        receive

    Example of converting a local file:

      translate name-of-file latin1 ascii name-of-result-file
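
    For comparison, the same "removing accents" conversion can be sketched
    in a few lines of Python. This is only an illustration and not how
    Kermit does it: it relies on Unicode decomposition rather than a
    per-character-set table, and the names strip_accents and convert_file
    are made up for this example. The input is assumed to be Latin-1, as
    above; pass a different src_encoding (such as "utf-8") if it is not.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Return text with accents removed and everything forced into ASCII."""
    # NFKD splits an accented letter into base letter + combining mark(s).
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks, keeping the base letters.
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    # Letters with no decomposition (for example "ß" or "ø") are still
    # non-ASCII at this point; they are replaced by "?".
    return without_marks.encode("ascii", errors="replace").decode("ascii")


def convert_file(src_path: str, dst_path: str, src_encoding: str = "latin-1") -> None:
    """Read src_path in src_encoding and write an ASCII copy to dst_path."""
    with open(src_path, encoding=src_encoding) as src, \
            open(dst_path, "w", encoding="ascii") as dst:
        for line in src:
            dst.write(strip_accents(line))


if __name__ == "__main__":
    print(strip_accents("Último año de carrera"))
```

    This keeps the original letter case, so the sample line comes out as
    "Ultimo ano de carrera"; apply .lower() to the result if the database
    expects lower case as in the question.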

    Kermit can convert between any two character sets that it supports,
    including UTF-8, UCS-2, the ISO 8859 alphabets, the ISO 646 7-bit "national
    replacement" sets of yore, numerous PC (DOS) code pages and Windows code
    pages, and other proprietary and standard character sets. Of course the
    result doesn't always make sense, for example when translating Cyrillic
    into ASCII, although even for that case there is a special provision to
    do the replacements "by sound".
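
    To give an idea of what a "by sound" replacement can look like, here is
    a small Python sketch. The table below is a hypothetical, partial
    Russian mapping chosen for illustration; it is not Kermit's actual
    table, and other romanization schemes would pick different letters.

```python
# Hypothetical, partial Russian-to-ASCII table; NOT Kermit's actual mapping.
CYRILLIC_TO_ASCII = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e",
    "ё": "yo", "ж": "zh", "з": "z", "и": "i", "й": "j", "к": "k",
    "л": "l", "м": "m", "н": "n", "о": "o", "п": "p", "р": "r",
    "с": "s", "т": "t", "у": "u", "ф": "f", "х": "kh", "ц": "ts",
    "ч": "ch", "ш": "sh", "щ": "shch", "ъ": "", "ы": "y", "ь": "",
    "э": "e", "ю": "yu", "я": "ya",
}


def transliterate(text: str) -> str:
    """Replace Cyrillic letters with similar-sounding ASCII letters."""
    out = []
    for ch in text:
        lower = ch.lower()
        if lower in CYRILLIC_TO_ASCII:
            repl = CYRILLIC_TO_ASCII[lower]
            # Preserve capitalization: "Ч" becomes "Ch", not "ch".
            out.append(repl.capitalize() if ch != lower else repl)
        elif ord(ch) < 128:
            out.append(ch)   # already ASCII, pass through unchanged
        else:
            out.append("?")  # no mapping in this small table
    return "".join(out)


if __name__ == "__main__":
    print(transliterate("Москва"))
```

    As with accent removal, the conversion is one-way: several Cyrillic
    letters share ASCII output, so the original text cannot be recovered
    from the result.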

    The Kermit software that does this is available for Unix (all versions),
    Windows, DOS, VMS, MVS, VM/CMS, and various other operating systems.

    Frank da Cruz
    Columbia University


