[FAQ] How can I convert ISO 8859-1 to UTF-8?

From: John Cowan (jcowan@reutershealth.com)
Date: Wed Mar 01 2000 - 13:19:32 EST


Q: How can I convert ISO 8859-1 characters to UTF-8 byte sequences without
implementing the full UTF-16 to UTF-8 algorithm?

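A: Use the following table (or its trivial algorithmic equivalent). All values
are hexadecimal. The table relies on each 8859-1 byte value being identical to
its Unicode code point, so it gives wrong output unless you are sure the input
characters are really 8859-1.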

        8859-1 characters       UTF-8 bytes

        00 to 7F                00 to 7F
        80 to BF                C2 80 to C2 BF
        C0 to FF                C3 80 to C3 BF
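
The "trivial algorithmic equivalent" of the table might look something like the
sketch below. It is written in Python purely for illustration; the function
name latin1_to_utf8 is an assumption, not part of any library. Bytes below 80
are copied unchanged. Every other byte becomes two bytes: a lead byte built
from its top two bits (always C2 or C3 for this range) and a trail byte built
from its low six bits.

```python
def latin1_to_utf8(data: bytes) -> bytes:
    """Convert ISO 8859-1 bytes to UTF-8 bytes (hypothetical helper).

    Assumes the input really is ISO 8859-1, so each byte value is also
    the Unicode code point of the character it represents.
    """
    out = bytearray()
    for b in data:
        if b < 0x80:
            # 00 to 7F: identical in 8859-1 and UTF-8.
            out.append(b)
        else:
            # 80 to FF: two-byte sequence.
            # Lead byte is 110000xx, which is C2 for 80..BF and C3 for C0..FF.
            out.append(0xC0 | (b >> 6))
            # Trail byte is 10xxxxxx, carrying the low six bits.
            out.append(0x80 | (b & 0x3F))
    return bytes(out)
```

For example, latin1_to_utf8(b"caf\xe9") returns b"caf\xc3\xa9". In Python
itself the built-in codecs give the same result via
data.decode("latin-1").encode("utf-8"); the explicit loop is only meant to show
the bit manipulation behind the table.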

-- 

Schlingt dreifach einen Kreis vom dies! || John Cowan <jcowan@reutershealth.com>
Schliesst euer Aug vor heiliger Schau,  || http://www.reutershealth.com
Denn er genoss vom Honig-Tau,           || http://www.ccil.org/~cowan
Und trank die Milch vom Paradies.            -- Coleridge (tr. Politzer)


