Re: weird UTF-8 encoding in MS Exchange 2000 IM client

From: Eugene Mandel (kipodrach@hotmail.com)
Date: Thu May 15 2003 - 16:41:37 EDT


    Pim,

    You are right. That's exactly what's going on.
    Thank you very much

    -Eugene

    ----- Original Message -----
    From: "Pim Blokland" <pblokland@planet.nl>
    To: "Unicode mailing list" <unicode@unicode.org>
    Sent: Thursday, May 15, 2003 2:47 AM
    Subject: Re: weird UTF-8 encoding in MS Exchange 2000 IM client

    > Eugene Mandel wrote:
    >
    > > Is it possible that there are several flavors of UTF-8?
    >
    > No!
    >
    > > e.g. the Hebrew letter "alef" (U+05D0) is "D7 90" in UTF-8
    > > encoding, but "C3 97 C2 90" in this encoding.
    >
    > Sounds like what you are looking at is the text being encoded as
    > UTF-8 twice.
    > C3 97 is UTF-8 for U+00D7 and C2 90 is UTF-8 for U+0090.
    > Apparently, the method you use to look at the source assumes that
    > the encoding is ISO-8859-1 or similar instead of UTF-8, so it treats
    > each byte (D7, 90) as a separate character and shows you what those
    > characters look like when converted to UTF-8.
    > Does that make sense?
    >
    > Pim Blokland
    >
    >
    >
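The double encoding Pim describes can be reproduced in a few lines. This is only a sketch: it assumes the intermediate misreading was ISO-8859-1 (Latin-1), which is consistent with the byte 90 becoming U+0090, and the helper name undo_double_encoding is made up for illustration, not something the Exchange client provides.

```python
# Start from the Hebrew letter alef, U+05D0.
alef = "\u05d0"

# Correct UTF-8 encoding: the two bytes D7 90.
utf8_once = alef.encode("utf-8")

# Assumption: a viewer misreads those bytes as ISO-8859-1 (Latin-1),
# so each byte becomes one character: U+00D7 and U+0090.
misread = utf8_once.decode("latin-1")

# Re-encoding the misread characters as UTF-8 gives C3 97 C2 90.
utf8_twice = misread.encode("utf-8")


def undo_double_encoding(data: bytes) -> str:
    """Hypothetical helper: reverse one round of Latin-1 double encoding."""
    return data.decode("utf-8").encode("latin-1").decode("utf-8")


print(utf8_once.hex(" "))   # d7 90
print(utf8_twice.hex(" "))  # c3 97 c2 90
```

Calling undo_double_encoding(utf8_twice) gives back the single character U+05D0. If the viewer had assumed Windows-1252 rather than Latin-1, byte 90 would not map to U+0090 and this round trip would not apply as written.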



    This archive was generated by hypermail 2.1.5 : Thu May 15 2003 - 17:30:30 EDT