From: Eugene Mandel (kipodrach@hotmail.com)
Date: Thu May 15 2003 - 16:41:37 EDT
Pim,
You are right. That's exactly what's going on.
Thank you very much.
-Eugene
----- Original Message -----
From: "Pim Blokland" <pblokland@planet.nl>
To: "Unicode mailing list" <unicode@unicode.org>
Sent: Thursday, May 15, 2003 2:47 AM
Subject: Re: weird UTF-8 encoding in MS Exchange 2000 IM client
> Eugene Mandel wrote:
>
> > Is it possible that there are several flavors of UTF-8?
>
> No!
>
> > e.g. Hebrew letter "alef" (05 D0) is "D7 90" in UTF-8
> > encoding, but "C3 97 C2 90" in this encoding.
>
> Sounds like what you are looking at is the bytes being encoded
> twice.
> C3 97 is UTF-8 for U+00D7 and C2 90 is UTF-8 for U+0090.
> Apparently, the method you use to look at the source assumes that
> the encoding is iso-8859 or similar instead of UTF-8, and wants to
> show you what that would look like if translated to UTF-8.
> Does that make sense?
>
> Pim Blokland
>
>
>
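The double encoding Pim describes can be reproduced in a few lines. This is only a sketch in Python, not anything taken from the Exchange client itself. It assumes the intermediate misreading was ISO-8859-1 ("latin-1" in Python); the thread only says "iso-8859 or similar", so that choice is a guess. The variable names are made up for illustration.

```python
# Hebrew letter alef, U+05D0
alef = "\u05d0"

# Step 1: the correct UTF-8 encoding (bytes D7 90).
utf8_once = alef.encode("utf-8")

# Step 2: something misreads those two bytes as ISO-8859-1 (assumed),
# producing two separate characters, U+00D7 and U+0090.
misread = utf8_once.decode("latin-1")

# Step 3: the misread text is encoded as UTF-8 again (bytes C3 97 C2 90).
utf8_twice = misread.encode("utf-8")

# Undoing it: decode as UTF-8, map the characters back to single bytes,
# then decode those bytes as UTF-8 once more.
recovered = utf8_twice.decode("utf-8").encode("latin-1").decode("utf-8")

print(utf8_once.hex(" ").upper())
print(utf8_twice.hex(" ").upper())
print(recovered == alef)
```

If the misreading had instead been Windows-1252, byte 90 has no assigned character there, so the exact result would depend on how the software handles undefined bytes.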