Re: weird UTF-8 encoding in MS Exchange 2000 IM client

From: Pim Blokland (pblokland@planet.nl)
Date: Thu May 15 2003 - 05:47:11 EDT

In reply to: Eugene Mandel: "weird UTF-8 encoding in MS Exchange 2000 IM client"

Eugene Mandel wrote:

> Is it possible that there are several flavors of UTF-8?

No!

> e.g. Hebrew letter "alef" (05 D0) is "D7 90" in UTF-8
> encoding, but "C3 97 C2 90" in this encoding.

It sounds like what you are looking at is text that has been encoded
twice.
C3 97 is UTF-8 for U+00D7, and C2 90 is UTF-8 for U+0090.
Apparently, the method you use to look at the source assumes that
the encoding is ISO 8859-1 or similar instead of UTF-8, so it treats
the bytes D7 and 90 as two separate characters and shows you what
those would look like if converted to UTF-8.
Does that make sense?
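
To illustrate, here is a minimal Python sketch of that double encoding.
It assumes the misreading step uses ISO 8859-1 (Latin-1); the actual
encoding assumed by the tool is not known, and the helper name
hex_bytes is made up for this example.

```python
# Sketch of the double encoding described above.
# Assumption: the misreading step treats the bytes as ISO 8859-1 (Latin-1).

def hex_bytes(data: bytes) -> str:
    """Format bytes as space-separated uppercase hex, e.g. 'D7 90'."""
    return " ".join(f"{b:02X}" for b in data)

alef = "\u05d0"                      # HEBREW LETTER ALEF

# Correct UTF-8 encoding: two bytes.
once = alef.encode("utf-8")

# The two bytes are misread as two Latin-1 characters (U+00D7, U+0090) ...
misread = once.decode("latin-1")

# ... and those two characters are then encoded as UTF-8 again.
twice = misread.encode("utf-8")

print(hex_bytes(once))               # D7 90
print(hex_bytes(twice))              # C3 97 C2 90

# Reversing the steps recovers the original character.
repaired = twice.decode("utf-8").encode("latin-1").decode("utf-8")
print(repaired == alef)              # True
```

If the output matches the bytes you are seeing, the data itself is most
likely fine and only the viewing step is re-encoding it.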

Pim Blokland

