From: Markus Scherer (markus.scherer@jtcsv.com)
Date: Tue Mar 11 2003 - 11:48:02 EST
Kenneth Whistler wrote:
> "Unicode character (\uFFE2\uFF80\uFF93)"
> ...
> What you are actually looking for is the UTF-8 sequence:
>
> 0xE2 0x80 0x93
The 8-bit UTF-8 bytes E2 80 93 (all with the most significant bit set) get *sign-extended* to 16
bits, producing FFE2 FF80 FF93. In a UTF-8 string literal, it should suffice to rewrite this as
\xE2\x80\x93. Otherwise, find out where the 16-bit widening/sign extension occurs.
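
To illustrate the effect, here is a minimal Python sketch (not taken from the code under discussion). It reinterprets each UTF-8 byte as a signed 8-bit value and widens it to 16 bits, which reproduces FFE2 FF80 FF93; widening the same bytes as unsigned values leaves them intact. The function names widen_signed and widen_unsigned are made up for this example.

```python
import struct


def widen_signed(utf8):
    """Widen each byte to 16 bits as if it were a signed 8-bit value."""
    widened = []
    # Format "b" reads each byte as a signed char, so 0xE2 becomes -30.
    for (signed_byte,) in struct.iter_unpack("b", utf8):
        # Masking with 0xFFFF gives the 16-bit two's-complement bit pattern.
        widened.append(signed_byte & 0xFFFF)
    return widened


def widen_unsigned(utf8):
    """Widen each byte to 16 bits as an unsigned value (no sign extension)."""
    return list(utf8)


en_dash = b"\xE2\x80\x93"  # the UTF-8 sequence from the quoted message

print(" ".join(f"{u:04X}" for u in widen_signed(en_dash)))    # FFE2 FF80 FF93
print(" ".join(f"{u:04X}" for u in widen_unsigned(en_dash)))  # 00E2 0080 0093
print(en_dash.decode("utf-8") == "\u2013")                    # True (U+2013 EN DASH)
```

In C-like code the same thing typically happens when a plain char (signed on many platforms) is converted to a wider integer type without first casting to unsigned char, so that conversion is a reasonable first place to look.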
markus