Re: Unicode character transformation through XSLT

From: Markus Scherer ([email protected])
Date: Tue Mar 11 2003 - 11:48:02 EST

Next message: John Hudson: "Re: Ligatures (qj)"

Previous message: Marco Cimarosti: "RE: Encoding: Unicode Quarterly Newsletter"
In reply to: Kenneth Whistler: "Re: Unicode character transformation through XSLT"
Next in thread: Jain, Pankaj (MED, TCS): "RE: Unicode character transformation through XSLT"

Kenneth Whistler wrote:
> "Unicode character (\uFFE2\uFF80\uFF93)"
> ...
> What you are actually looking for is the UTF-8 sequence:
>
> 0xE2 0x80 0x93

The 8-bit UTF-8 bytes E2 80 93 (all with the most significant bit set) are being *sign-extended* to 16
bits, producing FFE2 FF80 FF93. In a UTF-8 string literal it should suffice to write the sequence as
\xE2\x80\x93 instead. Otherwise, find out where the widening to 16 bits with sign extension occurs.
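
To make the effect concrete, here is a minimal Python sketch of what probably happens. It only simulates the arithmetic; the function names widen_signed and repair are made up for illustration and do not come from any XSLT processor or library. It assumes each byte is treated as a signed 8-bit value when it is widened to a 16-bit code unit.

```python
def widen_signed(b: int) -> int:
    """Widen one byte to 16 bits as if it were a signed 8-bit value."""
    signed = b - 0x100 if b & 0x80 else b  # reinterpret as signed 8-bit
    return signed & 0xFFFF                 # keep the low 16 bits


def repair(units: list[int]) -> str:
    """Undo the damage: keep only the low byte of each unit, decode as UTF-8."""
    return bytes(u & 0xFF for u in units).decode("utf-8")


utf8 = "\u2013".encode("utf-8")            # EN DASH as UTF-8 bytes
mangled = [widen_signed(b) for b in utf8]  # what the sign extension produces
restored = repair(mangled)                 # masking with 0xFF recovers the text

print(" ".join(f"{b:02X}" for b in utf8))
print(" ".join(f"{u:04X}" for u in mangled))
print(f"U+{ord(restored):04X}")
```

The same masking idea (value & 0xFF before widening) is the usual fix at the place where the conversion happens, whatever language that code is written in.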

markus

This archive was generated by hypermail 2.1.5 : Tue Mar 11 2003 - 12:34:51 EST