Re: Perhaps OT: Mysterious escape sequences in UN data

From: Kenneth Whistler (kenw@sybase.com)
Date: Tue Mar 31 2009 - 19:08:36 CST


> On Mar 31, 2009, at 12:58 PM, John Burger wrote:
>
> > For instance 5ee5 seems to be "és":
> >
>
> One possible way for that to happen: Latin-1 és is represented by
> the bytes E9 73. Read as Big5, it becomes E973 廥. The Unicode
> code point for that character is U+5EE5.

I think we have a winner! Ding! Ding! Ding!

aide-m\x{5e66}oire

U+5E66 = Big5 0xE96D = mashed up Latin-1 0xE9 0x6D = ém

And:

Rodr\x{74b2}uez

U+74B2 = Big5 0xED67 = mashed up Latin-1 0xED 0x67 = íg
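
For anyone who wants to check the remaining escapes, here is a minimal sketch of the round trip in Python. It assumes Python's built-in "big5" codec agrees with the mappings above for these code points, and that the escapes in the data always have the \x{HHHH} form shown. The names mash and repair are made up for illustration.

```python
import re

# Matches Perl-style escapes such as \x{5e66}, as they appear in the data.
ESCAPE = re.compile(r"\\x\{([0-9a-fA-F]+)\}")


def mash(text: str) -> str:
    """Reproduce the damage: encode as Latin-1, then misread the bytes as Big5."""
    return text.encode("latin-1").decode("big5")


def repair(text: str) -> str:
    """Undo the damage: turn each escaped Han character back into Latin-1 text."""
    def fix(match: re.Match) -> str:
        han = chr(int(match.group(1), 16))           # e.g. U+5E66
        return han.encode("big5").decode("latin-1")  # e.g. bytes E9 6D
    return ESCAPE.sub(fix, text)


if __name__ == "__main__":
    # Forward direction: show which code points the mix-up produces.
    for word in ("és", "aide-mémoire", "Rodríguez"):
        print(ascii(mash(word)))

    # Reverse direction: recover the intended text from the escaped form.
    print(repair(r"aide-m\x{5e66}oire"))
    print(repair(r"Rodr\x{74b2}uez"))
```

The repair only touches the escaped characters, so plain ASCII text around them passes through unchanged. An escape that was not produced by this particular mix-up may fail to encode as Big5 or come back as nonsense, so the output still wants a human look.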

--Ken


