Re: New on list

From: Brendan_Murray/DUB/Lotus@lotus.com
Date: Tue Jan 12 1999 - 15:03:32 EST


Charles Alfinito wrote:

> Unicode is presenting a problem. For example, a ~ may be the character in a
> file. Normally in RTF this would be shown as \'98. Recently I had a file
> with the unicode, \u8776\'98. This character should have been an
> "infinity". Since my program can't handle the Unicode RTF (\u8776) it
> ignores it and changes the \'98 to a ~ which obviously is wrong.

Your RTF parser should treat the token pair \uxxxx\'yy as a single unit: both
tokens refer to one character, where "xxxx" is the decimal value of the
Unicode character and "yy" is the hexadecimal fallback character (an
approximation) in your current code page. In this case, you have
8776 = U+2248 (ALMOST EQUAL TO), and the approximation is the TILDE. If you
want to be accurate, take the Unicode value; if you only want a quick and
dirty solution, use the fallback character.
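
As a rough illustration of treating the pair as one token, here is a minimal
Python sketch. The function name decode_unicode_pairs and its parameters are
made up for this example, not part of any RTF library. It assumes exactly one
\'yy fallback byte follows each \uxxxx (the usual default), that the fallback
code page is Windows-1252, and that negative values are 16-bit values written
as signed numbers. A real parser would tokenize the RTF properly rather than
use a regular expression.

```python
import re

# Hypothetical helper: matches \uN (decimal, possibly negative) followed by
# exactly one \'hh fallback byte. Assumes one fallback byte per \u token.
_UNICODE_PAIR = re.compile(r"\\u(-?\d+) ?\\'([0-9a-fA-F]{2})")


def decode_unicode_pairs(rtf_text, use_unicode=True, codepage="cp1252"):
    """Replace each \\uN\\'hh pair with a single character.

    use_unicode=True  -> take the Unicode value (the accurate choice)
    use_unicode=False -> take the fallback byte (the quick and dirty choice)
    """
    def replace_pair(match):
        if use_unicode:
            value = int(match.group(1))
            if value < 0:
                # Assumption: values above 32767 are written as negative
                # signed 16-bit numbers, so map them back to 0..65535.
                value += 65536
            return chr(value)
        # Decode the single fallback byte in the assumed code page.
        fallback_byte = bytes([int(match.group(2), 16)])
        return fallback_byte.decode(codepage, errors="replace")

    return _UNICODE_PAIR.sub(replace_pair, rtf_text)


if __name__ == "__main__":
    sample = r"x \u8776\'98 y"
    print(decode_unicode_pairs(sample))                     # Unicode value
    print(decode_unicode_pairs(sample, use_unicode=False))  # fallback byte
```

Either way, the pair collapses to one character in the output, so the fallback
byte is never emitted in addition to the Unicode character.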


