Re: accessing extended ranges

From: Syn Wee (syn.wee.quek@jtcsv.com)
Date: Wed Apr 03 2002 - 13:50:52 EST


> (Late for this thread.)
>
> ICU4J comes with its own "UCharacter" class that provides Unicode 3.1.1
> properties for all code points, using int for the single-character type.
> A class library can of course not fix the problem of string literals with
> \u - we either use two \u's for surrogate pairs or an unescape function (I
> think on the UTF16 class) that understands \U.
>
> http://oss.software.ibm.com/icu4j/
>
> markus

Hi,

The unescape functionality is located in ICU4J's Utility class.
Note that the API unescape(String) requires a fixed number of hex digits:
exactly 4 after \u and exactly 8 after \U.
I.e., the code point 2A6A5 has to be zero-padded to \U0002A6A5, and Latin A
(code point 41) has to be written as \u0041 when used with unescape(String).
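
To illustrate the fixed-width rule, here is a minimal sketch in plain Java.
It is not ICU4J's Utility.unescape; the class UnescapeSketch and its
unescape method are hypothetical names made up for this example, and it uses
StringBuilder.appendCodePoint from later JDKs. It handles only the two
escapes discussed above and assumes that a short or non-hex digit run should
be rejected.

```java
final class UnescapeSketch {

    /**
     * Hypothetical helper, not the ICU4J API. Expands the lowercase-u escape
     * (exactly 4 hex digits) and the uppercase-U escape (exactly 8 hex
     * digits); every other character is copied through unchanged.
     */
    static String unescape(String s) {
        StringBuilder out = new StringBuilder(s.length());
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            boolean isEscape = c == '\\' && i + 1 < s.length()
                    && (s.charAt(i + 1) == 'u' || s.charAt(i + 1) == 'U');
            if (!isEscape) {
                out.append(c);
                i++;
                continue;
            }
            // Fixed width: 4 digits for the short form, 8 for the long form.
            int digits = (s.charAt(i + 1) == 'u') ? 4 : 8;
            int start = i + 2;
            int end = start + digits;
            if (end > s.length()) {
                throw new IllegalArgumentException("too few hex digits at index " + i);
            }
            int cp = 0;
            for (int k = start; k < end; k++) {
                int d = Character.digit(s.charAt(k), 16);
                if (d < 0) {
                    throw new IllegalArgumentException("non-hex digit at index " + k);
                }
                cp = (cp << 4) | d;
            }
            if (!Character.isValidCodePoint(cp)) {
                throw new IllegalArgumentException("not a code point at index " + i);
            }
            // Writes one char for BMP values, a surrogate pair otherwise.
            out.appendCodePoint(cp);
            i = end;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Latin A in the 4-digit form, then code point 2A6A5 in the 8-digit form.
        String s = unescape("\\u0041\\U0002A6A5");
        System.out.println(s.length());                      // 3 UTF-16 code units
        System.out.println(s.codePointCount(0, s.length())); // 2 code points
    }
}
```

An unpadded input such as a 5-digit value after \U runs out of digits and is
rejected by this sketch instead of being guessed at.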
To use supplementary characters with UTF16, you can convert them to a String
via either the API append(StringBuffer, int) or toString(int).
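
For readers who want to see what such a conversion has to do, here is a
rough sketch in plain Java. It does not call ICU4J; SupplementarySketch,
toUtf16 and append are hypothetical names that only mirror the shape of the
calls mentioned above. The surrogate arithmetic is the standard UTF-16
encoding.

```java
final class SupplementarySketch {

    /** Hypothetical helper: encodes one code point as a UTF-16 String by hand. */
    static String toUtf16(int cp) {
        if (cp < 0 || cp > 0x10FFFF) {
            throw new IllegalArgumentException("not a code point: " + cp);
        }
        if (cp < 0x10000) {
            // BMP code point: a single char is enough.
            return String.valueOf((char) cp);
        }
        // Supplementary code point: split the 20-bit offset into two halves.
        int offset = cp - 0x10000;
        char lead = (char) (0xD800 + (offset >> 10));
        char trail = (char) (0xDC00 + (offset & 0x3FF));
        return new String(new char[] { lead, trail });
    }

    /** Hypothetical helper shaped like an append(StringBuffer, int) call. */
    static StringBuffer append(StringBuffer target, int cp) {
        return target.append(toUtf16(cp));
    }

    public static void main(String[] args) {
        String s = toUtf16(0x2A6A5);
        System.out.println(Integer.toHexString(s.charAt(0))); // d869
        System.out.println(Integer.toHexString(s.charAt(1))); // dea5
        // Cross-check against the JDK's own conversion.
        System.out.println(s.equals(new String(Character.toChars(0x2A6A5)))); // true
    }
}
```

The resulting two chars are the same surrogate pair one would otherwise have
to spell out as two \u escapes in a Java string literal.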


