I would like to access some of the characters from "CJK Unified Ideographs
Extension B." These are all in the range U+20000-U+2A6DF. (direct link:
http://www.unicode.org/charts/PDF/U20000.pdf )
"Basic Latin" occupies the 0000-007F range. The original "CJK Unified
Ideographs" all fall within the 4E00–9FAF range. These are all easy to
access with U+xxxx (4 x's). In Java, the escape \uxxxx works just fine (and
the same goes for http://www.macchiato.com/unicode/ ). However, how do you
access the characters in the larger ranges (i.e., U+xxxxx or \uxxxxx)?
Using the 5-digit form \uxxxxx directly produces the Unicode character for
the first four digits, followed by the 5th x as a literal character. Here
is a quick example:
public class UniStringTest {
    public static void main(String[] args) {
        String s1 = "\u963F"; // displays fine; standard 4-digit escape
        System.out.println(s1);
        String s2 = "\u9FA0"; // also displays fine; standard 4-digit escape
        System.out.println(s2);
        // biggest character that I know (5 digits), but it doesn't process
        String s3 = "\u2A6A5";
        System.out.println(s3);
    }
}
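To make the symptom concrete, here is a minimal sketch that dumps the
individual chars of the 5-digit literal. The class and method names are made
up for illustration. The second helper assumes that a UTF-16 surrogate pair
is the way to reach characters above FFFF in a Java String; that is my
reading of the encoding, not something I have confirmed will display
correctly.

```java
public class SupplementaryEscapeDemo {
    // Assumption: characters above 0xFFFF are stored as a UTF-16 surrogate pair.
    // Splits such a code point into its high and low surrogate chars.
    static String toSurrogatePair(int codePoint) {
        int offset = codePoint - 0x10000;
        char high = (char) (0xD800 + (offset >> 10));   // top 10 bits
        char low = (char) (0xDC00 + (offset & 0x3FF));  // bottom 10 bits
        return "" + high + low;
    }

    // Lists each char of a string as four uppercase hex digits.
    static String hexDump(String s) {
        StringBuffer sb = new StringBuffer();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) sb.append(' ');
            String hex = Integer.toHexString(s.charAt(i)).toUpperCase();
            while (hex.length() < 4) hex = "0" + hex;
            sb.append(hex);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The escape consumes only four hex digits; the '5' is left over.
        String s3 = "\u2A6A5";
        System.out.println(hexDump(s3));                       // 2A6A 0035
        // The same code point written as a computed surrogate pair.
        System.out.println(hexDump(toSurrogatePair(0x2A6A5))); // D869 DEA5
    }
}
```

If the surrogate-pair assumption holds, the literal "\uD869\uDEA5" would be
the 4-digit spelling of the same character.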
I understand this isn't a programming mailing list; I just used the Java
program as an example.
I'd appreciate some input.
Thanks,
Ben Monroe