Keld wrote:
> You should not use \uxxxx notation for surrogates,
> as surrogates are not characters in either Unicode or 10646,
> and thus the short identifiers cannot be used.
In Java, the sequence '\uxxxx' where xxxx is precisely 4
hex digits represents a datum of the Java type "char",
a numeric value ranging from 0 to 65535. Java as such does not
understand surrogates, though Java applications may.
Therefore, "\ud800\ude08" is a Java String containing two chars.
Java chars (that is, Unicode characters) are not the same as
Unicode abstract characters (that is, 10646 characters).
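
To make the point concrete, here is a minimal sketch. The class name SurrogateDemo and the helper combine are hypothetical, invented for this illustration. The arithmetic is the standard surrogate-pair calculation, done by hand rather than by any library call, since the language itself just sees two chars.

```java
class SurrogateDemo {
    // Hypothetical helper: true if c lies in the high-surrogate range D800..DBFF.
    static boolean isHigh(char c) {
        return c >= 0xD800 && c <= 0xDBFF;
    }

    // Hypothetical helper: true if c lies in the low-surrogate range DC00..DFFF.
    static boolean isLow(char c) {
        return c >= 0xDC00 && c <= 0xDFFF;
    }

    // Combine a high/low surrogate pair into the single value the pair encodes.
    // This is work an application does; Java treats the inputs as two plain chars.
    static int combine(char high, char low) {
        if (!isHigh(high) || !isLow(low)) {
            throw new IllegalArgumentException("not a surrogate pair");
        }
        return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00);
    }

    public static void main(String[] args) {
        String s = "\ud800\ude08";

        // The String holds two chars, each an ordinary 16-bit numeric value.
        System.out.println(s.length());                         // 2
        System.out.println(Integer.toHexString(s.charAt(0)));   // d800
        System.out.println(Integer.toHexString(s.charAt(1)));   // de08

        // Only the application-level helper interprets them as one character.
        System.out.println(Integer.toHexString(combine(s.charAt(0), s.charAt(1)))); // 10208
    }
}
```

The String is two chars as far as Java is concerned; treating the pair as one 10646 character is a choice the application makes.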
--
John Cowan   http://www.ccil.org/~cowan   cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn.
You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)