William Overington wrote:
> In Java source code one may currently represent a 16-bit
> Unicode character by using \uhhhh, where each h is any
> hexadecimal digit.
>
> How will Java, and maybe other languages, represent 21 bit unicode
> characters?
A \uhhhh escape in Java source becomes a value of the 16-bit primitive
datatype "char".
A char holds a UTF-16 code unit, which either represents a Unicode
character on its own or forms one half of a surrogate pair. In the latter
case, it takes a sequence of two "char"s to make one Unicode character. It
is my understanding that Java's character encoding and decoding mechanisms
already handle this, although that is not obvious from the Java platform
documentation.
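For illustration, a sketch of a surrogate pair in source, using a helper
(Character.toCodePoint) that only appeared later, in Java 5; this post
predates it, but the principle is the same:

    public class SurrogateDemo {
        public static void main(String[] args) {
            // U+1D11E MUSICAL SYMBOL G CLEF lies outside the 16-bit range,
            // so in source it must be written as a surrogate pair:
            String gClef = "\uD834\uDD1E";
            System.out.println(gClef.length()); // prints: 2 -- two chars...
            // ...reassembled into one scalar value:
            int scalar = Character.toCodePoint('\uD834', '\uDD1E');
            System.out.println(Integer.toHexString(scalar)); // prints: 1d11e
        }
    }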
I do agree that it would be more convenient to be able to refer to Unicode
characters in Java source by their scalar value, so one would not need any
knowledge of UTF-16.
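Such an escape (say, a hypothetical \u{hhhhhh} form; Java has no such
syntax) would simply push the UTF-16 arithmetic into the compiler. A
sketch of that arithmetic for scalar values at or above U+10000:

    public class ScalarToUtf16 {
        public static void main(String[] args) {
            // Split a 21-bit scalar value into its UTF-16 surrogate pair,
            // per the standard formula.
            int scalar = 0x1D11E;                      // MUSICAL SYMBOL G CLEF
            int v = scalar - 0x10000;                  // 20 significant bits remain
            char high = (char) (0xD800 + (v >> 10));   // top 10 bits
            char low  = (char) (0xDC00 + (v & 0x3FF)); // bottom 10 bits
            // prints: U+1D11E -> \uD834 \uDD1E
            System.out.println("U+" + Integer.toHexString(scalar).toUpperCase()
                + " -> \\u" + Integer.toHexString(high).toUpperCase()
                + " \\u" + Integer.toHexString(low).toUpperCase());
        }
    }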