J. Leslie Turriff
jlturriff at centurylink.net
Thu Jun 5 05:04:11 CDT 2014
On Wednesday 04 June 2014 10:53:59 Shawn Steele wrote:
> I’m sort of confused why Unicode would be a big deal. C# & other languages
> have allowed Unicode letters in identifiers for years, so readable strings
> should be possible in almost any language.
> It’s a bit cute to include emoji, but I’m not sure how practical it is. It
> also makes me wonder how they came up with the list, I presume control
> codes aren’t allowed? Or alternate whitespace? I assume they use some
> Unicode Categories to figure out the permitted set?
> I rarely see non-Latin code in practice though, but of course I’m a native
> English speaker.
What I find interesting is that, with the possible exception of Ada, I don't
think any of the commonly used languages allow Unicode characters in tokens
that are not user-defined (i.e. reserved words, etc.).
I'm working on a parser for the Rexx language that will allow all tokens to
be recognized using the default (or a user-specified) locale, not just the
user-defined tokens. It will also allow various single-character operators
equivalent to the multiple-character ones defined in the current language
standard (e.g. '≠' for '¬=', '<>' or '\=', '≤' for '<=', '≥' for '>=',
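As a rough illustration of the operator equivalence described above, here is a
minimal Python sketch. It is not the Rexx parser itself: the table names, the
function name, and the choice of '\=' as the canonical "not equal" spelling
are assumptions made for the example, and only the operator pairs named above
are listed.

```python
# Hypothetical table: single-character operators mapped to the
# multi-character spellings they stand for (only the pairs named above).
SINGLE_CHAR_OPERATORS = {
    "≠": "\\=",  # also written '¬=' or '<>'
    "≤": "<=",
    "≥": ">=",
}

# Alternate multi-character spellings folded onto one canonical form.
# Picking '\=' as the canonical form is an assumption of this sketch.
ALTERNATE_SPELLINGS = {
    "¬=": "\\=",
    "<>": "\\=",
}


def canonical_operator(token: str) -> str:
    """Return the canonical spelling of an operator token.

    Tokens that are not in either table are returned unchanged.
    """
    token = SINGLE_CHAR_OPERATORS.get(token, token)
    return ALTERNATE_SPELLINGS.get(token, token)
```

With a table like this, a tokenizer could call canonical_operator() on each
operator token after scanning, so that the rest of the parser only ever sees
one spelling per operator.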
"Disobedience is the true foundation of liberty. The obedient must be
slaves." --Henry David Thoreau