Jukka K. Korpela
jkorpela at cs.tut.fi
Wed Jun 4 12:36:43 CDT 2014
2014-06-04 20:15, Andre Schappo wrote:
> Well because outside of groups like this there is still little awareness
> of Unicode, little understanding of Unicode, little willingness to use
> Unicode and little conscious usage of Unicode
That’s very true. In the specific case of “using Unicode” (which so
often just means “using characters outside the Ascii repertoire”) in
programming language identifiers, there are other factors at play, too.
As alluded to here:
> On 4 Jun 2014, at 16:53, Shawn Steele wrote:
>> I rarely see non-Latin code in practice though, but of course I’m a
>> native English speaker.
The point is that English is largely the de facto standard human
language in programming—in documentation, comments, and hence also in
forming identifiers, even though the data processed might be in
different languages. There are good practical reasons for using English:
programmers can be expected to understand it, and it is generally the
only language you can expect them to understand.
People also learn by example, and they often learn to stick to Ascii
without even thinking about why. Where I live, they learn to replace “ä”
and “å” with “a” and “ö” with “o” rather automatically when they use
words of the national languages as identifiers. If you ask them, they
will probably say that the Scandinavian letters cannot be used reliably,
which is often true, even though it might not apply to identifiers in
some programming languages.
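For instance, Python 3 does accept letters beyond Ascii in identifiers
(PEP 3131 defines them via the Unicode identifier properties), so the
habitual a/o substitution is not strictly necessary there. A minimal
sketch, with hypothetical Finnish names chosen for illustration:

```python
# Python 3 identifiers may contain non-Ascii letters (PEP 3131),
# so "ä" and "ö" are usable directly, no replacement needed.
määrä = 3               # "amount" — a valid Python 3 identifier
hinta = 2.5             # "price"
yhteensä = määrä * hinta  # "total"
print(yhteensä)
```

Whether the surrounding toolchain (editors, diff tools, terminals)
handles such names gracefully is a separate question, which is where
the reliability worry usually comes from.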
Personally, I often favor identifiers in the national language for
clarity: this distinguishes user-defined identifiers from reserved words
and from identifiers defined in libraries. But this is useful mostly in
tutorial material, not so much in routine programming.
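To sketch what I mean, in a tutorial snippet like the following (the
Finnish names “luvut”, “summa”, “luku” are my own illustrative choices),
the learner’s identifiers stand out visually against the English
reserved words and built-ins:

```python
# National-language names mark what the learner defined themselves:
# "for", "in" are reserved words; "print" is a built-in; everything
# in Finnish below is user-defined.
luvut = [1, 2, 3]        # "numbers"
summa = 0                # "sum"
for luku in luvut:       # "luku" = "number" (loop variable)
    summa += luku
print(summa)
```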