Jukka K. Korpela jkorpela at cs.tut.fi
Wed Jun 4 12:36:43 CDT 2014

2014-06-04 20:15, Andre Schappo wrote:

> Well because outside of groups like this there is still little awareness
> of Unicode, little understanding of Unicode, little willingness to use
> Unicode and little conscious usage of Unicode

That’s very true. In the specific case of “using Unicode” (which so
often means just “using characters outside the Ascii repertoire”) in
programming language identifiers, other factors are at work, too,
as alluded to here:

> On 4 Jun 2014, at 16:53, Shawn Steele wrote:
>> I rarely see non-Latin code in practice though, but of course I’m a
>> native English speaker.

The point is that English is largely the de facto standard human 
language in programming—in documentation, comments, and hence also in 
forming identifiers, even though the data processed might be in 
different languages. There are good practical reasons for using English: 
programmers can be expected to understand it, and it is generally the 
only language you can expect them to understand.

People also learn by example, and they often learn to stick to Ascii 
without even thinking why. Where I live, they learn to replace “ä” and 
“å” by “a” and “ö” by “o” rather automatically when they use words of 
national languages as identifiers. If you ask them, they probably say 
that the Scandinavian letters cannot be used reliably, which is often
true, even though it might not apply to their use in some programming
languages.

Personally, I often favor identifiers in the national language for 
clarity: this distinguishes user-defined identifiers from reserved words 
and from identifiers defined in libraries. But this is useful mostly in 
tutorial material, not that much in routine programming.
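
To make the replacement habit described above concrete, here is a
minimal sketch, assuming Python 3 (which accepts non-Ascii letters in
identifiers). The helper name ascii_identifier and the sample word are
made up for illustration; they do not come from any library.

```python
import unicodedata

# Hypothetical mapping: the habitual Ascii fallback for ä, å and ö.
ASCII_FALLBACK = str.maketrans({
    "ä": "a", "å": "a", "ö": "o",
    "Ä": "A", "Å": "A", "Ö": "O",
})


def ascii_identifier(word: str) -> str:
    """Return the word with ä/å/ö replaced the way people do it by habit."""
    # Compose first, so that "a" + combining diaeresis is also matched.
    composed = unicodedata.normalize("NFC", word)
    return composed.translate(ASCII_FALLBACK)


# A national-language identifier ("määrä" is Finnish for "amount");
# it stands out visibly from keywords and library names.
määrä = 3

print(ascii_identifier("määrä"))  # the Ascii-only spelling people fall back to
print("määrä".isidentifier())     # whether Python accepts the original spelling
```

Whether the non-Ascii spelling is usable in practice still depends on
the editors, encodings and tools involved, not just on the language.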

