Re: Characters used in programming languages

From: Edward Cherlin (edward.cherlin.sy.67@aya.yale.edu)
Date: Mon May 07 2001 - 18:48:56 EDT


At 4:42 PM -0400 5/7/01, From Net Link wrote:
>#Any programming language that wants to avail itself of the rich set of
>#punctuation, brackets, and other symbols found in Unicode must have at least
>#the following features:
>#
>#1. Commonly used symbols *must* be directly available on virtually all
>#Latin-script keyboards, not just by typing convoluted dead-key or
>#Alt-sequences.

For just a few added characters, you can set keyboard shortcuts or
assign macros to keys. For more than that, bloating a Latin-script
keyboard is not the answer, even just to add all of the Latin
Extended blocks.

>We need more shift and shift lock keys so that more than ASCII
>can be done from a keyboard. Typing /U#### is also not acceptable.
>My program editor already uses Ctrl-Shift, Alt-Shift and Ctrl-Alt to get
>more key combinations on all the other keys but using two shift keys
>is awkward for anything used frequently.

Overloading Latin-script keyboards with more bucky bits is not
necessarily the best answer, although dedicated users of Wirth
machines or Emacs (Esc, Meta, Alt, Control, Shift) may tell you that
it is the only way to go. We already have usable alternatives:

o keyboard switching, the universal method for supporting multiple
alphabets and syllabaries

o picking from tables and menus, the universal method for visual entry of math

o autocorrect functions, which replace designated character sequences
with arbitrary text

o IMEs of the type used for CJK, the only practical way to support
scripts with thousands of characters

It makes sense to use these in combination for the more complex
cases. We already have text editors and document-creation software
that can combine several of these methods on suitable operating
systems.
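
As a rough illustration of the autocorrect method in the list above,
here is a minimal Python sketch. The trigger sequences and the names
REPLACEMENTS and autocorrect are assumptions made up for this example;
they are not taken from any particular editor.

```python
# Hypothetical replacement table: these trigger sequences are
# illustrative only, not a standard of any editor or language.
REPLACEMENTS = {
    "<=": "\u2264",  # LESS-THAN OR EQUAL TO
    ">=": "\u2265",  # GREATER-THAN OR EQUAL TO
    "!=": "\u2260",  # NOT EQUAL TO
    "->": "\u2192",  # RIGHTWARDS ARROW
}


def autocorrect(text, table=REPLACEMENTS):
    """Replace each designated ASCII sequence with its Unicode symbol."""
    # Try longer triggers first so overlapping triggers resolve predictably.
    triggers = sorted(table, key=len, reverse=True)
    out = []
    i = 0
    while i < len(text):
        for trig in triggers:
            if text.startswith(trig, i):
                out.append(table[trig])
                i += len(trig)
                break
        else:
            # No trigger starts here; copy the character unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    print(autocorrect("if a <= b and b != c"))
```

A real editor would apply the table as you type and offer a way to
undo a replacement, but the lookup itself need not be more than this.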

>We need Unicode keyboards.

A single Unicode keyboard is impossible. A set of keyboards and IMEs
covering all of Unicode is certainly possible, but would take a lot
of work. For example, Cangjie for Chinese has been extended to all of
Big 5, more than 13,000 characters, but not to all of the CJK
characters, even those in the BMP. There are several hundred math
characters (soon to be well over a thousand), and numerous other
non-linguistic characters.

>#2. Symbols must be easy to distinguish from each other, not just in a
>#professionally designed font but in ordinary handwriting, to prevent
>#confusion.

The only way to accomplish that in English math is to specify
alternate handwritten forms, such as the crossed '0', 'Z', and '7'
used in some programming and math contexts, and the Blackboard Bold
alphabet defined by the AMS. I defy anyone to come up with a method
for clearly distinguishing every CJK character in handwriting.

>Anything is better than looking at a < and asking,
>"Is this a less-than or is this an open template bracket?"

There we agree.

>
>#
>#-Doug Ewell
># Fullerton, California
>#

-- 

Edward Cherlin, Spamfighter <http://www.cauce.org> "It isn't what you don't know that hurts you, it's what you know for certain that just ain't so."--Mark Twain, Josh Billings, Edwin Howard Armstrong, Will Rogers, Satchel Paige (after Thomas Jefferson)


