From: Jukka K. Korpela (jkorpela@cs.tut.fi)
Date: Mon Sep 19 2005 - 01:36:32 CDT
On Sun, 18 Sep 2005, Anto'nio Martins-Tuva'lkin wrote:
> On 2005.09.18, 07:58, Jukka K. Korpela <jkorpela@cs.tut.fi> wrote:
>
>> Dead keys are an important practical problem. People have difficulties
>> in learning to use them. People may have used computers for many, many
>> years without ever realizing how they can use dead keys to type letters
>> with diacritic marks.
>
> Which locales are you referring to?
I have long worked with users who need to write computer programs, commands, 
and other tech stuff that uses the tilde, the grave accent, and the 
circumflex as special characters (e.g., for negation, backquote, and 
exponentiation), with no apparent connection with any use as diacritics. 
Besides, as you know, the glyphs of the tilde and the circumflex hardly 
suggest that they could be used as diacritics - they are far too big and 
wrongly positioned.
In such an environment, and in a less technical environment as well, 
people would normally not bother trying to use letters with diacritic 
marks, unless they appear as precomposed (say, as "é") in keycaps.
They simply omit the diacritics. After all, major publishing companies do 
that routinely as well - perhaps even as a matter of decided policy, not 
just laziness.
Thus, when in the course of events someone wants or needs to type a letter 
with a diacritic, he will look for various methods like Character Map or 
Alt-something, never realizing that some keys on his keyboard are dead 
keys that could be used conveniently. Unless someone tells him, of course.
> My experience with computer unsavvy
> people in Portugal is quite the opposite: Being used to type, say, [dead
> acute] [a] for U+00E1, some are truly shocked when they found out that it
> is not possible (with a portuguese keyboard) to get a U+0107 by typing
> [dead acute] [c] (this letter is not used in Portuguese, so most people
> here almost never need to type it, anyway).
I can understand that. I think the difference is that normal writing of 
Portuguese requires several different letters with diacritic marks and the 
Portuguese keyboard does not contain all of them as precomposed 
characters. Thus, people _need_ to learn to use dead keys for quite normal 
texts, even if they contain no foreign words. On the other hand, if a 
language normally needs just a few characters (like "ä" and "ö" only)
and they exist on the keyboard, perhaps with keys of their own, there is 
much less need to learn about the dead keys.
The example you mention, U+0107, illustrates well the problems with 
dead keys. In a Unicode environment, it would be natural to extend their 
functionality, but this implies some problems too. If any combination of 
dead acute and a letter produced an accented letter whenever the 
combination exists in Unicode as a precomposed character, it would be 
easier to type foreign words and special notations - but it would also 
produce effects that are unexpected to many people.
For example, if someone is accustomed to using the acute accent as a 
single quotation mark, as in ´cat´, the extended functionality would turn
´c into a c with acute. Similarly, people who are used to writing URLs 
like http://www.example.com/~example by just using the tilde key, without 
knowing or caring about its being really a dead key, would be surprised at 
seeing the ~e change to e with tilde. Anything that conflicts with 
people's _habits_ of typing means problems and resistance.
The change might be worth it, at least in the long run, but users would 
need to be informed about it, and that's tough. Perhaps this should be a 
user-settable option with a default set, at least for some time, to the 
old behavior that people are used to. The extended functionality could 
then be advertised as helping people if they need to type foreign 
characters, rather than as a surprise and change to the customary.
The extended behavior could work by different criteria, exemplified with 
the following (where [acute] means a dead acute key, [´] means the 
(spacing) acute accent character, and [?] is any character):
- [acute][?] produces [?] with acute whenever this combination
   exists as a precomposed character in Unicode, otherwise it
   produces [?][´]
- as above, but with "whenever this combination exists as a precomposed
   character in a set of characters specified by locale settings, or
   by explicit user settings"
- [acute][?] always produces [?][combining acute accent], which might
   then be replaced by a precomposed character by NFC rules
(Especially in the third approach, it would be more logical to make the 
dead keys really keys for combining diacritic marks. Most people would 
probably find it more natural to add an accent _after_ typing a base 
letter, at least if they had no experience with how dead keys work.
But this would probably be too big a change now.)
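
The three criteria above can be sketched roughly as follows. This is only 
an illustration, not a description of any existing keyboard driver: the 
function names and SAMPLE_REPERTOIRE are hypothetical, and NFC composition 
is used as a stand-in for "exists as a precomposed character in Unicode", 
which misses the few characters that are excluded from composition.

```python
import unicodedata

COMBINING_ACUTE = "\u0301"  # combining acute accent
SPACING_ACUTE = "\u00b4"    # spacing acute accent character


def precomposed(base, combining):
    """Return the single precomposed character for base + combining, or None.

    Assumption: NFC composition approximates "exists as a precomposed
    character in Unicode"; composition exclusions are not handled.
    """
    composed = unicodedata.normalize("NFC", base + combining)
    return composed if len(composed) == 1 else None


def first_criterion(base, combining=COMBINING_ACUTE, spacing=SPACING_ACUTE):
    """Precomposed character if one is found, else base + spacing accent."""
    return precomposed(base, combining) or base + spacing


def second_criterion(base, repertoire,
                     combining=COMBINING_ACUTE, spacing=SPACING_ACUTE):
    """As the first criterion, but only for characters in a given set."""
    composed = precomposed(base, combining)
    if composed is not None and composed in repertoire:
        return composed
    return base + spacing


def third_criterion(base, combining=COMBINING_ACUTE):
    """Always base + combining mark; NFC may be applied later."""
    return base + combining


# Hypothetical repertoire standing in for locale or user settings.
SAMPLE_REPERTOIRE = set("\u00e1\u00e9\u00ed\u00f3\u00fa")

# [acute][c] under each criterion (the U+0107 case discussed above).
assert first_criterion("c") == "\u0107"
assert second_criterion("c", SAMPLE_REPERTOIRE) == "c\u00b4"
assert third_criterion("c") == "c\u0301"
```

With the first criterion and a dead tilde, first_criterion("e", "\u0303", "~") 
yields U+1EBD (e with tilde), which is the surprise described above for 
someone typing /~example in a URL.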
The second, intermediate approach could use CLDR data about use of 
characters in different locales. But I think it would be a compromise that 
combines the drawbacks of the simpler alternatives. What matters here is 
not the user's native language but the _combination_ of languages he uses, 
and describing that would be practically difficult. Besides, the approach 
would make the surprise effect bigger: if you are accustomed to using dead 
keys in quite many combinations with base characters, it will be awkward 
to note that some accented characters that are rare in your 
environment cannot be typed in that simple, convenient way,
for no obvious reason.
-- Jukka "Yucca" Korpela, http://www.cs.tut.fi/~jkorpela/