From: António Martins-Tuválkin (antonio@tuvalkin.web.pt)
Date: Sun Sep 11 2005 - 07:55:24 CDT
On 2005.09.10, 23:40, Jukka K. Korpela <jkorpela@cs.tut.fi> wrote:
> Unicode contains _most_ accented letters used in human languages
> as precomposed characters, but not all. There's a clear distinction
> here.
Considering what canonical decomposition means, and that e.g. U+006F
U+0301 is canonically equivalent to U+00F3 (the two are to be treated as
the same text), that distinction, however clear, is meaningless. And of
course we know why precomposed characters were added in the first place:
for compatibility with the legacy encodings of previous standards, which
held different views on combining characters, not out of a desire to
make a "distinction".
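
To make the equivalence concrete, here is a minimal sketch in Python using
only the standard unicodedata module. The helper name canonically_equivalent
is hypothetical, made up for this illustration; the approach assumed here is
simply to compare the NFD forms of both strings.

```python
import unicodedata

def canonically_equivalent(a: str, b: str) -> bool:
    # Hypothetical helper: two strings are canonically equivalent
    # when their canonical decompositions (NFD) are the same.
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)

decomposed = "\u006F\u0301"   # LATIN SMALL LETTER O + COMBINING ACUTE ACCENT
precomposed = "\u00F3"        # LATIN SMALL LETTER O WITH ACUTE

# Compared code point by code point, the two sequences differ...
raw_equal = decomposed == precomposed

# ...but after normalization they are the same text.
equivalent = canonically_equivalent(decomposed, precomposed)

# NFC composes the pair into the single precomposed character.
composed = unicodedata.normalize("NFC", decomposed)

print(raw_equal, equivalent, composed == precomposed)  # False True True
```

A naive comparison sees two different strings, which is why software that
compares or searches text should normalize first.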
> my text was supposed to address people's intuitive expectations
But Jukka, for people with nothing more than intuitive expectations about
computer text processing, the backstage workings of what's a character and
what's not are completely transparent; they should not worry their heads
with such arcana. ;-)
--                                                                  ____.
António MARTINS-Tuválkin                                           |  ()|
<antonio@tuvalkin.web.pt>                                          |####|
Estrada de Benfica, 692-c/v d.ta         Não me invejo de quem tem      |
PT-1500-111 LISBOA                       carros, parelhas e montes      |
+351 934 821 700, +351 217 150 939       só me invejo de quem bebe      |
http://www.tuvalkin.web.pt/bandeira/     a água em todas as fontes      |