Re: "A Programmer's Introduction to Unicode"

From: Janusz S. Bien <jsbien_at_mimuw.edu.pl>
Date: Mon, 13 Mar 2017 11:31:28 +0100

Quote/Cytat - Richard Wordingham <richard.wordingham_at_ntlworld.com>
(Sun 12 Mar 2017 09:10:22 PM CET):

> On Sun, 12 Mar 2017 20:02:28 +0100
> "Janusz S. Bien" <jsbien_at_mimuw.edu.pl> wrote:
>
>> If the basic notion has to be referred to in a cumbersome way as
>> "extended grapheme cluster" then it is easier to talk about "Unicode
>> characters" despite the fact that they have a rather loose relation
>> to real-life/user-perceived characters.
>
> The notion that extended grapheme clusters correspond to
> user-perceived characters is also rather dodgy.

The idea is not mine, but it appears from time to time on the list in
a more or less explicit way.

> Whereas it may work
> for French, it is getting very dubious by the time one adds Hebrew
> cantillation marks or Vedic accentuation. The Thais revolted when
> their preposed vowels were joined with the following consonant in the
> same extended grapheme cluster, and Unicode had to revoke that union.

Just yet another reason for introducing the notion of textel?

Best regards

Janusz

-- 
Prof. dr hab. Janusz S. Bień -  Uniwersytet Warszawski (Katedra  
Lingwistyki Formalnej)
Prof. Janusz S. Bień - University of Warsaw (Formal Linguistics Department)
jsbien@uw.edu.pl, jsbien@mimuw.edu.pl, http://fleksem.klf.uw.edu.pl/~jsbien/
Received on Mon Mar 13 2017 - 05:32:08 CDT
