User-perceived character (was: "textels")

From: Janusz S. Bień <jsbien_at_mimuw.edu.pl>
Date: Mon, 19 Sep 2016 08:23:53 +0200

On Sun, Sep 18 2016 at 22:02 CEST, asmusf_at_ix.netcom.com writes:
> On 9/18/2016 3:26 AM, Janusz S. Bien wrote:

[...]

>> From the Unicode glossary:
>>
>> Grapheme. (1) A minimally distinctive unit of writing in the context
>> of a particular writing system.[...] (2) What a user thinks of as a
>> character.
>
> "writing system" is vague enough to cover variations that might be
> regional or language dependent.

That is obvious to me.

>>
>> As for (2), cf.
>>
>> User-Perceived Character. What everyone thinks of as a character in
>> their script.
>>
>> So we have "a user" versus "everyone...in their script" - is the
>> difference intentional? Probably not. Anyway the definitions are
>> language/locale dependent.
>
> The "everyone" here aims at a shared understanding.

That's also quite obvious to me.

"A user" in grapheme (2) is strange, to say the least.

>
> This becomes tricky in the case of Abugidas. There's certainly a
> shared understanding that the "unit of writing" is the syllable,
> rather than an individual mark, but the latter do have well-understood
> identities, not least for teaching. That's perhaps the reason why
> there's the handwaving about "minimally distinctive".
>
> In some scripts like that, users can enter multiple sequences of
> characters that resolve (for all practical purposes) into the same
> syllable. (A big part of that in some scripts is that Unicode does not
> always provide a means to normalize the order of subsidiary signs and
> marks, typically combining marks.)
>
> For some tasks it would be great to have only well-formed syllables;
> but to do that, you would need to add additional interpretation on top
> of the Unicode definitions of a grapheme cluster.
>
> If you just wrap the raw combining sequences into textels, then some
> tasks might not actually get simpler. Instead of a simple rule that
> determines which alternate orderings of marks are equivalent (to
> account for users not typing them in the preferred order) you would
> have to exhaustively list all combinations and set up equivalence
> tables.

I would like to know how Swift handles this. I still have a feeling
that Swift's characters are almost exactly my textels.
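
To make the quoted point about alternate mark orderings concrete, here is
a minimal sketch in Python (not Swift), under loud assumptions: the
function names textels() and same_textel() are hypothetical, and the
"base character plus following marks" grouping is only a rough stand-in
for the Unicode extended grapheme clusters that Swift documents for its
Character type. It suggests that canonical normalization already unifies
mark orders when the marks have different non-zero combining classes,
but not when they have class 0, as many Indic signs do.

```python
import unicodedata

def textels(text):
    """Group each base character with the marks that follow it.

    Simplified assumption: any character whose general category starts
    with "M" is attached to the preceding unit. Real grapheme cluster
    rules (UAX #29) cover more cases than this.
    """
    units = []
    for ch in text:
        if units and unicodedata.category(ch).startswith("M"):
            units[-1] += ch
        else:
            units.append(ch)
    return units

def same_textel(a, b):
    """Compare two units up to canonical equivalence.

    NFD sorts adjacent marks with different non-zero combining classes
    into one canonical order; it leaves class-0 marks where they are.
    """
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)

# Latin: dot below (class 220) and acute (class 230), typed in either order.
x = "a\u0323\u0301"
y = "a\u0301\u0323"
print(textels(x + "b"))      # two units: the marked "a", then "b"
print(same_textel(x, y))     # True

# Devanagari KA with vowel sign AA and candrabindu, both of class 0.
p = "\u0915\u093E\u0901"
q = "\u0915\u0901\u093E"
print(unicodedata.combining("\u093E"), unicodedata.combining("\u0901"))  # 0 0
print(same_textel(p, q))     # False
```

In the second case normalization gives no help, so any equivalence
between the two orders would have to come from an extra, script-specific
rule layered on top of the textels, which is the additional
interpretation mentioned in the quoted text.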

Best regards

Janusz

-- 
                           ,   
Prof. dr hab. Janusz S. Bien -  Uniwersytet Warszawski (Katedra Lingwistyki Formalnej)
Prof. Janusz S. Bien - University of Warsaw (Formal Linguistics Department)
jsbien@uw.edu.pl, jsbien@mimuw.edu.pl, http://fleksem.klf.uw.edu.pl/~jsbien/
Received on Mon Sep 19 2016 - 01:24:27 CDT