On Sun, Sep 18 2016 at 22:02 CEST, asmusf_at_ix.netcom.com writes:
> On 9/18/2016 3:26 AM, Janusz S. Bien wrote:
[...]
>> From the Unicode glossary:
>>
>> Grapheme. (1) A minimally distinctive unit of writing in the context
>> of a particular writing system.[...] (2) What a user thinks of as a
>> character.
>
> "writing system" is vague enough to cover variations that might be
> regional or language dependent.
That is obvious to me.
>>
>> As for (2), cf.
>>
>> User-Perceived Character. What everyone thinks of as a character in
>> their script.
>>
>> So we have "a user" versus "everyone...in their script" - is the
>> difference intentional? Probably not. Anyway the definitions are
>> language/locale dependent.
>
> The "everyone" here aims at a shared understanding.
That's also quite obvious to me.
"A user" in grapheme (2) is at least strange.
>
> This becomes tricky in the case of Abugidas. There's certainly a
> shared understanding that the "unit of writing" is the syllable,
> rather than the individual marks, but the latter do have well-understood
> identities, not least for teaching. That's perhaps the reason why
> there's the handwaving about "minimally distinctive".
>
> In some scripts like that, users can enter multiple sequences of
> characters that resolve (for all practical purposes) into the same
> syllable. (A big part of that in some scripts is that Unicode does not
> always provide a means to normalize the order of subsidiary signs and
> marks, typically combining marks)
>
> For some tasks it would be great to have only well-formed syllables;
> but to do that, you would need to add additional interpretation on top
> of the Unicode definitions of a grapheme cluster.
>
> If you just wrap the raw combining sequences into textels, then some
> tasks might not actually get simpler. Instead of a simple rule that
> determines which alternate orderings of marks are equivalent (to
> account for users not typing them in the preferred order) you would
> have to exhaustively list all combinations and set up equivalence
> tables.
I would like to know how Swift is handling this. I still have a feeling
that the Swift characters are almost exactly my textels.
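
To make the ordering issue concrete, here is a minimal sketch in Python
(not Swift), using only the standard unicodedata module. The names
textels() and equivalent() are hypothetical helpers invented for this
illustration. textels() wraps each base code point with the marks
(general category M*) that follow it, which is only a rough stand-in for
real grapheme cluster segmentation. equivalent() compares two sequences
after canonical decomposition (NFD), which reorders marks only when their
canonical combining classes are nonzero and differ.

```python
import unicodedata


def textels(text: str) -> list[str]:
    """Group each base code point with the marks that follow it.

    Simplifying assumption: a "mark" is any code point whose general
    category starts with "M". This is NOT the full grapheme cluster
    algorithm; it only wraps raw combining sequences.
    """
    units: list[str] = []
    for ch in text:
        if units and unicodedata.category(ch).startswith("M"):
            units[-1] += ch   # attach the mark to the preceding unit
        else:
            units.append(ch)  # start a new unit (also for a leading mark)
    return units


def equivalent(a: str, b: str) -> bool:
    """True if a and b are identical after canonical decomposition (NFD)."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


if __name__ == "__main__":
    # Latin: dot below (class 220) and acute (class 230) typed in two orders.
    # The classes differ, so normalization puts them into one order.
    latin_1 = "a\u0323\u0301"
    latin_2 = "a\u0301\u0323"
    print(equivalent(latin_1, latin_2))  # True

    # Devanagari: vowel sign U (U+0941) and anusvara (U+0902) both have
    # combining class 0, so normalization never reorders them.
    deva_1 = "\u0915\u0941\u0902"
    deva_2 = "\u0915\u0902\u0941"
    print(unicodedata.combining("\u0941"), unicodedata.combining("\u0902"))
    print(equivalent(deva_1, deva_2))    # False

    # Either way, each sequence is wrapped into a single unit.
    print(len(textels(latin_1)), len(textels(deva_1)))
```

In the second case the two orders stay distinct after normalization, so
treating them as the same unit would need an extra rule or table on top
of the Unicode definitions.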
Best regards
Janusz
-- 
Prof. dr hab. Janusz S. Bien - Uniwersytet Warszawski (Katedra Lingwistyki Formalnej)
Prof. Janusz S. Bien - University of Warsaw (Formal Linguistics Department)
jsbien@uw.edu.pl, jsbien@mimuw.edu.pl, http://fleksem.klf.uw.edu.pl/~jsbien/
Received on Mon Sep 19 2016 - 01:24:27 CDT