Re: "A Programmer's Introduction to Unicode" from Janusz S. Bien on 2017-03-12 (Unicode Mail List Archive)

From: Janusz S. Bien <jsbien_at_mimuw.edu.pl>
Date: Sun, 12 Mar 2017 20:02:28 +0100

Quote - Manish Goregaokar <manish_at_mozilla.com> (Sun 12 Mar 2017
07:43:22 PM CET):

>> This is just another confirmation that the present Unicode terminology
>> is confusing.
>
> I find this to be a symptom of our pedagogy around "characters" in
> programming; most folks get taught that characters are bytes are code
> points, especially because many languages try to make this the case.
> The name "grapheme cluster" could be improved upon, but it's not the
> primary source of this confusion.

I agree that it's not the primary source. However, the pedagogy depends
on the terminology used.

If the basic notion has to be referred to in a cumbersome way as
"extended grapheme cluster" then it is easier to talk about "Unicode
characters" despite the fact that they have a rather loose relation to
real-life/user-perceived characters.
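
A minimal Python sketch of that loose relation, assuming a string that
holds a single user-perceived character written as two code points. The
helper name `describe` is hypothetical. Python's standard library does not
segment extended grapheme clusters, so the sketch only contrasts UTF-8
bytes with code points:

```python
import unicodedata

# "é" written as a base letter plus a combining acute accent (U+0065 U+0301).
decomposed = "e\u0301"

def describe(text):
    """Return (UTF-8 byte count, code point count) for a string."""
    return len(text.encode("utf-8")), len(text)

# NFC normalization composes the pair into the single code point U+00E9.
composed = unicodedata.normalize("NFC", decomposed)

print(describe(decomposed))  # (3, 2)
print(describe(composed))    # (2, 1)
```

Both strings display as one user-perceived character, yet the byte and
code point counts differ between them, and neither count is reliably 1.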

Best regards

Janusz

-- 
Prof. dr hab. Janusz S. Bień -  Uniwersytet Warszawski (Katedra  
Lingwistyki Formalnej)
Prof. Janusz S. Bień - University of Warsaw (Formal Linguistics Department)
jsbien@uw.edu.pl, jsbien@mimuw.edu.pl, http://fleksem.klf.uw.edu.pl/~jsbien/

Received on Sun Mar 12 2017 - 14:02:50 CDT
