On Thu, Sep 15 2016 at 16:36 CEST, john.w.kennedy_at_gmail.com writes:
[...]
> In the new Swift programming language, which is white-hot in the Apple
> community, Apple is moving toward a model of a transparent, generic
> Unicode that can be “viewed” as UTF-8, UTF-16, or UTF-32 if necessary,
> but in which a “character” contains however many code points it needs
> (“e” with a stacked macron, acute accent, and dieresis is
> algorithmically one “character” in Swift). Moreover,
> e-with-an-acute-accent and e followed by a combining acute accent, for
> example, compare as equal. At present, the underlying code is still
> UTF-16LE.
For several years I have been using the name "textel" (text element, in
Polish "tekstel") for such objects. I use it mostly orally, in
presentations for my students, but I have also used it in writing, e.g. in
http://bc.klf.uw.edu.pl/118/, unfortunately without a proper
definition. I gave a rudimentary definition only in my
recent paper in Polish: http://bc.klf.uw.edu.pl/480/. It states simply
(on p. 69) "an elementary text element independently of its Unicode
representation" (meaning in particular decomposed vs. precomposed). I still
hope to formulate a more satisfactory definition sooner or later :-)
I think Swift confirms that such a notion is really needed.
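The behaviour described in the quoted message can be illustrated with a
minimal sketch. It assumes Swift 4 or later, where a String is itself a
collection of Character values (in Swift 3 the same count was reached
through the characters view). The variable names are only illustrative.

```swift
// "e" followed by three combining marks: macron, acute accent, diaeresis.
let stacked = "e\u{0304}\u{0301}\u{0308}"

// One user-perceived character (a "textel"), however it is encoded.
print(stacked.count)                 // 1
print(stacked.unicodeScalars.count)  // 4 code points
print(stacked.utf16.count)           // 4 UTF-16 code units
print(stacked.utf8.count)            // 7 UTF-8 bytes (1 + 2 + 2 + 2)

// Precomposed vs. decomposed e-with-acute.
let precomposed = "\u{00E9}"         // single code point U+00E9
let decomposed = "e\u{0301}"         // "e" + combining acute accent

// String comparison uses canonical equivalence, so these are equal
// even though their code point sequences differ.
print(precomposed == decomposed)                 // true
print(precomposed.unicodeScalars.count)          // 1
print(decomposed.unicodeScalars.count)           // 2
```

In this sketch the Character count stays at 1 while the three encoded
views report different lengths, which is the distinction the notion of a
"textel" is meant to capture.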
Best regards
Janusz
-- 
Prof. dr hab. Janusz S. Bien - Uniwersytet Warszawski (Katedra Lingwistyki Formalnej)
Prof. Janusz S. Bien - University of Warsaw (Formal Linguistics Department)
jsbien@uw.edu.pl, jsbien@mimuw.edu.pl, http://fleksem.klf.uw.edu.pl/~jsbien/

Received on Thu Sep 15 2016 - 14:15:52 CDT