Re: "textels"

From: Janusz S. Bien <jsbien_at_mimuw.edu.pl>
Date: Fri, 16 Sep 2016 17:57:44 +0200

Quote - Eric Muller <eric.muller_at_efele.net> (Fri, 16 Sep 2016,
17:47:27):

> On 9/16/2016 8:30 AM, Janusz S. Bien wrote:
>> Quote - Eric Muller <eric.muller_at_efele.net> (Fri, 16 Sep 2016,
>> 17:03:54):
>>
>>> On 9/16/2016 6:52 AM, Janusz S. Bień wrote:
>>>> (when working on a corpus of historical Polish we
>>>> noticed some cases where standard Unicode equivalence was not
>>>> convenient).
>>>
>>> I'm very interested to know more about those cases.
>>
>> For our search engine we were unable to use compatibility
>> equivalence "out of the box" for splitting the ligature because it
>> also converted long s to short s while we wanted to preserve the
>> distinction.
>
> I am interested in the problems with *canonical* equivalence. I
> thought that you were talking about those before.

I apologize for the confusion; that was my fault. I tend to answer too
quickly and not precisely enough :-(

On the other hand, I'm not sure canonical equivalence is always what I
want and expect, but I don't have specific examples at hand.
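
To make the compatibility-equivalence problem quoted above concrete,
here is a minimal Python sketch. It is an illustration only, not the
code of the search engine mentioned in the thread. The helper name
split_ligatures and the choice of the Latin ligature range
U+FB00..U+FB06 are assumptions made for the example; the thread does
not say which ligature was involved.

```python
import unicodedata

def split_ligatures(text: str) -> str:
    """Split Latin presentation-form ligatures (U+FB00..U+FB06) one level
    deep, so that a long s (U+017F) inside a ligature is kept as long s."""
    out = []
    for ch in text:
        if "\ufb00" <= ch <= "\ufb06":
            # decomposition() returns e.g. "<compat> 017F 0074";
            # the first field is the tag, the rest are hex code points.
            fields = unicodedata.decomposition(ch).split()
            out.append("".join(chr(int(cp, 16)) for cp in fields[1:]))
        else:
            out.append(ch)
    return "".join(out)

# Sample: fi ligature, a bare long s, a space, then the long-s-t ligature.
sample = "\ufb01\u017f \ufb05"

# Full compatibility normalization splits the ligatures but also
# folds every long s into an ordinary s.
nfkc = unicodedata.normalize("NFKC", sample)

# The one-level split keeps the long s distinct.
custom = split_ligatures(sample)

print(nfkc.encode("unicode_escape").decode())
print(custom.encode("unicode_escape").decode())
```

With this approach the ligatures are expanded for searching while
U+017F survives, which is the distinction NFKC/NFKD cannot preserve on
its own.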

Regards

Janusz

-- 
Prof. dr hab. Janusz S. Bień -  Uniwersytet Warszawski (Katedra  
Lingwistyki Formalnej)
Prof. Janusz S. Bień - University of Warsaw (Formal Linguistics Department)
jsbien@uw.edu.pl, jsbien@mimuw.edu.pl, http://fleksem.klf.uw.edu.pl/~jsbien/
Received on Fri Sep 16 2016 - 10:58:01 CDT
