From: Peter Kirk
Date: Sat Jul 10 2004 - 03:26:25 CDT
On 10/07/2004 01:34, Mark Davis wrote:
>I'll try to pick out the relevant points.
>>Please do. Do you really want all those letters
>>between "e" and "f" interfiled with "e"? I surely
>>do not.
>You seem to have a misperception of what I think we should be looking at.
>What I think we should be examining is which of the items that are not
>interfiled (to use your phrasing) should be, if any. I don't think
>everything should be. In particular, I think John's list is the list we
>should be focusing on.
>>John's list?
>That was in my original mail, the one you were commenting on when you changed
>the subject line, but which you apparently didn't bother to actually
>read. Here is the text:
>>>If you look at John's suggested file for diacritic
>>>folding, there are quite a
>>>number that are not reflected in the UCA.
>>My point is made here. It is really only in
>>initial position where this is likely to be
>This is incorrect. It will make a difference in other positions. Sorting
>"Søren" after "Sozar" in a long list, if someone isn't expecting it, will
>cause problems. They look for it after "Soret", don't see it on the page,
>and assume it isn't there; fooled by the fact that it is on a completely
>different page.
I agree with you on this. I just checked this with some real data, a set
of several thousand e-mail messages from a list. One Danish participant
appears as Søren Holst in the name field of his e-mails, but signs
himself "Soren" in messages in English. If I type "Soren" into the name
search box (in Mozilla 1.7), I get no matches. This is not what I
expect, because to me, and to Søren himself when thinking in English, ø
is a variant of o. (But actually Mozilla is inconsistent: when sorting
it put Søren after Sonny but before Soshie.)
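
To make the behaviour I expected concrete, here is a rough sketch in Python of diacritic folding applied to both searching and sorting. It is only an illustration under stated assumptions: the names fold_diacritics, matches and EXTRA_FOLDS are my own, the small folding table is hypothetical and is not taken from John's file, and this is neither what Mozilla does nor what the UCA specifies. The point it shows is that ø has no canonical decomposition, so stripping combining marks after NFD normalization is not enough and an explicit mapping is needed.

```python
import unicodedata

# Hypothetical folding table for letters with no canonical decomposition.
# NFD leaves these untouched, so they must be mapped explicitly.
# (Illustrative only; not the contents of any real folding file.)
EXTRA_FOLDS = str.maketrans({
    "ø": "o", "Ø": "O",
    "đ": "d", "Đ": "D",
    "ł": "l", "Ł": "L",
})

def fold_diacritics(text: str) -> str:
    """Return text with diacritics folded away for loose matching."""
    # Step 1: decompose, so that e.g. "é" becomes "e" + U+0301.
    decomposed = unicodedata.normalize("NFD", text)
    # Step 2: drop combining marks (combining class != 0).
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Step 3: map letters such as "ø" that did not decompose at all.
    return stripped.translate(EXTRA_FOLDS)

def matches(query: str, name: str) -> bool:
    """Case-insensitive, diacritic-insensitive substring search."""
    return fold_diacritics(query).casefold() in fold_diacritics(name).casefold()

names = ["Sozar", "Søren", "Soret", "Sonny", "Soshie"]

print(matches("Soren", "Søren Holst"))  # True

# Plain code-point order puts "Søren" last, after "Sozar".
print(sorted(names))
# ['Sonny', 'Soret', 'Soshie', 'Sozar', 'Søren']

# Folded primary key, with the original string as a tie-breaker.
print(sorted(names, key=lambda n: (fold_diacritics(n).casefold(), n)))
# ['Sonny', 'Søren', 'Soret', 'Soshie', 'Sozar']
```

Using the same folded key for both search and sort is what would remove the inconsistency I saw: a name that a folded search finds is also listed where the folded spelling would put it.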
-- Peter Kirk (personal) (work)