Re: Merging combining classes, was: New contribution N2676

From: Jim Allan (jallan@smrtytrek.com)
Date: Thu Oct 30 2003 - 09:48:50 CST


I offered a suggestion on cedilla and combining undercomma:

> It seems to me that Cedilla/undercomma folding would be a useful
> addition to "Character Foldings" at
> http://www.unicode.org/reports/tr30.

and Philippe Verdy responded:

> Excellent idea, however it has to be tailored by language:
>
> For example, Turkish and French (which almost always and consistently use
> preferably a cedilla) behave differently of Romanian and Latvian (which
> should use preferably a comma below).

No.

Forced tailoring by language would greatly reduce the usefulness of such
foldings for search purposes.

One wants to find matches for Romanian and Latvian personal names or
place names or individual forms using cedilla or undercomma regardless
of the language in which they are embedded.
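
To make the idea concrete, here is a minimal Python sketch of an
untailored fold. It is only an illustration under my own assumptions, not
anything taken from TR30: the names fold_cedilla and found_in are
hypothetical, and the choice to map cedilla onto comma below (rather than
the reverse) is arbitrary. The text is decomposed, U+0327 COMBINING
CEDILLA is replaced by U+0326 COMBINING COMMA BELOW, and the result is
recomposed so that both spellings produce the same search key.

```python
import unicodedata

COMBINING_CEDILLA = "\u0327"
COMBINING_COMMA_BELOW = "\u0326"


def fold_cedilla(text: str) -> str:
    """Return a search key in which cedilla and comma below are merged.

    Hypothetical helper: decompose, map U+0327 to U+0326, recompose.
    The final normalization also restores canonical mark order.
    """
    decomposed = unicodedata.normalize("NFD", text)
    folded = decomposed.replace(COMBINING_CEDILLA, COMBINING_COMMA_BELOW)
    return unicodedata.normalize("NFC", folded)


def found_in(query: str, text: str) -> bool:
    """Substring search on folded keys, with no language tailoring."""
    return fold_cedilla(query) in fold_cedilla(text)


# The same Romanian place name spelled with s-cedilla and with s-comma.
with_cedilla = "Timi\u015foara"
with_comma = "Timi\u0219oara"

print(with_cedilla == with_comma)
print(found_in(with_cedilla, "a train to " + with_comma))
```

The fold is applied to both the query and the text, so the search does
not need to know which language either one is in.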

Similarly, Turkish forms normally spelled with cedilla would be found
regardless of language even if an undercomma rather than a cedilla had
been used in the spelling (perhaps by error or perhaps purposely to
adapt a name to Romanian or Latvian style).

One wants cedilla and undercomma to match in a search in legacy code
pages regardless of the transliteration table to Unicode that is used by
a particular application.
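
The legacy code page case can be sketched the same way. This is a hedged
illustration, not a description of any particular application: it assumes
Python's bundled iso8859_2 and iso8859_16 codecs as two possible mapping
tables for the same stored byte, and fold_cedilla is again a hypothetical
helper that merges U+0327 into U+0326.

```python
import unicodedata


def fold_cedilla(text: str) -> str:
    """Hypothetical fold: treat combining cedilla as comma below."""
    decomposed = unicodedata.normalize("NFD", text)
    return unicodedata.normalize("NFC", decomposed.replace("\u0327", "\u0326"))


# One legacy byte string, decoded through two different mapping tables.
raw = b"Timi\xbaoara"
via_latin2 = raw.decode("iso8859_2")    # 0xBA becomes U+015F (s with cedilla)
via_latin10 = raw.decode("iso8859_16")  # 0xBA becomes U+0219 (s with comma below)

# Without folding the two decodings differ; with folding they match.
print(via_latin2 == via_latin10)
print(fold_cedilla(via_latin2) == fold_cedilla(via_latin10))
```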

Jim Allan


