RE: [OT] o-circumflex

From: Carl W. Brown (cbrown@xnetinc.com)
Date: Sat Sep 08 2001 - 21:38:57 EDT


Asmus,

This discussion reminds me of my ill-fated efforts to produce a manageable
set of rules for automatic title casing, starting with French text. It
would have required either special dictionaries or entering the text in a
special way. And if the text has to be entered specially anyway, one could
just as well enter it in the proper title case to begin with.

If you are entering Danish city names, enter them with Å, as in Ålborg. You
should only use Aalborg where the font does not support Å. For matching logic
you can equate Å to Aa; then the issue of compound words goes away.
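
As a rough illustration of the matching idea, here is a minimal Python sketch.
The helper name match_key is hypothetical, and dropping soft hyphens is my
own assumption, not something required by the rule above. It folds Å/å to
"aa" before comparing, so either spelling finds the other.

```python
import unicodedata

SOFT_HYPHEN = "\u00ad"


def match_key(text: str) -> str:
    """Hypothetical helper: fold Å/å and Aa/aa to one key for loose matching."""
    # NFC first, so that "A" + combining ring (U+030A) becomes the single letter Å.
    text = unicodedata.normalize("NFC", text).casefold()
    # Equate å with aa: after this step both spellings share the same key.
    text = text.replace("å", "aa")
    # Assumption: soft hyphens are invisible break hints, so ignore them when matching.
    return text.replace(SOFT_HYPHEN, "")


if __name__ == "__main__":
    print(match_key("Ålborg") == match_key("Aalborg"))
```

Because the fold goes from Å to aa and never the other way, a compound with
an accidental "aa" keeps its own spelling as its key and is not disturbed.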

Carl

> -----Original Message-----
> From: unicode-bounce@unicode.org [mailto:unicode-bounce@unicode.org]On
> Behalf Of Asmus Freytag
> Sent: Saturday, September 08, 2001 5:56 PM
> To: Mark Davis; unicode@unicode.org; Francesco Zappa Nardelli
> Subject: Re: [OT] o-circumflex
>
>
> At 02:45 PM 9/8/01 -0700, Mark Davis wrote:
> >If you use a Danish tailoring of the UCA that equates Å and AA (at least at
> >a primary and secondary level), then they will sort the same way. A string
> >search that uses the same tailoring will also find "Ålborg" when given
> >"Aalborg" (and vice versa).
>
> But if you do this, all compound words starting with "data" and continuing
> with another word starting with "a" will be sorted incorrectly!
>
> To achieve this effect, you would have to mark which AAs are A-Rings and
> which ones are accidental adjacencies. In Danish one can use the SHY (soft
> hyphen) to break the latter, as these accidental pairs occur at legal word
> break points. In fact, that's the recommended solution, but it requires
> that the input data are in a specific form.
>
> A./
>
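
The soft-hyphen approach described in the quoted message can be sketched
roughly as follows. This is a toy sort key under stated assumptions, not a
real UCA tailoring: the function name danish_sort_key, the simplified
alphabet table, and the sample word "dataarkiv" are all illustrative. An
unmarked "aa" is treated as å (which sorts last in Danish), while an "a"
and "a" separated by SHY stay two ordinary letters.

```python
SOFT_HYPHEN = "\u00ad"

# Simplified Danish letter order: a..z, then æ, ø, å at the end.
ALPHABET = "abcdefghijklmnopqrstuvwxyzæøå"
RANK = {ch: i for i, ch in enumerate(ALPHABET)}


def danish_sort_key(word: str) -> tuple:
    """Hypothetical toy key: 'aa' sorts as å unless a soft hyphen splits it."""
    text = word.casefold()
    # Unmarked "aa" is taken to be an A-ring spelled out.
    text = text.replace("aa", "å")
    # Only now drop the soft hyphens, so "a<SHY>a" survives as two letters.
    text = text.replace(SOFT_HYPHEN, "")
    # Characters outside the table sort after all letters (an arbitrary choice).
    return tuple(RANK.get(ch, len(ALPHABET)) for ch in text)


if __name__ == "__main__":
    words = ["Aalborg", "Zebra", "data\u00adarkiv", "datablad", "Odense"]
    for w in sorted(words, key=danish_sort_key):
        print(w.replace(SOFT_HYPHEN, ""))
```

Without the soft hyphen, "dataarkiv" would be keyed as "datårkiv" and land
after "datablad", which is the mis-sort the quoted message warns about.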