This is not always the right thing to do. For example, with personal names the
person involved may decide whether he prefers the old (AA) spelling or the new
Å. In any case they are equivalent.
Jony
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]]On Behalf Of Carl W. Brown
> Sent: Sunday, September 09, 2001 4:39 AM
> To: [email protected]
> Subject: RE: [OT] o-circumflex
>
>
> Asmus,
>
> This discussion reminds me of my ill-fated efforts to produce a manageable
> set of rules to do automatic title casing starting with French text. It
> would have required either special dictionaries or entering the text in a
> special way. If special text was used, one could enter it in the proper
> title case to begin with.
>
> If you are entering Danish city names then enter it as Ålborg. You should
> only use Aalborg where the font does not support Å. For matching logic you
> can equate Å to Aa then the issue of compound words goes away.
>
> Carl
>
> > -----Original Message-----
> > From: [email protected] [mailto:[email protected]]On
> > Behalf Of Asmus Freytag
> > Sent: Saturday, September 08, 2001 5:56 PM
> > To: Mark Davis; [email protected]; Francesco Zappa Nardelli
> > Subject: Re: [OT] o-circumflex
> >
> >
> > At 02:45 PM 9/8/01 -0700, Mark Davis wrote:
> > >If you use a Danish tailoring of the UCA that equates Å and AA (at least
> > >at a primary and secondary level), then they will sort the same way. A
> > >string search that uses the same tailoring will also find "Ålborg" when
> > >given "Aalborg" (and vice versa).
> >
> > But if you do this, all compound words starting with "data" and
> > continuing with another word starting with "a" will be sorted incorrectly!
> >
> > To achieve this effect, you would have to mark which AAs are A-Rings and
> > which ones are accidental adjacencies. In Danish one can use the SHY (soft
> > hyphen) to break the latter, as these accidental pairs occur at legal word
> > break points. In fact, that's the recommended solution, but it requires
> > that the input data are in a specific form.
> >
> > A./
> >
>
>
>
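
The matching idea discussed in the quoted messages (treat Å and Aa as
equal, and use a soft hyphen to mark an accidental "aa" at a word break)
can be sketched roughly as below. This is only an illustration in Python,
not an implementation of the UCA or of any Danish tailoring; the function
name match_key and the sample word "dataanalyse" are assumptions made for
the example.

```python
import unicodedata

SHY = "\u00ad"  # SOFT HYPHEN, used here to mark an accidental "aa"


def match_key(text: str) -> str:
    """Return a lowercase key in which Å/å and Aa/aa compare equal.

    Hypothetical helper: a soft hyphen between two a's prevents the fold
    and is itself dropped from the key.
    """
    # NFC turns "A" + COMBINING RING ABOVE into the single letter Å.
    text = unicodedata.normalize("NFC", text)
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if ch in ("Å", "å"):
            out.append("å")
            i += 1
        elif ch in ("A", "a") and nxt in ("A", "a"):
            # Adjacent a's with no SHY between them are read as A-ring.
            out.append("å")
            i += 2
        elif ch == SHY:
            i += 1  # invisible break marker, not part of the key
        else:
            out.append(ch.lower())
            i += 1
    return "".join(out)


if __name__ == "__main__":
    print(match_key("Ålborg") == match_key("Aalborg"))
    print(match_key("dataanalyse"))           # unmarked compound is folded
    print(match_key("data" + SHY + "analyse"))  # SHY keeps the two a's apart
```

Without the soft hyphen the compound is folded as if it contained an
A-ring, which is the false match described above; with it, the two a's
stay separate. Real sorting would still need a proper collation library.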