From: Carl W. Brown (cbrown@xnetinc.com)
Date: Mon Apr 28 2003 - 12:57:04 EDT
Doug,
> But SpecialCasing.txt, along with the rest of the UCD,
> defines how those algorithms should function and what entries the tables
> should have.
>
> Whatever technique Turkish-aware programs currently use to recognize (I
> and ı) and (İ and i) as case pairs, something similar should be done to
> Dutch-aware programs so they recognize IJ as an inseparable digraph with
> special casing behavior. This is *much* better than dragging the IJ and
> ij compatibility characters out of the closet.
Special casing deals with upper/lower case transformations, not specifically with a title-case routine. The Turkish and Dutch cases work there because every letter in the string is converted.
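A minimal Python sketch of the whole-string case, assuming hand-written Turkish pre-mappings applied before Python's locale-independent str.upper() and str.lower(). The names upper_tr and lower_tr are hypothetical, not part of any library:

```python
# Hypothetical helpers: Turkish-aware full-string case conversion.
# The Turkish pairs are (I, ı) and (İ, i), so they are remapped by hand
# before falling back to the default, locale-independent conversion.

DOTTED_CAPITAL_I = "\u0130"   # İ
DOTLESS_SMALL_I = "\u0131"    # ı


def upper_tr(text):
    # i -> İ and ı -> I first, then uppercase everything else.
    text = text.replace("i", DOTTED_CAPITAL_I).replace(DOTLESS_SMALL_I, "I")
    return text.upper()


def lower_tr(text):
    # İ -> i and I -> ı first, then lowercase everything else.
    text = text.replace(DOTTED_CAPITAL_I, "i").replace("I", DOTLESS_SMALL_I)
    return text.lower()
```

Because every letter is converted, no knowledge of word boundaries or word-initial position is needed here.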
I found title case to be a much more challenging problem. There are problems such as Macintosh vs. MacIntosh, but the hardest one I found was French articles. Handling those takes syntactic analysis.
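A rough Python sketch of why title case differs, assuming a hypothetical helper title_word with a made-up lang parameter. It only special-cases a word-initial Dutch IJ and makes no attempt at names or French articles:

```python
# Hypothetical helper: title-case a single word.
# Only the start of the word is uppercased, so a digraph such as the
# Dutch IJ has to be recognised explicitly as one unit.

def title_word(word, lang=""):
    if not word:
        return word
    if lang == "nl" and word[:2].lower() == "ij":
        # Treat IJ as an inseparable digraph: capitalise both letters.
        return "IJ" + word[2:].lower()
    # Default rule: first letter upper, the rest lower.
    return word[0].upper() + word[1:].lower()


print(title_word("ijsselmeer", "nl"))  # Dutch-aware result
print(title_word("ijsselmeer"))        # default rule splits the digraph
print(title_word("MACINTOSH"))         # the rule cannot know about MacIntosh
```

A rule like this has no way to choose between Macintosh and MacIntosh; that would need a name list or similar, which is outside this sketch.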
Carl