From: Mark Davis (mark.davis@jtcsv.com)
Date: Mon May 26 2003 - 11:13:34 EDT
> Because there ARE words in Dutch where the combination i+j is not
> the same as ij (e.g. "bijectie"), and I wouldn't know how to
> formulate those situations in SpecialCasing.txt.
But this shouldn't be an issue. There are three possible cases:
bijectie
BIJECTIE
Bijectie
You would never have "Ij" in a word like that anyway, right? (except
something truly bizarre like "bIjEcTiE"...) So the only problem for
Dutch comes in the titlecasing (aka initial-caps) of "ij". So the
question then becomes:
* Are there any Dutch words of the form "ijx" (i.e. *starting* with
"ij") that are titlecased as "Ijx" (where x is any string of 0 or more
characters)?
Only for those words would you need to distinguish between normal
"IJ"/"ij" and
U+0132 (Ĳ) LATIN CAPITAL LIGATURE IJ
or
U+0133 (ĳ) LATIN SMALL LIGATURE IJ
Otherwise the rule would be that in titlecasing, uppercase any "j"
after "I".
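
A minimal sketch of that rule in Python, for illustration only. It
assumes the answer to the question above is "no" (no Dutch word starting
with "ij" is titlecased as "Ijx"). The function name dutch_titlecase and
the sample word "ijsselmeer" are assumptions of this sketch, not anything
defined in SpecialCasing.txt.

```python
def dutch_titlecase(word: str) -> str:
    """Titlecase a single word, treating an initial "ij" as one unit.

    Assumption: every word-initial "ij" is the Dutch digraph, so both
    letters are capitalized. An "ij" elsewhere in the word is untouched.
    """
    if not word:
        return word
    lowered = word.lower()
    if lowered.startswith("ij"):
        # The rule from the message: uppercase the "j" after the initial "I".
        return "IJ" + lowered[2:]
    # Default behaviour: titlecase only the first character.
    return lowered[0].title() + lowered[1:]


if __name__ == "__main__":
    for w in ("ijsselmeer", "bijectie", "BIJECTIE", "\u0133sselmeer"):
        print(w, "->", dutch_titlecase(w))
```

Here "bijectie" becomes "Bijectie", because the i+j pair is not
word-initial and so never follows a capital "I". A word spelled with
U+0133 needs no special rule at all, since the ordinary titlecase mapping
of U+0133 is already U+0132.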
Mark
__________________________________
http://www.macchiato.com
► “Eppur si muove” ◄
----- Original Message -----
From: "Pim Blokland" <pblokland@planet.nl>
To: "Mark Davis" <mark.davis@jtcsv.com>
Sent: Monday, May 26, 2003 07:12
Subject: Re: Dutch IJ, again
> Mark Davis schreef:
>
> > > Why didn't I find a special casing rule for the *pair* of
> > > characters "ij" with Dutch (nl) in the UCD ?
> >
> > You didn't find it because although various people have
> > muttered about it in the past, nobody has yet made a
> > formal proposal to the UTC, listing all the specific changes
> > that would be needed for the text and data files.
>
> Oh... I assumed that what had happened was that the solution for the
> casing problem was to include ĳ and Ĳ in Unicode as singular
> codepoints.
> Because there ARE words in Dutch where the combination i+j is not
> the same as ij (e.g. "bijectie"), and I wouldn't know how to
> formulate those situations in SpecialCasing.txt.
>
> In the current situation, we DO have correctly cased codepoints (ĳ
> and Ĳ) if we need them, and we have the i+j where we don't need the
> "ij" sound, and if we type ij where we should have typed ĳ, it's our
> own fault, so other than muttering about awkward input methods for
> U+0133 and fonts that display U+0133 as a square, I don't think
> anything real will change soon.
>
> Pim Blokland
>
>