From: Mark Davis (mark.davis@jtcsv.com)
Date: Sun Apr 27 2003 - 20:54:43 EDT
Wait just a second. The IJ digraph was added for compatibility with other standards, not necessarily
because it is really needed for Dutch. Unicode does not, in general, encode graphemes, except for
compatibility purposes; "ch" for Spanish and Slovak, for example, is not encoded.
Given the mass of Dutch data that already uses "i" + "j" to encode that grapheme, adding the "ij"
character will just confuse matters. When editing a mixture of such text, search/replace will not
treat the two as equivalent; users will sometimes have to hit backspace once to delete what appears
to be two characters, sometimes twice, etc. Bad idea.
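To make the search problem concrete, here is a minimal Python sketch. It is an illustration only, not something proposed in this thread; the sample word and the helper name fold_ij are assumptions. The single character U+0133 and the sequence "i" + "j" do not compare equal, although NFKC compatibility normalization maps the former onto the latter.

```python
import unicodedata

# Sample data (assumed for illustration): the same Dutch word encoded two ways.
LIGATURE = "\u0133stijd"   # starts with U+0133 LATIN SMALL LIGATURE IJ
DIGRAPH = "ijstijd"        # starts with plain "i" + "j"

def fold_ij(text: str) -> str:
    """Hypothetical helper: apply NFKC, which replaces U+0132/U+0133
    (and other compatibility characters) by their decompositions."""
    return unicodedata.normalize("NFKC", text)

# A plain comparison, as a naive search would do, sees two different strings.
naive_match = LIGATURE == DIGRAPH

# After folding, both spellings are the same sequence of code points.
folded_match = fold_ij(LIGATURE) == fold_ij(DIGRAPH)

# The lengths differ by one, which is the "one backspace or two" effect.
lengths = (len(LIGATURE), len(DIGRAPH))
```

Note that NFKC folds many other compatibility characters as well, so a real search tool would have to decide whether that is acceptable.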
The only concrete thing I have heard is that when titlecasing Dutch, "i" + "j" at the start of a
word should be titlecased as "I" + "J", not as "I" + "j". For that, one would request a change to
SpecialCasing.txt in the Unicode Character Database for the next version of Unicode. Kent Karlsson
proposed this some time back; it may be time to revisit it, but we would need a proposal for the
next UTC.
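As a rough sketch of the titlecasing behavior described above (Python; the function name titlecase_dutch_word is hypothetical, and this is not how SpecialCasing.txt or any particular library implements it), a Dutch-aware titlecaser could treat a word-initial "i" + "j" as one unit. It assumes every word-initial "ij" is the digraph.

```python
def titlecase_dutch_word(word: str) -> str:
    """Hypothetical helper: titlecase a single word the way the Dutch rule
    asks, i.e. a leading "i" + "j" becomes "I" + "J" rather than "I" + "j"."""
    if word[:2].lower() == "ij":
        # Treat the digraph as one letter: capitalize both halves.
        return "IJ" + word[2:].lower()
    # Default behavior: uppercase the first character only.
    return word[:1].upper() + word[1:].lower()

# Language-neutral casing gives the form the thread objects to.
default_result = "ijstijd".capitalize()

# The Dutch-aware sketch keeps the digraph together.
dutch_result = titlecase_dutch_word("ijstijd")
```

The single character U+0133 needs no special rule in this sketch, since its uppercase mapping is already U+0132.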
Märk Davis
________
mark.davis@jtcsv.com
IBM, MS 50-2/B11, 5600 Cottle Rd, SJ CA 95193
(408) 256-3148
fax: (408) 256-0799
----- Original Message -----
From: "Thomas Milo" <t.milo@chello.nl>
To: "John Hudson" <tiro@tiro.com>
Cc: "Chris Pratley" <chrispr@exchange.microsoft.com>; <Bob_Hallissy@sil.org>; <unicore@unicode.org>;
<unicode@unicode.org>; "Gerard Unger" <ungerard@wxs.nl>
Sent: Sunday, April 27, 2003 12:18
Subject: Re: [OT] multilingual support in MS products (was Re: Kurdish ghayn)
> Hi John,
>
> At 02:49 AM 4/27/2003, Thomas Milo wrote:
>
> > >Would it be possible to make the IJ/ij available at last as a single
> > >character IJ/ij for Dutch users? MS Office seems to be unaware of this
> > >character (apart from correct shifting between upper and lower case). A
> > >spell check of Ĳstijd (correct Unicode) vs. IJstijd (improvised ASCII)
> > >approves of the - erroneous! - ASCII form and does not even recognize the
> > >horrendous misspelling Ijstijd.
> > >
> > >A web search of the Dutch word IJstijd (Ice Age) indicates that the use
> > >of this essential character is still practically zero.
> >
> > Whenever I've asked Dutch colleagues (type designers and typographers)
> > about the IJ/ij characters they've always expressed amazement that these
> > characters exist and most reject the need for them. 'Just use I and J'
> > seems to be the usual response. Tom is the only Dutch colleague I've ever
> > heard express support for the use of these characters. It is true that
> > there are special rules for how the letters I and J in combination should
> > be typeset in Dutch, but the same is true of lots of digraphs in German and
> > other languages that are not encoded as distinct characters and will not
> > be. I'm far from convinced that the IJ/ij characters are necessary or that
> > their use should be encouraged.
>
> No Dutchman - whether he is involved in type or not - can be amazed by the
> existence of IJ. If his name happened to begin with IJ, he would not be able
> to look up his own name in a telephone directory. With no exception IJ is
> taught in all schools as part of our handwriting as a ligature - just
> checked with my daughter. I called Gerard Unger about it and he pointed out
> that IJ is surrounded by a certain ambivalence: dictionaries list it either
> with I or with Y. The latter is enough to grant it graphemic status. And -
> like anybody would - he agrees that it capitalizes as one letter. As for
> your typographer friends, they mean: just compose it out of I and J (still
> in the streets of the Netherlands one frequently observes Ü with the left
> leg broken: the ligature of I and J). But this is all talk about glyphs.
> Unicode deals with graphemes, and there IJ is already recognized as such.
>
> IJ as a character is part and parcel of Dutch orthography and included in
> the Unicode Standard at the request of the Netherlands Standardisation
> Committee. There is no need to ask approval to use IJ/ij - the only point I
> am making is, that we still don't have a convenient way of entering it.
>
> Graphemically the use of IJ involves no complex rules at all. In Dutch ALL
> combinations of letters I and J - with extremely rare exceptions in foreign
> words like "bijoux", consequently corrupted into byoux by weak spellers -
> are instances of the single grapheme IJ. As a result, the common hack to
> type capital IJ with two upper case characters causes problems with spellers
> and grammar checkers. Moreover it leads to spelling and sorting errors (in
> some dictionaries and all telephone directories IJ mix with Y, but the hack
> moves it to I); automatic capitalization produces a revolting Ij, in rotated
> text IJ come apart, etc. etc.
>
> There is no need to put up with this hack: the Unicode Standard provides the
> correct solution and the industry obliged itself to implement it.
>
> t