Support for Latin ligature IJ (was another thread)

Martin J. Dürst duerst at
Thu Mar 31 00:51:55 CDT 2016

On 2016/03/31 06:42, Philippe Verdy wrote:

> The use of "ÿ" in Dutch should also be considered as an orthographic fault,
> and it should be corrected into "ij" (to solve the capitalization problem),
> but there are occurrences in Dutch of "ÿ" which are correct (notably in
> borrowed French toponyms such as "L’Haÿ-les-Roses")
> There may be similar examples in Belgium with French toponyms, but I
> suspect that those Belgian-French toponyms have their own Dutch
> "officialized" variant which would be preferable without borrowing the
> Belgian-French orthography,

I'm not too familiar with the local Belgian customs for place names, but 
in general, correspondences will not be that simple. There may be cases 
with exactly the same spelling (but different pronunciation), cases with 
simple spelling differences, cases with different words (same or 
different meanings), and so on.

> so that they will not need "ÿ", and they will
> likely use "ij" instead, meaning that the autocorrection of "ÿ" from
> possible Belgian-French toponyms into "ij" will also be correct for
> Dutch-Belgian toponyms ; it may also be correct for French-French toponyms
> like "L’Haÿ-les-Roses" transformed into "L’Haij-les-Roses" in Belgian-Dutch,
> or "L’HAIJ-LES-ROSES" if capitalized, if autocorrected this way; it would
> however be incorrect to replace there the "ĳ" (or "Ĳ") letter by the two
> letters "ij" (or "IJ") without the orthographic ligature...

I'm not an expert in French or Dutch pronunciation or orthography, but 
as far as I understand, transforming "L’Haÿ-les-Roses" to 
"L’Haij-les-Roses" would be wrong because it would lead to a wrong 
pronunciation; if anything, "L’Hij-les-Roses" would be closer.

> Out of curiosity, I looked into the Dutch Wikipedia to see how they wrote
> "L’Haÿ-les-Roses"
> and they don't transform the French "ÿ" into some Dutch "ij" (and they don't
> have any other "officialized" Dutch orthography).

With Unicode, there's less and less of a need to "officialize" such 
spellings, even though of course whether to do so or not will continue 
to depend on other factors such as culture and official policy.

> For this reason, the autocorrection of the "ÿ" letter into the "ĳ" letter
> in Dutch is disabled by default (even if it would be needed to look into
> old documents encoded with ISO8859-1).
> The situation is more complex for the autocorrection of the "ij" digram
> (extremely frequent in old documents encoded with ISO8859-1) into the plain
> "ĳ" letter, which seems to be active in various wordprocessors (but which
> causes problems with borrowed non-Dutch names).

Such problems these days can be solved by using context-sensitive 
corrections, either with something close to regular expressions 
(detecting typical Dutch spellings) or dictionaries.
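
As a rough illustration of the dictionary variant, here is a minimal Python
sketch. It is only an assumption about how such a corrector might look, not
a description of any existing word processor: the exception list NON_DUTCH
and the function names are hypothetical, and a real tool would need a much
larger dictionary. It converts the digram "ij"/"IJ" to the single letters
U+0133/U+0132 word by word, and leaves listed borrowed names untouched.

```python
import re

# Hypothetical exception list: borrowed names in which "ij" is NOT the
# Dutch letter and must stay as two separate characters.
NON_DUTCH = {"beijing", "fiji", "dijon"}

# A "word" here is a run of Unicode letters (no digits, no underscore).
WORD = re.compile(r"[^\W\d_]+")


def to_ligature(word: str) -> str:
    """Replace the digram with the single letter unless the word is an exception."""
    if word.lower() in NON_DUTCH:
        return word
    # "IJ" becomes U+0132 (capital), "ij" becomes U+0133 (small).
    return word.replace("IJ", "\u0132").replace("ij", "\u0133")


def autocorrect(text: str) -> str:
    """Apply the context-sensitive correction to every word in the text."""
    return WORD.sub(lambda m: to_ligature(m.group(0)), text)


if __name__ == "__main__":
    print(autocorrect("IJsselmeer en Beijing"))
    print(autocorrect("vrijdag in Dijon"))
```

An exception list is the simplest form of context; the opposite design, a
positive list of known Dutch words, would be safer for text that is full of
foreign names but would miss Dutch words not yet in the list.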

Regards,   Martin.
