> At 10:55 99-06-21 -0700, Figge, Donald wrote:
> >Because these two characters are unified, the composition software needs to
> >be smart enough to know that a word can be divided between two vowels when
> >one of them has a diaeresis mark, but not necessarily if the same mark is
> >intended to serve as an umlaut.
> >
> >The argument that alphabetic characters are pronounced differently in
> >various languages but still have the same code point misses the point of my
> >original question which is why unification when the umlaut and diaeresis
> >have different basic functionalities.
>
This is IMHO largely a question of pragmatics.
On the one hand, DIN 5007 (the German ordering standard) indeed
distinguishes between umlaut and trema, handling them quite differently.
For example, in names an a with umlaut is mapped to ae on level 1, whereas
an a with trema is mapped to a (that is, a German and a French author are
treated distinctly even if seemingly spelled alike). One awkward consequence of
this is that one of the best accepted national ordering standards cannot
really become a profile of ISO/IEC FCD 14651.
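
To make the level 1 rule above concrete, here is a minimal sketch of such a key function. It is only an illustration, not the DIN 5007 algorithm itself: the function name `level1_key` and the flag `mark_is_umlaut` are hypothetical, the extension from a to o and u is my assumption, and the caller has to supply the umlaut/trema distinction from outside, since the unified code points do not carry it.

```python
import unicodedata

# Assumed expansion table: only "a with umlaut -> ae" is stated above;
# treating o and u the same way is an assumption of this sketch.
UMLAUT_EXPANSION = {"\u00e4": "ae", "\u00f6": "oe", "\u00fc": "ue"}


def level1_key(name: str, mark_is_umlaut: bool) -> str:
    """Return a hypothetical level 1 sort key for a name.

    mark_is_umlaut is out-of-band knowledge: True if the two dots in
    this name are umlauts, False if they are tremas.
    """
    key = []
    # NFC first, so that base letter + U+0308 becomes one precomposed letter.
    for ch in unicodedata.normalize("NFC", name).lower():
        if mark_is_umlaut and ch in UMLAUT_EXPANSION:
            key.append(UMLAUT_EXPANSION[ch])
        else:
            # Trema (and any other mark): drop combining marks, keep the base.
            decomposed = unicodedata.normalize("NFD", ch)
            key.append("".join(c for c in decomposed
                               if not unicodedata.combining(c)))
    return "".join(key)


print(level1_key("M\u00fcller", True))   # German name, umlaut
print(level1_key("Ha\u00fcy", False))    # French name, trema
```

The same spelling yields two different keys depending on the flag, which is exactly the information a key computed from the unified characters alone cannot recover.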
Most German library software (and software dedicated to fine typography)
has separate encodings for umlaut and trema (e.g. in our product, TUSTEP,
the former is encoded as ^a, ^o, ^u and the latter as %:a, %:o, %:u, which
poses a serious problem for our filter to Unicode, as we cannot guarantee
the round-trip convertibility our customers would like to see). German
typesetters used to have two quite distinguishable glyphs for the
diaeresis and the umlaut (alas, that's gone with the advent of
PostScript -- the direct result of insufficient internationalization in a
de facto industry standard).
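
The round-trip problem can be shown with a toy converter. This is a sketch under stated assumptions and not our actual filter: it handles only the six lowercase codes quoted above, and the function names `tustep_to_unicode` and `unicode_to_tustep` are made up for the illustration.

```python
import re

# Both source encodings end up on the same unified code points.
PRECOMPOSED = {"a": "\u00e4", "o": "\u00f6", "u": "\u00fc"}

# Matches "^a", "^o", "^u" (umlaut) and "%:a", "%:o", "%:u" (trema).
CODE = re.compile(r"(?:%:|\^)([aou])")


def tustep_to_unicode(text: str) -> str:
    """Hypothetical forward filter: umlaut and trema codes collapse."""
    return CODE.sub(lambda m: PRECOMPOSED[m.group(1)], text)


def unicode_to_tustep(text: str) -> str:
    """Hypothetical reverse filter.

    It has to pick one code per character; this sketch always picks
    the umlaut code, which is an arbitrary choice.
    """
    reverse = {char: "^" + base for base, char in PRECOMPOSED.items()}
    return "".join(reverse.get(ch, ch) for ch in text)


original = "Ha%:uy"                       # trema in the source encoding
there = tustep_to_unicode(original)       # the distinction is lost here
back = unicode_to_tustep(there)
print(original, there, back, back == original)
```

Whatever default the reverse direction chooses, one of the two source encodings will not survive the trip unless the distinction is carried somewhere outside the character data.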
On the other hand, the knowledge of these facts is vanishing in Germany
itself and the overwhelming majority of people would not be able to
distinguish nowadays between these two diacritics (and I maintain that
these *are* in principle diacritics as different as acute and grave) --
for that matter, most would not know what a trema is in the first place.
Therefore, a lot of confusion would be likely to result from a
disunification of these two diacritics, and this confusion would in all
likelihood outweigh the advantages.
Alain is right that language tagging might be the best way out of this
dilemma.
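
As a rough illustration of what language tagging could buy us, the sketch below attaches a language tag to each run of text and derives the role of the two dots from it. Everything here is hypothetical: the `TaggedRun` type, the `diaeresis_role` function, and above all the assumption that only runs tagged as German carry umlauts while every other language gets a trema, which is far too coarse for real data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedRun:
    """A run of text plus a language tag such as "de", "fr" or "de-AT"."""
    text: str
    lang: str


# Assumption of this sketch: only German runs are read as umlaut.
UMLAUT_LANGUAGES = {"de"}


def diaeresis_role(run: TaggedRun) -> str:
    """Return "umlaut" or "trema" for the two dots in this run."""
    primary = run.lang.split("-")[0].lower()  # "de-AT" -> "de"
    return "umlaut" if primary in UMLAUT_LANGUAGES else "trema"


runs = [TaggedRun("M\u00fcller", "de"), TaggedRun("Ha\u00fcy", "fr")]
for run in runs:
    print(run.text, run.lang, diaeresis_role(run))
```

The character data stays unified; sorting, hyphenation and conversion back to a two-encoding system would consult the tag instead.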
Best regards,
Marc Kuester
--
Marc Wilhelm Kuester
Computing Centre of the University of Tuebingen
Dept. Literary and Documentary Data Processing
Waechterstr. 76
D-72074 Tuebingen
Tel.: +49 / 7071 / 29-70348
Fax: +49 / 7071 / 29-5912
EMail: marc.kuester@zdv.uni-tuebingen.de