Dan Oscarsson scripsit:
> Looking at the Unicode character data file I see that Unicode is
> inconsistent.
Obviously this needs to go in the FAQ.
> If you look at letter: 0xD8 it cannot be decomposed,
> but letter: 0xD6 can be decomposed.
>
> This is inconsistent because the glyph 0xD8 can be decomposed
> into letter o with a combining slash.
Combining-slash decompositions are considered to be over the top:
they're impossible to recombine at the glyph level accurately,
because the position of the slash varies unpredictably from one
base letter to the next.
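For anyone who wants to check the character data directly, here is a minimal sketch using Python's standard `unicodedata` module (Python is my choice for illustration, not something from this thread; the helper name `decomposition_field` is hypothetical). It reads the decomposition field for the two letters and confirms that normalization does not fold O plus a combining slash into U+00D8.

```python
import unicodedata

def decomposition_field(ch: str) -> str:
    """Return the decomposition field from the character database ('' if none)."""
    return unicodedata.decomposition(ch)

o_stroke = "\u00D8"      # LATIN CAPITAL LETTER O WITH STROKE
o_diaeresis = "\u00D6"   # LATIN CAPITAL LETTER O WITH DIAERESIS

# U+00D8 has no decomposition; U+00D6 decomposes to O + COMBINING DIAERESIS.
print(repr(decomposition_field(o_stroke)))
print(repr(decomposition_field(o_diaeresis)))

# O followed by U+0338 COMBINING LONG SOLIDUS OVERLAY is a different
# sequence: NFC leaves it alone rather than composing it into U+00D8.
overlaid = "O\u0338"
print(unicodedata.normalize("NFC", overlaid) == o_stroke)
```

The exact output depends on the Unicode version bundled with the interpreter, but these two decomposition fields go back to the early releases of the standard.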
"The line must be drawn here!" -- J.-L. Picard
> The same inconsistency exists for 0xC6 and 0xC4.
> The glyph of letter 0xC6 can be decomposed into letter a with a combining e.
U+00C6 is another boundary case: letter or ligature? But it is certainly
not equivalent to "ae" except in Latin (the language, not the script).
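The difference in treatment between the AE letter and A with diaeresis shows up in the character database as well. A small sketch, again assuming Python's standard `unicodedata` module (the variable names are mine): the AE letter has no decomposition at all, not even a compatibility mapping to "AE", while A with diaeresis round-trips through its combining sequence.

```python
import unicodedata

ae = "\u00C6"            # the AE letter
a_diaeresis = "\u00C4"   # A with diaeresis

# Print code point, formal name, and decomposition field for each letter.
for ch in (ae, a_diaeresis):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch),
          repr(unicodedata.decomposition(ch)))

# Even compatibility normalization does not turn the AE letter into "AE".
nfkd_ae = unicodedata.normalize("NFKD", ae)

# A with diaeresis is canonically equivalent to A + COMBINING DIAERESIS,
# so NFD splits it and NFC puts it back together.
nfd_a = unicodedata.normalize("NFD", a_diaeresis)
recomposed = unicodedata.normalize("NFC", "A\u0308")
print(nfkd_ae == ae, nfd_a == "A\u0308", recomposed == a_diaeresis)
```

None of this says anything about how a font should draw either letter; it only shows which sequences the standard treats as equivalent in plain text.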
> It gets more inconsistent when you consider that the letters 0xC6 and 0xC4
> are the same letter, but one is the Norwegian/Danish version and the other
> the Swedish.
In that context, yes. But they are not really equivalent in German, and even
less so in Finnish.
> Why does Unicode favor one language and not another?
It does not.
> It can get worse when a font is created: a letter a with a diaeresis
> may be a different glyph than the letter 0xC4 (which has no English name).
High-quality fonts are always language-specific: we have already learned
that proper Polish fonts use differently placed accents from their
Western European analogues. Unicode is concerned with *plain* text,
in other words, whatever cannot be abandoned without abandoning legibility.
> I have seen several bad fonts where somebody thinks that the letter
> 0xC4 is a letter a with a diaeresis and just combined the two instead
> of having a true letter 0xC4.
Inevitably so.
> Unicode needs to understand the difference between precomposed characters
> and those that are not (0xC4 is not a precomposed character, it is
> a single letter just like 0xC6).
No, it is font designers who need to know when precomposed glyphs
work and when they don't.
-- John Cowan cowan@ccil.org I am a member of a civilization. --David Brin
This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:20:53 EDT