From: Hans Aberg (haberg@math.su.se)
Date: Mon Nov 24 2008 - 15:27:10 CST
On 24 Nov 2008, at 19:19, Jukka K. Korpela wrote:
>> Perhaps one only needs to list the combinations that belong to
>> the proper language alphabets. In Swedish that would be
>> "ijåäöÅÄÖ". Other combinations, like é, would not be as
>> important to get right in Swedish, though it is imported from
>> French, where it does appear. But it illustrates the idea.
>
> Technically, in the Unicode sense, “i” and “j” do not
> contain a diacritic mark but are atomic (completely non-
> decomposable) characters, even though a discussion of diacritic
> marks must address the issue of what happens to the dot in them.
Just like the other letters "åäöÅÄÖ", some of which are also
producible as combining character sequences. So this is purely a
rendering question. If one puts a diacritical mark on top of "i" or
"j", the dot should be removed, which is why TeX provides dotless
versions (\i and \j). My guess is that if one were to design a font
model where diacritics can be constructed at typesetting time, it
would be convenient to do likewise.
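To illustrate the point with Python's unicodedata module (my choice of
tool, nothing TeX-specific): precomposed letters like "å" decompose
into a base letter plus a combining mark, and conversely "i" followed
by a combining acute composes to the precomposed "í" — Unicode calls
"i" and "j" "soft dotted", meaning a renderer placing a mark above
them should drop the dot, which the precomposed glyph already does.

```python
import unicodedata

# "å" (U+00E5) decomposes into "a" plus COMBINING RING ABOVE (U+030A),
# so the precomposed and the combining forms name the same letter.
assert unicodedata.normalize("NFD", "\u00E5") == "a\u030A"

# Conversely, "i" followed by COMBINING ACUTE ACCENT (U+0301) composes
# to the precomposed "í" (U+00ED); since "i" is soft dotted, the font's
# precomposed glyph shows the accent instead of the dot.
assert unicodedata.normalize("NFC", "i\u0301") == "\u00ED"

print("normalization round-trips as expected")
```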
> The description of characters used in a language or in a locale is
> addressed in the CLDR, see
> http://www.unicode.org/reports/tr35/#Character_Elements
> though very unsatisfactorily, if you ask me. It only addresses
> letters, and it defines rather arbitrarily just two character sets
> for a language. Surely, for example, “e” is more basically a
> letter in English than “é” is, but “é” in turn is more of
> an English letter than “ē” is. Moreover, the pragmatic reasons
> for defining the character repertoires contain quite irrelevant
> points like “choosing among character encodings.”
I think there is a core set that is quite fixed. The other sets are
just imports, more dynamic, and depend on what is being accepted.
Unicode itself may speed this process up, as there are no practical
restraints on using the characters.
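As a toy sketch of that distinction (the particular sets below are my
own illustrative guesses for Swedish, not CLDR data): one could
describe a language's repertoire as a fixed core plus a more dynamic
import set, and classify the letters of a text accordingly.

```python
# Illustrative guesses for Swedish; the import set is exactly the part
# that drifts over time as loanword spellings are accepted.
CORE = set("abcdefghijklmnopqrstuvwxyzåäö")
IMPORTS = set("éüè")

def classify(text: str) -> dict:
    """Bucket the letters of `text` into core, import, and other."""
    buckets = {"core": set(), "import": set(), "other": set()}
    for ch in text.lower():
        if not ch.isalpha():
            continue
        if ch in CORE:
            buckets["core"].add(ch)
        elif ch in IMPORTS:
            buckets["import"].add(ch)
        else:
            buckets["other"].add(ch)
    return buckets

b = classify("Idén är enkel")
print(sorted(b["core"]), sorted(b["import"]))  # é lands in "import"
```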
> Anyway, describing the characters commonly used in a language is
> useful for the purposes of font design. It is a difficult task,
> though, and controversial. In practice, such descriptions are
> probably more useful to people choosing between fonts than font
> designers. For example, when choosing a font for Swedish text, you
> should check that å, ä, ö, é, Å, Ä, Ö, É all look good. This
> should be self-evident, but it often isn’t. Moreover, less common
> characters are even more easily ignored. Thus, lists of characters
> used in a language (at various levels of usage) are directly useful
> for constructing test documents for font testing.
I think it is most important for the core letters to be properly
designed. But the example of "Å" being lowered in the Caledonia
font from 1967 to exactly the height of "l" shows that electronic
typesetting has already caused poor designs to happen. Font designers
probably do not keep track of these subtleties anymore.
So a typesetting model sufficiently advanced to make such adjustments
when combining characters on the fly might in fact produce better
results than what is now in use.
Hans
This archive was generated by hypermail 2.1.5 : Mon Nov 24 2008 - 15:30:45 CST