From: Jukka K. Korpela (jkorpela@cs.tut.fi)
Date: Fri May 06 2005 - 09:09:33 CDT
On Sun, 1 May 2005, David Ulbrich wrote:
> Has anyone come across the problem with accented/acuted vowels and
> iota-vowels in Russian/Ukrainian/Belarusian...? Though only used in
> textbooks and dictionaries as standard, the absence of these characters
> brings about really difficult problems in printing, often solved in a
> quite hardly acceptable way. Combining these characters with diacritic
> combination signs really does not give good results, and I do believe
> this would deserve separate signs.
As others have noted, good-quality implementations are possible even
though these letters are not encoded as separate Unicode characters
and can be represented in Unicode only as a base character followed by
a combining diacritic mark. But commonly used software, even when it is
capable of somehow displaying such combinations, is indeed rather
poor at rendering them.
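To make the encoding point concrete, here is a minimal sketch in Python using the standard unicodedata module. The helper names compose and describe are mine, chosen only for illustration. It normalizes a base letter plus U+0301 COMBINING ACUTE ACCENT to NFC and lists the resulting code points.

```python
import unicodedata

ACUTE = "\u0301"  # U+0301 COMBINING ACUTE ACCENT


def compose(base: str, mark: str = ACUTE) -> str:
    """Return base + mark normalized to NFC (composed where possible)."""
    return unicodedata.normalize("NFC", base + mark)


def describe(text: str) -> list:
    """List the Unicode name of every code point in text."""
    return [unicodedata.name(ch) for ch in text]


latin_o = compose("o")          # Latin o followed by the combining acute
cyrillic_o = compose("\u043e")  # Cyrillic o followed by the combining acute

print(len(latin_o), describe(latin_o))
print(len(cyrillic_o), describe(cyrillic_o))
```

The Latin pair collapses into the single precomposed character U+00F3, while the Cyrillic pair stays as two code points, because no precomposed Cyrillic o with acute exists. Whatever displays the text therefore has to position the accent itself.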
I don't think it would be useful, or even realistic, to add such
characters to Unicode: the general policy seems to be that new precomposed
characters will not be added. This saves work and coding space, and it
helps to avoid long discussions. After all, commonly used characters with
diacritic marks have already been incorporated into Unicode as precomposed
characters, so the rest are rather specialized. Well, Cyrillic letters
with diacritics aren't _that_ rare - as you mention, they appear in
textbooks and dictionaries (and grammars), and occasionally even in normal
text (e.g., I've seen an accent on Cyrillic o in the word that is
transliterated as "bolshaya", since in this word, the stress is
distinctive between the meanings 'big' and 'bigger').
If we think that the characters deserve "characterhood" in Unicode, the
natural step would be to define names for them as described in UAX #34,
"Unicode Named Character Sequences"
(http://www.unicode.org/reports/tr34/).
I was actually somewhat surprised to see that the list of currently
defined named character sequences does not contain any Cyrillic letters
with a diacritic mark. Maybe the idea has not become popular. Admittedly,
defining such a sequence does not guarantee anything and has no immediate
effect, but it might be a hint to implementors that the characters need
special attention. The list _could_ be used so that separate
glyphs are designed for those characters, instead of relying on the
general algorithms that handle the rendering of diacritic marks.
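As a rough illustration of how software might consult such a list, here is a sketch that assumes Python 3.3 or later, where unicodedata.lookup also resolves named sequences. The helper lookup_sequence is my own name, and "CYRILLIC SMALL LETTER O WITH ACUTE" is a hypothetical name made up for this example, not a defined sequence as far as I know.

```python
import unicodedata
from typing import Optional


def lookup_sequence(name: str) -> Optional[str]:
    """Return the character or named sequence for name, or None if undefined."""
    try:
        return unicodedata.lookup(name)
    except KeyError:
        return None


# A named sequence from the Unicode data files (Latin r + combining tilde).
r_tilde = lookup_sequence("LATIN SMALL LETTER R WITH TILDE")

# Hypothetical name, invented here for illustration only.
cyrillic_o_acute = lookup_sequence("CYRILLIC SMALL LETTER O WITH ACUTE")

for label, value in [("r with tilde", r_tilde),
                     ("Cyrillic o with acute", cyrillic_o_acute)]:
    if value is None:
        print(label, "-> no such name")
    else:
        print(label, "->", ["U+%04X" % ord(ch) for ch in value])
```

A font or rendering engine could, in principle, use a lookup of this kind to decide which sequences deserve a dedicated glyph, but nothing in the list obliges it to do so.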
-- Jukka "Yucca" Korpela, http://www.cs.tut.fi/~jkorpela/