Re: Unicode Cyrillic GHE DE PE TE in Serbian

From: Paul Keinanen (keinanen@sci.fi)
Date: Thu Jan 06 2000 - 06:46:22 EST


On Wed, 5 Jan 2000 02:33:57 -0800 (PST), "Janko Stamenovic"
<janko@teletrader.com> wrote:

>
>OK, the discussion went on to talk only about Russian.
>
>Now I have to explain what the issue is in Serbian Cyrillic.
>
>In Serbian written Cyrillic, small t is written like two Latin u characters
>glued together (only one line in the middle), with a stroke across the
>top. However, this form never appeared in printed italic. But another
>problem for Serbians is that, unlike Russians, Bulgarians or Macedonians,
>they actually use both Latin AND Cyrillic very much for their language.
>Since there are letters that look exactly the same in Latin and Cyrillic,
>for Serbians the presence of more LATIN shapes in a word can make them
>believe that they are reading Latin words, and not Cyrillic!
>
>Since both Latin and Cyrillic are used, Serbians always first "look" at the
>whole word, determine which alphabet is used, and only then read it.
>Sometimes they have to recognize the word itself: PECTOPAH is read as the
>Serbian word for restaurant written in Cyrillic only because "pectopah"
>(written in small letters, it is obviously not a Cyrillic word) does not
>exist in the language.
>
>So unfortunately we really don't see the appearance of "Latin m" and
>"Latin n" instead of "small Cyrillic t" and "small Cyrillic p" as "a small
>typographical issue".

Is this really any different from the situation where the digit 0 looks
like the Latin uppercase letter O and the digit 1 looks like the
lowercase Latin letter L in some fonts? This was a bad problem in
low-resolution (5x7 and 7x9) dot matrix fonts, causing a lot of
hard-to-find errors in programming. Sometimes a slash or a centre dot
was even added to either 0 or O to distinguish them from each other. I
think it was at IBM that the department responsible for scientific
computers added the slash to one character and the department
responsible for commercial computers added it to the other
character :-).

Some mechanical typewriters did not even have keys for the digits 0 and
1; the user was required to use the O and l letter keys to enter these
digits. This caused some problems when people accustomed to such
typewriters started to enter data into a computer system.

When higher resolutions later became common, the shapes of 0 and O,
as well as 1 and l, were modified so that they could be easily
distinguished from each other.

Returning to this Serbian font issue, it seems that the font designer
has simply copied the outlines of the Latin letters m and n for italic
small Cyrillic t and small Cyrillic p, causing the ambiguity. The
outlines for those two glyphs in the Serbian font should be slightly
altered, so that a sequence of Latin m immediately followed by small
Cyrillic t can be distinguished, at least on closer inspection of the
text.
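
The ambiguity is in the glyphs only; the underlying characters remain
distinct code points, so software can tell the scripts apart even when
a reader cannot. A minimal sketch in Python, assuming the standard
unicodedata module; the helper name scripts_in and the sample strings
are made up for illustration and do not come from any font or library:

```python
import unicodedata

def scripts_in(word):
    """Return the scripts found among the letters of word.

    Hypothetical helper: the script is guessed from the first word of
    each character's Unicode name, which is only a rough approximation.
    """
    found = set()
    for ch in word:
        if not ch.isalpha():
            continue  # ignore digits, spaces and punctuation
        name = unicodedata.name(ch, "")
        if name.startswith("CYRILLIC"):
            found.add("Cyrillic")
        elif name.startswith("LATIN"):
            found.add("Latin")
        else:
            found.add("Other")
    return found

# "PECTOPAH" typed with Latin capital letters
latin_word = "PECTOPAH"

# The Serbian word for restaurant, spelled with Cyrillic capitals
# (ER, IE, ES, TE, O, ER, A, EN)
cyrillic_word = "\u0420\u0415\u0421\u0422\u041e\u0420\u0410\u041d"

# Latin small m immediately followed by Cyrillic small te (U+0442)
mixed = "m\u0442"

print(scripts_in(latin_word))
print(scripts_in(cyrillic_word))
print(sorted(scripts_in(mixed)))
```

Such a check only helps a program, of course; a person reading the
printed page still depends on the glyph outlines being different.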

Paul


