Lithuanian (was Re: Transliteration of Arabic characters into English)

From: Edward Cherlin (edward.cherlin.sy.67@aya.yale.edu)
Date: Sun May 14 2000 - 21:32:59 EDT


At 07:44 -0800 5/12/2000, brendan_murray@lotus.com wrote:
>Vladas Tumasonis <vladas.tumasonis@maf.vu.lt> wrote:
> > We really do have a problem adding Lithuanian
> > accented letters to Unicode.

As I understand the principles of Unicode, the proposal to add code
points for Lithuanian accented letters will be and should be
rejected. The encoding of numerous accented letters from preexisting
standards can give the erroneous impression that accented letters are
supposed to have Unicode code points. There is in fact no way that
all accented letters could be encoded, since combining accents are
productive. They can be used in any new combinations that anyone
wants.

The Unicode Standard Version 3.0, pp. 17-18, explains that the
accented letters presently in Unicode are compatibility characters,
provided for round-trip conversion from an earlier standard to
Unicode and back. New files should use the canonical decompositions
of such letters.
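
To make the point about decompositions concrete, here is a minimal sketch in Python using the standard unicodedata module. The choice of letters and the helper name code_points are mine, for illustration only. It shows a precomposed letter breaking down into its canonical decomposition, and a Lithuanian accented letter that stays a sequence even after composition because no single code point exists for it.

```python
import unicodedata

def code_points(text):
    """Return the code points of text as a list of 'U+XXXX' strings."""
    return [f"U+{ord(ch):04X}" for ch in text]

# A precomposed letter inherited from earlier standards: a with ogonek.
precomposed = "\u0105"
decomposed = unicodedata.normalize("NFD", precomposed)

# A Lithuanian accented letter with no single code point of its own:
# e, combining dot above, combining acute.
accented = "e\u0307\u0301"
recomposed = unicodedata.normalize("NFC", accented)

print(code_points(decomposed))  # ['U+0061', 'U+0328']
print(code_points(recomposed))  # ['U+0117', 'U+0301']
```

The second result is still two code points: composition goes as far as e with dot above and leaves the acute as a combining mark.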

The glyphs for the accented Lithuanian characters do not need Unicode
code points. Current font formats allow for tables of glyph
properties, so that rendering software can determine which glyphs to
use for rendering sequences of combining characters.

As a temporary measure while waiting for correctly implemented fonts,
you can use the Private Use Area to give your glyphs Unicode code
point numbers.
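
A rough sketch of how such a stopgap might look, again in Python. The Private Use Area assignments below are hypothetical, chosen only for illustration; they mean nothing outside a private agreement between the parties exchanging the text. The function names are my own as well. Text is converted to private-use code points for a font that lacks proper glyph tables, and back to ordinary combining sequences for interchange.

```python
import unicodedata

# Hypothetical assignments, for illustration only. Keys are canonical
# decompositions (NFD); values are arbitrary Private Use Area code points.
PUA_FOR_SEQUENCE = {
    "a\u0328\u0301": "\uE000",  # a + ogonek + acute
    "e\u0307\u0303": "\uE001",  # e + dot above + tilde
    "u\u0304\u0301": "\uE002",  # u + macron + acute
}
SEQUENCE_FOR_PUA = {pua: seq for seq, pua in PUA_FOR_SEQUENCE.items()}

def to_private_use(text):
    """Replace known combining sequences with private-use code points."""
    # Decompose first so precomposed and decomposed input match the same keys.
    text = unicodedata.normalize("NFD", text)
    for sequence, pua in PUA_FOR_SEQUENCE.items():
        text = text.replace(sequence, pua)
    return text

def from_private_use(text):
    """Restore the combining sequences (in decomposed form)."""
    return "".join(SEQUENCE_FOR_PUA.get(ch, ch) for ch in text)

private = to_private_use("\u0105\u0301")   # a-ogonek followed by acute
restored = from_private_use(private)
```

Converting back before the text leaves your own system is the important half, since nobody else will know what your private-use code points mean.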

>See
> > http://www.mif.vu.lt/tk4/lithacc/
>
>I stand corrected: I was not aware of the extended Lithuanian characters -
>I only knew what this document refers to as the "main alphabet". One thing
>is unclear from these documents: are the characters of the extended
>alphabet used as part of the Lithuanian language, or are they used to
>indicate pronunciation?

http://www.mif.vu.lt/tk4/lithacc/2-1.htm states:

  Usage of accented letters goes back to the first Lithuanian writings. The
  first Lithuanian books were accented, e.g. "Kathechismas" (1595) and
  "Postilla catholicka" (1599). At present, the publishing practice all
  dictionaries, special vocabularies and encycklopaediae are accented.
  Accented letters are used in textbooks for schools, reference books,
  linguistic texts, and in publication of laws.

So the accents are used nowadays to indicate pronunciation, and are
not mandatory. Nevertheless, they appear to be part of the language.
The analogy is not complete, but there is a similarity with Hebrew
vowels, which are not required for ordinary texts, but are used when
exactness is important.

>I tried some of the combining characters from the list in Notes, and they
>displayed OK, although my font is rather ugly. I think the main issue with
>these characters and, in fact, with all combining characters, is the input
>method. It's easy enough to input them into a document using charmap, but
>what's really needed is some input method that's easy to use. Of course one
>needs a font that'll process the data correctly, but it's probably easier
>to find that than to find some acceptable input method.
>
>B

The standard extended Latin alphabet input methods will be quite
adequate. Lithuanian keyboard layouts can have whatever combinations
of precomposed characters and combining accents the Lithuanians are
comfortable with, including an i that does not lose its dot when
accented.
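
As an illustration only, here is a small Python sketch of what a dead-key style input method could emit. The key assignments in DEAD_KEYS and the function name type_accented are hypothetical, not taken from any real keyboard layout. The one Unicode fact it relies on is that a dot retained over an accented i is represented by inserting COMBINING DOT ABOVE (U+0307) before the accent.

```python
import unicodedata

# Hypothetical dead-key table: key pressed -> combining accent emitted.
DEAD_KEYS = {"`": "\u0300", "'": "\u0301", "~": "\u0303"}
COMBINING_DOT_ABOVE = "\u0307"
# Lowercase letters whose dot should survive under an accent:
# i, j, and i with ogonek.
KEEPS_DOT = {"i", "j", "\u012F"}

def type_accented(base, dead_key):
    """Return the text produced by pressing dead_key and then base."""
    accent = DEAD_KEYS[dead_key]
    if base in KEEPS_DOT:
        # An explicit dot above keeps the dot visible under the accent.
        sequence = base + COMBINING_DOT_ABOVE + accent
    else:
        sequence = base + accent
    # Compose where a precomposed character exists; otherwise the
    # combining sequence is left as it is.
    return unicodedata.normalize("NFC", sequence)

dotted_i_acute = type_accented("i", "'")
a_acute = type_accented("a", "'")
```

Here dotted_i_acute stays a three-code-point sequence (i, dot above, acute), while a_acute comes out as the single precomposed character U+00E1. The user sees no difference between the two cases, which is the point.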

Unfortunately the only Lithuanian keyboard layout I have access to is
the rather feeble VOA Lithuanian keyboard in Unitype Global Writer.
It includes a combining ogonek and the precomposed uppercase and
lowercase forms of five accented letters, but no other accents. This
sort of gap is a serious problem, but it is not a Unicode problem.

PS I don't know any Lithuanian, but my grandfather was from Vilnius, aka Vilna.

Edward Cherlin
Generalist
"A knot!" exclaimed Alice. "Oh, do let me help to undo it."
Alice in Wonderland


