Re: Terms "constructed script", "invented script" (was: FW: Re: Shavian)

From: Marcin 'Qrczak' Kowalczyk (qrczak@knm.org.pl)
Date: Sat Jul 07 2001 - 07:01:18 EDT


In a message dated 2001-07-06 0:31:39 Pacific Daylight Time, 11@onna.com
writes:
 
> I wonder: why aren't languages with simple syllabic structures
> written in hiragana? It seems to be built for them.

I am using my own script, inspired by hiragana 10 years ago, for writing
Polish. It looks very different; I only liked the idea of having
letters for consonant+vowel pairs and stretched it a bit.

I put a sample at <http://qrczak.ids.net.pl/vi-001.gif> (resolution
suitable for printing at 300dpi). For example the subject says:
"Re: vi (Re: O wyższości znaku zachęty nad GUI)", i.e. "Re: vi (Re:
About the superiority of command-line prompt over GUI)", which has
only 11 letters between the second "Re:" and "GUI".

I wouldn't dare propose encoding it in Unicode. The number of users
is approaching two. But technically it's an interesting script with
a non-trivial rendering engine. I implemented the rendering engine
and a translator from standard Polish orthography (not perfect due to
ambiguities in our orthography - I modified the orthography a little
to resolve them). I did it to practice reading. I could only practice
writing before - it's hard to read what you just wrote, because you
remember what you wrote!

Letters are composed from core characters by the engine. There
are 35 consonants, 8 normal vowels, 1 extra vowel, joiner, and
non-joiner. They produce an unbounded number of letters.

(1) Adjacent consonants are joined up to some limit (2 is a good
choice, but there is no semantic difference here) and they are joined
with the following vowel if present (this is mandatory).

(2) A consonant+vowel pair must be split if this is a border
between a prefix and a stem or the like. Such pairs are also split
in some foreign words to force correct pronunciation (pronunciation
of a consonant sometimes depends on the following vowel and vice
versa). Non-joiner is used to encode such splitting in the stream of
core characters.

(3) The default (greedy) splitting of chunks of consonants is not
always perfect, e.g. when it would join a final part of a prefix with
the beginning of the stem. Joiner and non-joiner are used to prevent or
force splitting at certain points between consonants. Forced joining
overrides the limit of joined consonants.

(4) Any two letters can be joined by writing one above another with a
dot between. This is never required by the orthography but is sometimes
good style, e.g. in the "od" prefix and in diphthongs. Joiner is
used to encode that.

Finally there are cases where a consonant+vowel pair is split according
to (2) and then joined according to (4). I encode such a case as
joiner + non-joiner + joiner. I think there is already a similar
practice in Unicode for Arabic ligatures.
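
Rules (1) to (3) can be sketched roughly as follows. This is only an
illustration under stated assumptions, not the actual engine: the "+"
and "|" stand-ins for joiner and non-joiner, the placeholder vowel set,
the function name compose, and left-to-right greedy grouping are all
hypothetical. Each core character is taken to be a single ASCII
character, and the stacking of rule (4) is not modelled: a non-joiner
always wins for grouping, so a joiner + non-joiner + joiner sequence
simply leaves the two letters separate.

```python
JOINER = "+"        # hypothetical ASCII stand-in for the joiner
NON_JOINER = "|"    # hypothetical ASCII stand-in for the non-joiner
VOWELS = set("aeiouy")  # placeholder inventory, not the script's real vowels


def compose(chars, limit=2):
    """Group core characters into letters, following rules (1)-(3).

    Returns a list of strings, one string of core characters per letter.
    """
    letters = []
    current = ""              # core characters of the letter being built
    forced = blocked = False  # joiner / non-joiner seen before next character
    for ch in chars:
        if ch == JOINER:
            forced = True
            continue
        if ch == NON_JOINER:
            blocked = True
            continue
        # A letter stays open only while it holds consonants and no vowel,
        # so len(current) is the consonant count whenever it is open.
        is_open = current != "" and current[-1] not in VOWELS
        if ch in VOWELS:
            # Rule (1): a vowel joins the preceding consonants;
            # rule (2): unless a non-joiner splits the pair.
            attach = is_open and not blocked
        else:
            # Rule (1): consonants join up to the limit;
            # rule (3): a joiner overrides the limit, a non-joiner splits.
            attach = is_open and not blocked and (forced or len(current) < limit)
        if attach:
            current += ch
        else:
            if current:
                letters.append(current)
            current = ch
        forced = blocked = False
    if current:
        letters.append(current)
    return letters
```

With these assumptions, compose("strata") groups greedily into "st",
"ra", "ta", while compose("st+rata") forces "stra" into one letter
despite the limit of 2, and compose("od|e") keeps "d" and "e" apart.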

Actually I'm not even using PUA characters but an ASCII-based escaping
scheme, because I don't have an editor capable of editing text in
such a script. But simple non-joined letters put in a font with the
ability to directly edit joiners and non-joiners would be technically
workable. The meaning of a text file would then be unambiguous modulo
PUA assignment (the ASCII-based escaping is a hack).

-- 
 __("<  Marcin Kowalczyk * qrczak@knm.org.pl http://qrczak.ids.net.pl/
 \__/
  ^^                      SYGNATURA ZASTĘPCZA
QRCZAK


