From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Sun Sep 30 2007 - 07:12:35 CST
> -----Original Message-----
> From: unicode-bounce@unicode.org [mailto:unicode-bounce@unicode.org] On
> Behalf Of Serge Rosmorduc
> Sent: Sunday, September 30, 2007 12:58
> To: verdy_p@wanadoo.fr
> Cc: 'Unicode Mailing List'
> Subject: Re: FPDAM5: Egyptian hieroglyphs (was Re: Marks)
>
> Philippe Verdy wrote:
> > Also I just wonder how the proposed encoding can be sufficient to
> > correctly encode any Hieroglyphic texts, given that it contains NO
> > combining character, and no layout control characters for representing
> > the quadrat layout.
> >
> >
> It was decided to leave the sign layout outside of the character
> encoding, as the simple combining characters you think of, if they give
> an acceptable approximation, are far from covering all possibilities.
> In particular, in some cases, Egyptologists will require exact
> positioning -- so, basically, one would need to put into Unicode a
> system which is quite clearly outside its bounds.
I can understand that there are some special applications that will need
very precise positioning and glyph layout. But the same could be said of ALL
scripts.
The main problem I see is that ALL hieroglyphic texts, even the most basic
ones where no advanced positioning is needed, will require a specific
renderer and, even worse, specific encoding conventions.
My intent is not to represent the EXACT layout, but the composition
relations that exist between the parts. The most basic relation is the
quadrat grouping, when such grouping exists; then come the "before" and
"above" relations, which do not necessarily specify the exact layout, but
are the most frequent way semantic distinctions are made, for the most
frequent use: the determinatives.
A <h1, before, one> relation is semantically very different (it gives the
figurative meaning of h1 in most frequent cases) from <h1, above, one>
(the logographic meaning of h1 in most frequent cases) and from <h1> alone
(the phonetic value of h1). If we just encode <h1> and <one> as
symbols, absolutely NO text makes any sense because it is FULLY ambiguous.
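To make the ambiguity concrete, here is a minimal sketch in Python. The names h1 and one are the placeholders used above, and the Relation type is purely hypothetical: it is not part of the proposal or of any existing encoding. It only shows that two sequences which differ by their relation become identical once the relations are not encoded.

```python
from enum import Enum


class Relation(Enum):
    """Hypothetical relation marks; not part of any actual encoding."""
    BEFORE = "before"
    ABOVE = "above"


def strip_relations(sequence):
    """Drop the relation marks, keeping only the bare signs."""
    return tuple(item for item in sequence if not isinstance(item, Relation))


# Two readings that differ only by the relation between the signs.
figurative = ("h1", Relation.BEFORE, "one")
logographic = ("h1", Relation.ABOVE, "one")

# With relations encoded, the two sequences are distinct.
print(figurative != logographic)

# Without them, both collapse to the same bare sign sequence.
print(strip_relations(figurative) == strip_relations(logographic))
```

Both comparisons hold: the sequences are distinguishable only as long as the relation is part of the encoded data.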
It's acceptable not to encode one of the relations, i.e. the simple
juxtaposition without grouping ("-" in MdC notation), which is the most
frequent case. But making all relations equal by not encoding any of them
does not seem reasonable.
I wonder why there's absolutely no encoding proposed for treating the basic
relations (":", "*" in MdC) as format controls, and modifications ("\" and
possibly the rotations) as combining modifiers: this would be useful to keep
the possibility of encoding hieroglyphs in plain text.
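As an illustration of how little structure is involved, here is a rough Python sketch that reads a simplified MdC-like string into quadrats. It assumes only the three operators mentioned here ("-" between quadrats, ":" for stacking, "*" for side-by-side placement, with "*" binding tighter than ":"), and ignores parentheses, "\" modifiers, rotations and everything else in real MdC. The function name parse_mdc and the sign codes in the example are placeholders.

```python
def parse_mdc(text):
    """Split a simplified MdC string into quadrats -> rows -> signs.

    Only "-", ":" and "*" are handled; this is not a full MdC parser.
    """
    quadrats = []
    for group in text.split("-"):          # "-" separates quadrats
        rows = [row.split("*")             # "*" places signs side by side
                for row in group.split(":")]  # ":" stacks rows vertically
        quadrats.append(rows)
    return quadrats


# One lone sign, then a quadrat with one sign above two side-by-side signs.
print(parse_mdc("A1-B1:C1*D1"))
# [[['A1']], [['B1'], ['C1', 'D1']]]
```

The nesting recovered here (quadrat, row, sign) is exactly the information that is lost if only the bare signs are encoded.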
Otherwise, it's completely impossible to give any meaning to hieroglyphs in
plain text. Even their assigned properties (gc=Lo) do not make sense, given
that you can't make any meaningful words with them. They are in fact just
treated like symbols. And the rationale for encoding them as letters
(allowing their use in identifiers without breaks between them) does not
make sense either.
The resulting situation would be much as if Hangul were encoded without
the distinction between initial and final consonants (something that was
attempted in the past, then abandoned as it was not reliable), but
here it is even worse, as there's absolutely no way to determine any
boundaries within the encoded hieroglyphic text. This makes the rendering
completely impossible to perform, and the transport of texts with Unicode
just illusory.
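For comparison, here is a short Python sketch of how Unicode does keep that distinction for Hangul: the initial and final forms of the same consonant are separate conjoining jamo, so syllable boundaries can be recovered and NFC can compose the syllable. It uses only the standard unicodedata module; the variable names are mine.

```python
import unicodedata

# The same consonant, encoded separately as initial and as final jamo.
LEAD_K = "\u1100"    # HANGUL CHOSEONG KIYEOK (initial)
VOWEL_A = "\u1161"   # HANGUL JUNGSEONG A (medial vowel)
TRAIL_K = "\u11A8"   # HANGUL JONGSEONG KIYEOK (final)


def describe(text):
    """Return the Unicode character name of each code point in text."""
    return [unicodedata.name(ch) for ch in text]


print(describe(LEAD_K + VOWEL_A + TRAIL_K))

# Because the roles are distinct, NFC composes the three jamo
# into a single precomposed syllable.
syllable = unicodedata.normalize("NFC", LEAD_K + VOWEL_A + TRAIL_K)
print(len(syllable), hex(ord(syllable)))
```

Nothing comparable is available for the hieroglyphs as proposed: there is no encoded information from which grouping could be derived.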
If you need to encode these characters using an external encoding
convention, then it will be much better to use MdC conventions with the
existing Gardiner names... The Unicode encoding serves absolutely no
purpose. It does not help transform the MdC encoding into plain text; it
just adds a complication for Egyptologists.