From: Gregg Reynolds (unicode@arabink.com)
Date: Tue May 24 2005 - 03:01:52 CDT
Dean Snyder wrote:
> In fact, that's why I said that transliteration is almost tautologically
> a loss of glyphic information.
>
> Both you and Gregg are completely missing my point. The whole purpose of
> transliteration is to render characters of one script in another, which
Encodings are not scripts; they're mathematical objects.
> almost by definition, or tautologically, means that there is a loss of
> glyphic information when one transliterates. In fact, that is arguably
> the main reason one transliterates - to substitute the glyphic
> information in the source script with different glyphic information in
> the destination script.
Arguably maybe; but I don't think so. I think it's about identity, not
"glyphic information".
> I gave several examples where glyphic
> information, in ancient texts, for example, is important information
> that is not conveyed when those texts are transliterated. Hence the
> utility of encoding those scripts.
Well I wouldn't argue against the utility of such an encoding; but
unfortunately the "transliteration is lossy" argument works against you,
for a very simple reason:
*computational models of "characters" encode no "glyphic information"*
None. Nada. Zipzilchzero. x0041 encodes Latin upper case A; it encodes
an identity; it does not encode "glyphic information". Not even a set
of glyphs. It's a theoretical impossibility. (btw Unicode has always
been a bit confused about this.)
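
A rough way to see this for yourself, as a minimal sketch using Python's standard unicodedata module (the helper name describe is mine, purely for illustration): ask what is actually recorded about code point 0041. You get a number, a name, and a general category. Nothing in there describes a shape.

```python
import unicodedata

def describe(ch: str) -> dict:
    """Collect what the character database records for one character.

    'describe' is a hypothetical helper for illustration only.
    """
    return {
        "code_point": f"U+{ord(ch):04X}",       # the integer identity
        "name": unicodedata.name(ch),           # the character's formal name
        "category": unicodedata.category(ch),   # general category, e.g. uppercase letter
        "utf8": ch.encode("utf-8"),             # the bytes that actually get transmitted
    }

info = describe("\u0041")
for key, value in info.items():
    print(f"{key}: {value!r}")
```

Whatever shape ends up on screen or paper comes from the font and the renderer at the receiving end, not from anything in those bytes.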
And it's fairly easy to see this. There is no rule you can find that
will tell you, for any given image, if it is a member of the set of all
Latin upper case A glyphs. Pretty much any blob of ink can be construed
as "A" in the right context. It's also impossible to enumerate all "A"
glyphs.
(Idea for a contest: slap a blob of ink in a random pattern in an
em-square; a sufficiently creative typeface designer will be able to
design a Latin font in which the blob will be recognizably "A". Free
beer for a week to the best design.)
So even if you encode your ancient scripts, you are not protected
against the kind of lossiness you want to avoid. There's always a font
and a rendering logic involved. You're lost as soon as you lay finger
to keyboard and your idea of a glyph is transl(iter)ated into an
integer. To guarantee correct decoding of a message in the way you
(seem to) want, you would have to transmit specific glyph images along
with the encoded message; in which case there's not much point in
designing an encoding.
Take a look at Douglas Hofstadter's essays on Metafont in "Metamagical
Themas" for some fascinating discussion of such stuff.
-gregg