From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Tue Aug 03 2004 - 09:08:24 CDT
From: "John Hudson" <tiro@tiro.com>
> Philippe Verdy wrote:
>
> > All the "glyph string" processing above is out of scope of Unicode....
>
> Yes, a renderer could be designed to work in this way, and fonts could be
> designed to work with such a renderer. The issue is not whether rendering
> systems can be made that would support the holam male / vav haluma
> distinction in this way, but whether the distinction can be encoded in such
> a way that it will work reliably in multiple rendering systems using
> different glyph processing models. I'm much more concerned about existing
> rendering systems than I am about imaginary ones.
Uniscribe is not an "imaginary" rendering system. It does use strings of
glyph ids, but when you feed it a string of characters, that string is
split into *multiple* strings of glyph ids, each one with its own context
of rendering flags.
This approach effectively creates ATTRIBUTED TEXT from the original
Unicode plain text. The glyph strings in Uniscribe have absolutely NO USE
without their context of rendering attributes.
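
To make the idea of "multiple strings, each with its own rendering context"
concrete, here is a minimal sketch in Python. It is not Uniscribe's API: the
names Run and itemize are hypothetical, the only attribute tracked is a
right-to-left flag, and the direction rule (neutral and combining characters
inherit the direction of the preceding strong character) is a deliberate
simplification of the Unicode bidirectional algorithm.

```python
import unicodedata
from dataclasses import dataclass


@dataclass
class Run:
    """Hypothetical run: a slice of the text plus one rendering attribute."""
    text: str
    rtl: bool


def itemize(text):
    """Split plain text into runs that share the same direction flag.

    Simplifying assumption: characters without a strong direction
    (spaces, combining marks, ...) inherit the direction of the
    previous strong character; text starts left-to-right.
    """
    runs = []
    buf = []
    current_rtl = False
    for ch in text:
        cls = unicodedata.bidirectional(ch)
        if cls in ("R", "AL"):
            rtl = True
        elif cls == "L":
            rtl = False
        else:
            rtl = current_rtl
        # Close the current run when the attribute changes.
        if buf and rtl != current_rtl:
            runs.append(Run("".join(buf), current_rtl))
            buf = []
        buf.append(ch)
        current_rtl = rtl
    if buf:
        runs.append(Run("".join(buf), current_rtl))
    return runs


if __name__ == "__main__":
    # Latin text around a Hebrew vav (U+05D5) carrying a holam (U+05B9).
    for run in itemize("abc \u05D5\u05B9 def"):
        print(repr(run.text), "rtl" if run.rtl else "ltr")
```

A real renderer would attach many more attributes to each run (script,
language, font, embedding level), and only then map each run to glyph ids.
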
You think this is acrobatic? Yes, it is! But such acrobatics are needed in
any string renderer that claims to support Arabic ligatures and contextual
forms, Brahmic letter reordering, or decomposition into sub-glyphs that
will be reordered differently...
As I said, all this is part of the job of the renderer, which is the only
place where glyph ids may be introduced and used, in collaboration with
fonts and other external data tables, or even under the control of a
user-specified stylesheet or linguistic context.
If you already need such a renderer to write Arabic (think about mirrored
characters), or Tamil, or even Han (think about ruby or interlinear
annotations, and vertical/horizontal presentation...) simply to render a
plain-text document correctly, then you already have the tools needed to
support distinctions such as the one between <C1,C2> and <C1,ZWJ,C2>, which
is easy to encode in the plain-text Unicode document.
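
At the plain-text level, the <C1,C2> versus <C1,ZWJ,C2> distinction can be
sketched in a few lines of Python. The choice of vav (U+05D5) for C1 and
holam (U+05B9) for C2 is only an illustration; which sequence would stand for
which reading is not settled here, and strip_joiners is a hypothetical helper
that mimics a renderer that ignores the joiners.

```python
import unicodedata

ZWJ = "\u200D"    # ZERO WIDTH JOINER
ZWNJ = "\u200C"   # ZERO WIDTH NON-JOINER
C1 = "\u05D5"     # HEBREW LETTER VAV (illustrative choice for C1)
C2 = "\u05B9"     # HEBREW POINT HOLAM (illustrative choice for C2)

plain = C1 + C2          # <C1,C2>
marked = C1 + ZWJ + C2   # <C1,ZWJ,C2>


def strip_joiners(text):
    """Hypothetical fallback: what a renderer ignoring ZWJ/ZWNJ would see."""
    return text.replace(ZWJ, "").replace(ZWNJ, "")


if __name__ == "__main__":
    # The two encodings are different code point sequences...
    print([f"U+{ord(c):04X}" for c in plain])
    print([f"U+{ord(c):04X}" for c in marked])
    # ...and they remain different after NFC normalization.
    print(unicodedata.normalize("NFC", plain)
          != unicodedata.normalize("NFC", marked))
    # A renderer that ignores the joiner simply falls back to <C1,C2>.
    print(strip_joiners(marked) == plain)
```

The distinction therefore survives in storage and interchange even when a
particular renderer displays both sequences identically.
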
The fact that some less advanced renderers will not be able to render the
distinction SHOULD NOT limit the possibility of using ZWJ/ZWNJ to create
additional distinctions in the plain text. In the case of vav-haluma and
holam-vav, there really is a semantic distinction, which is an excellent
reason why this distinction should be encodable in the plain text, even if
some renderers will not be able to render the two differently....