>I just received the letter from Christopher John Fynn
>[cfynn@dircon.co.uk] who pointed me to "UniScribe API in Win
>2K".
>Guess what? The assumption of this API is that the run of
>Unicode characters is enough to determine if the script is
>"complex". This means again that the assumption of the author
>is that *no information except the unicode characters themself*
>is needed to properly render even *complex scripts*.
That's what Uniscribe does now, but that doesn't necessarily
mean that this is what's best, that it's what MS thinks is
best, or that it's all that MS will ever do. I certainly hope
they don't stop there, but that they go on to provide APIs that
are sensitive to a language identifier. And I suspect that they
will since (I believe) the same people that oversee the
Uniscribe team also oversee the OpenType team, and the latter
have provided support for language-specific rendering rules.
True, what might happen in the future doesn't provide any
solution today, but there's a lot that still can't be done
today in terms of handling multilingual text because the
technologies are still being developed. E.g. there are only a
few apps that I know of that can handle Nastaliq, and I don't
know of any non-proprietary system that can handle vertical
Mongolian.
As others have suggested, I'd say that the best road for the
long term will be to encourage developers to adopt a
text-handling infrastructure that provides all of the
functionality that is needed for all of the world's writing
systems, which includes labelling strings to indicate language.
Adding a handful of additional characters to Unicode to solve
a present-day problem with details of presentation that relate
to a particular writing system is not a good basis for a
long-term solution.
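To make the language-labelling idea concrete, here is a minimal sketch in Python. The names `TaggedRun` and `merge_runs` are hypothetical, made up for illustration; they are not taken from Uniscribe, OpenType, or any existing library. The language tags are ordinary ISO 639 codes. All it shows is that a string can carry a language label alongside its characters, so that two runs in the same script remain distinguishable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedRun:
    """A run of text plus a language label (hypothetical structure)."""
    text: str
    lang: str  # language tag such as "ru" or "uk"; an assumption of this sketch


def merge_runs(runs):
    """Join adjacent runs that share a language tag; keep others apart."""
    merged = []
    for run in runs:
        if merged and merged[-1].lang == run.lang:
            # Same language as the previous run: concatenate the text.
            merged[-1] = TaggedRun(merged[-1].text + run.text, run.lang)
        else:
            # Different language: start a new run, even if the script
            # (and so the range of Unicode characters) is the same.
            merged.append(run)
    return merged


runs = merge_runs([
    TaggedRun("при", "ru"),
    TaggedRun("вет", "ru"),
    TaggedRun("привіт", "uk"),
])
```

Here the two "ru" runs are joined, while the "uk" run stays separate although it is also Cyrillic. A renderer built on such a structure could, in principle, look at the tag when choosing presentation rules; how it would do so is outside this sketch.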
>>In my opinion, if Cyrillic needs to be "complex script" is
>>more than questionable, since by their definition:
>A complex script has at least one of the following attributes:
>Allows bidirectional rendering.
>Has contextual shaping.
>Has combining characters.
>Has specialized word-breaking and justification rules.
>Filters out illegal character combinations.
>Compared to all this, Cyrillic is as simple script as Latin
>is.
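As a rough illustration of what a characters-only test can and cannot see, here is a small Python sketch. It is not Uniscribe's logic; `complexity_hints` is a hypothetical helper, and it checks only two of the five listed attributes (bidirectionality and combining characters), since those are the ones the standard `unicodedata` module exposes directly. Contextual shaping, word-breaking rules, and filtering of character combinations are not covered.

```python
import unicodedata

# Bidirectional classes that indicate right-to-left content:
# R (right-to-left), AL (Arabic letter), AN (Arabic number).
RTL_CLASSES = {"R", "AL", "AN"}


def complexity_hints(text):
    """Return the subset of {'bidi', 'combining'} suggested by the characters.

    Hypothetical helper: it looks only at the Unicode characters, which is
    exactly the assumption under discussion.
    """
    hints = set()
    for ch in text:
        if unicodedata.bidirectional(ch) in RTL_CLASSES:
            hints.add("bidi")
        # General categories Mn, Mc and Me are the combining marks.
        if unicodedata.category(ch).startswith("M"):
            hints.add("combining")
    return hints
```

On this test, a plain Cyrillic word and a plain Latin word both come out with no hints, while Hebrew text reports "bidi" and a letter followed by a combining breve reports "combining". Note that nothing in the function's input says which language the text is in, so any rule that depends on language is invisible to it.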
Again, I wouldn't take this as gospel. (In fact, I object to
the last characteristic.) It's turning out that Latin ligatures
aren't all that simple; so, as someone else has noted, the
simple/complex distinction is somewhat artificial. All scripts
have complexity; some are just more complex than others. (Or,
"They're all equally complex; some are just more equal than
others.")
Peter