Re: Latin w/ diacritics (was Re: benefits of unicode)

From: Peter_Constable@sil.org
Date: Tue Apr 17 2001 - 15:41:08 EDT


On 04/16/2001 09:02:16 PM unicode-bounce wrote:

>> How do you handle these? You wait till the rendering technology catches
>> up, or you build your own (e.g. Graphite) and build apps that work on
>> that. I suspect (or, at least, certainly hope) we'll see progress in this
>> regard in IE 6.
>>
>
>Waiting isn't much of an option, the users need results now.

I'm entirely sympathetic with this. If you need to work now and solutions
aren't being provided for you, you build your own and by hook or crook do
whatever it takes to allow you to get done what you need to do. In SIL,
we've been in that boat for a *long* time and have tried to make a fine art
of allowing people to build their own solutions. We continue to offer that
to our users when that's what's needed. I long for the day, though, when we
have it all behind us.

>How many years has it been since the combining diacritic
>range was established and how many applications currently
>adequately support it?

Not many, I agree. But there are several changes that have needed to take
place, and these things have been happening, so I'm hopeful that we'll see
commercial apps handling stacking Latin diacritics before too long -- at
least MS apps. There will always be users that need something that
commercial vendors aren't providing, though, and so we continue to promote
the notion of having extensible implementations. That's a key part of what
the Graphite font technology is all about: don't have the shaping behaviour
you need? Define your own!
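
To make the gap concrete, here is a minimal Python sketch (Python and its
standard unicodedata module are only an illustration, not anything used in
this thread). It counts the combining marks that remain after NFC
composition. Where Unicode has a precomposed letter, none remain; where it
does not, the marks stay separate and correct display depends on the
renderer positioning or stacking them. The sample letters are my own
choice.

```python
import unicodedata


def marks_left_after_nfc(text):
    """Count combining marks that survive NFC composition."""
    composed = unicodedata.normalize("NFC", text)
    # unicodedata.combining() returns a non-zero class for combining marks.
    return sum(1 for ch in composed if unicodedata.combining(ch) != 0)


# Illustrative samples only; chosen for this sketch.
samples = {
    "e + acute": "e\u0301",
    "open e + acute": "\u025b\u0301",
    "open e + tilde + acute": "\u025b\u0303\u0301",
}

for label, text in samples.items():
    print(label, "->", marks_left_after_nfc(text), "mark(s) left for the renderer")
```

Any sequence that reports one or more marks has no single precomposed code
point to fall back on, so it needs either mark positioning in the font and
shaping engine or some workaround.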

>Even when the rendering technology catches up, the old 386's
>and such that are in use in places like the Sudan may not be able
>to support an OS capable of using new rendering technology.

That is indeed a problem. It's not one that technologists are good at
solving, if for no other reason than that they have little option but to
develop for newer technology. E.g. revamping Win3.x code to
provide support for Unicode and for smart-font rendering that can run on a
386 with 4MB RAM wouldn't exactly be enjoyable work, even if such a project
could be given resources.

There is also an issue of practical feasibility, though: smart-font
rendering technologies are not fast. They depend on fast CPUs to give
adequate performance. Running Uniscribe/OT or Graphite on a 25MHz 386SX
probably wouldn't be pleasant for the user.

>Peoples in depressed areas would probably benefit greatly from
>being able to compute in their own languages.
>
>Building applications to provide special support while waiting for
>the technology to advance, as Peter mentions, would be one way
>to overcome these display problems.

But I wonder if they won't get better results and sooner as spillover from
advances in globalised software based on Unicode and smart fonts. If the
next version of Uniscribe turns on OT shaping for Latin so that stacking
diacritics can be supported, then that will probably work in IE and in
Office XP. Building apps is very difficult and likely beyond the reach of
most in Sudan. Building OT fonts and input methods isn't easy, but is
attainable by more. If apps and OSes are written to be generally friendly
to the world's scripts, then people can build fonts and IMs and start
working with their less-well known writing systems using the same
commercial-grade applications that those in commercially-viable markets get
to enjoy. (There is still the problem of users having only older equipment,
though.)

>Why re-invent the wheel? Andrew Cunningham mentioned the
>Private Use Area of Unicode as one alternative. Existing hardware
>and software already provide some support for PUA characters.

You'll have to wait rather longer to see Uniscribe provide rendering
support for PUA characters than for Latin & diacritics.

>As far as exchanging data consistently, perhaps some kind of PUA
>registry for precomposed Latin characters not included in Unicode
>could be established along the same lines as the ConScript Unicode
>Registry.

We will likely do something like that within SIL and some partner
organisations. This has been discussed in relation to OLAC (the Open Language
Archive Community) as well. I think OLAC is the most appropriate forum for
this.
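
As a rough sketch of what such a registry could buy for interchange,
assuming a purely hypothetical table (the code point assignments below are
invented for illustration and come from no real registry): each registered
PUA code point maps to the standard base-plus-combining-marks sequence, so
data can be converted once rendering support is available.

```python
import unicodedata

# Hypothetical registry entries, invented for this sketch.
HYPOTHETICAL_REGISTRY = {
    "\ue000": "\u025b\u0301",        # open e + combining acute
    "\ue001": "\u025b\u0303\u0301",  # open e + combining tilde + acute
}


def is_private_use(ch):
    """True if the character has general category Co (private use)."""
    return unicodedata.category(ch) == "Co"


def to_standard(text, registry=HYPOTHETICAL_REGISTRY):
    """Replace registered PUA characters with standard sequences."""
    out = []
    for ch in text:
        if is_private_use(ch):
            if ch not in registry:
                # Without an agreed registry the meaning is unknowable.
                raise ValueError("unregistered PUA character U+%04X" % ord(ch))
            out.append(registry[ch])
        else:
            out.append(ch)
    return "".join(out)


print(ascii(to_standard("b\ue000n")))
```

The error branch is the point: a PUA character outside the agreed table
cannot be interpreted by anyone other than the party who assigned it.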

>Andrew also mentioned custom (8-bit) code pages, which are widely
>used. Lately, people who haven't considered the lack of alternatives
>have taken to criticizing such practicality, calling it "font-hacks" and
>so forth.

I've done it numerous times, and I still do it on occasion. I still call it
a "hack", though, since that's what it is, in many cases at least: The cmap
in TrueType fonts for Windows uses Unicode. People think they're putting
their favourite character on an 8-bit codepoint, but in the font they are
actually hacking with Unicode, breaking conformance rules C6 and especially
C7.
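
A small Python sketch of why I call it a hack, assuming the system code
page is Windows 1252 (an assumption for illustration; the glyph table is
hypothetical). The byte a user types is converted to a Unicode code point
before the font's cmap is consulted, so the stored text carries that
character's identity no matter what shape the font draws there.

```python
import unicodedata


def what_is_stored(byte_value, codepage="cp1252"):
    """Return the Unicode code point and name that an 8-bit value maps to."""
    ch = bytes([byte_value]).decode(codepage)
    return "U+%04X" % ord(ch), unicodedata.name(ch)


# Hypothetical custom font: the glyph its designer drew in each slot.
HYPOTHETICAL_HACKED_GLYPHS = {0xE9: "open e with acute"}

for byte_value, drawn in HYPOTHETICAL_HACKED_GLYPHS.items():
    code_point, name = what_is_stored(byte_value)
    print("byte 0x%02X: font draws '%s', text says %s %s"
          % (byte_value, drawn, code_point, name))
```

Any process that takes the text at face value (sorting, searching, case
mapping, switching fonts) will treat it as the character named on the
right, not as the one the user sees.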

>If the ancient Egyptians had waited for international approval and
>support for their writing system, they'd have left no written
>record of their existence.

And if the designers of the pyramids hadn't told the people in the
quarries, "The blocks shall have exactly these dimensions..."?

- Peter

---------------------------------------------------------------------------
Peter Constable

Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485
E-mail: <peter_constable@sil.org>


