Re: I18N of HTML - Hebrew

From: Martin J Duerst (mduerst@ifi.unizh.ch)
Date: Tue May 07 1996 - 13:33:25 EDT


Jonathan Rosenne answered Gavin Nicol:

>>The reason for making the BIDI formatting codes markup is that we
>>wanted to make the functionality they offer available in encodings
>>other than Unicode (the I18N draft does not require Unicode support
>>for anything other than numeric character reference resolution).
>
>Just make them named character entities. And this will help not only
>with the bidi characters, but also with the ZWJ and ZWNJ and many
>other characters.

As I tried to explain in my previous (longer) mail, the problem is
more complex than that, and it is not solved by just making them
named character entities.

>In principle, each of the ISO 8859-x codes covers at most 192
>characters, so there are some 59,808 that require a symbolic name (assuming ca.
>60,000 characters in UCS-2 -).

Quite a few positions are not yet assigned, and names for the Korean
block can be generated automatically once you have a scheme. But how,
and to what effect, do you assign symbolic names to the CJK ideographs?

Anyway, there are other groups working on a list of SGML names for
these characters, such as the ECRC endeavour, some people at Unicode,
and of course the national standards bodies. Asking for support
for all these entities in all browsers is premature. The characters
can always be entered with numeric character references. And if
somebody has the time and energy to issue another internet draft
giving this list, (s)he is very welcome :-).
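
To illustrate the point about numeric character references, here is a
minimal sketch in Python. It assumes a present-day Python standard
library (the `html` module), which is not something the draft refers
to. The short labels in the table and the helper `numeric_reference`
are hypothetical names chosen for this example; only the code points
come from Unicode.

```python
import html

# Code points are from Unicode. The labels are only dictionary keys
# for this sketch, not entity names defined by the draft.
CONTROLS = {
    "ZWNJ": 0x200C,  # ZERO WIDTH NON-JOINER
    "ZWJ": 0x200D,   # ZERO WIDTH JOINER
    "LRM": 0x200E,   # LEFT-TO-RIGHT MARK
    "RLM": 0x200F,   # RIGHT-TO-LEFT MARK
}


def numeric_reference(code_point: int) -> str:
    """Return the decimal numeric character reference for a code point."""
    return f"&#{code_point};"


# Build the reference for each character, then decode it again to
# confirm that it resolves to the intended code point.
refs = {name: numeric_reference(cp) for name, cp in CONTROLS.items()}
decoded = {name: html.unescape(ref) for name, ref in refs.items()}

for name, ref in refs.items():
    print(name, ref, f"U+{ord(decoded[name]):04X}")
```

A reference written this way needs no agreed symbolic name, and it does
not depend on the character encoding of the document itself.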

>The internet draft seems to associate ZWJ and ZWNJ with bidi. While
>they are useful in Arabic, and to a lesser extent in Hebrew, they are not bidi
>characters.

They are in the same chapter, but apart from that it should be clear
that they are not directly related. In this area we can only tell
the non-experts that these things are needed, and tell the experts how
to do it (for which they don't need much introductory info). This
might give the impression that the draft is missing something.
But we cannot repeat much of Unicode in the draft; somebody
who wants to become an expert has to study it on his/her own.
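
As a rough way to see that ZWJ and ZWNJ are not bidi controls, one can
look up the bidirectional class that the Unicode character database
assigns to each character. The sketch below assumes Python's
`unicodedata` module and whatever Unicode version it ships with, which
is much later than the version under discussion here. The dictionary
names are hypothetical.

```python
import unicodedata

# Explicit directional formatting codes, the two directional marks,
# and the two joining controls.
CHARS = {
    "LRE": "\u202A",
    "RLE": "\u202B",
    "PDF": "\u202C",
    "LRO": "\u202D",
    "RLO": "\u202E",
    "LRM": "\u200E",
    "RLM": "\u200F",
    "ZWNJ": "\u200C",
    "ZWJ": "\u200D",
}

# unicodedata.bidirectional() returns the bidirectional class as a
# short string such as "L", "R", "RLE" or "BN".
bidi_class = {name: unicodedata.bidirectional(ch) for name, ch in CHARS.items()}

for name, cls in bidi_class.items():
    print(f"{name:4} U+{ord(CHARS[name]):04X} {cls}")
```

In that database the formatting codes and marks each carry a
directional class of their own, while ZWJ and ZWNJ are classed as
boundary neutral (`BN`): they affect joining behaviour, not
directionality.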

Regards, Martin.


