PILCH Hartmut wrote on 1999-08-17 09:47 UTC:
> Removing alphabetic widechars will strengthen the case of those in Japan,
> who claim that Asian writing systems can't be unified and that Unicode is
> a straitjacket.
In such discussions (I have participated in many of these before), it is
important to get your Japanese discussion partners to acknowledge the
clear difference between
- real Asian writing system requirements
- existing coding practice (what EUC and ISO 2022 do)
Some of them tend to mix up these two horribly, which is a reason for
much of the confusion surrounding Unicode in the Japanese IT world.
> On the other hand, wide alphabetic characters cause a lot of confusion
> among ordinary users even in East Asia, who use them inadvertently, e.g.
> for naming files. But that is rather a question of input method design.
Some of the Japanese legacy encodings have a number of practical dangers,
and Unicode offers cleaner solutions here. These dangers usually come
from abusing an encoding tagging mechanism such as ISO 2022 to also
encode language information, or from abusing the character encoding
mechanism to also encode presentation style information. Text processing
and information retrieval become much simpler if character encoding,
language tagging and style tagging are kept strictly separate as
orthogonal mechanisms. Unicode is only about character encoding, and not
about language tagging or presentation style selection. This occasionally
confuses Japanese users who are used to working with ugly hacks like
ISO 2022-JP that stir all three concepts together.
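
To make the point concrete, here is a small Python sketch (my own
illustration, not taken from any existing system; the file name and
variable names are made up). It shows, first, that fullwidth Latin
letters are separate code points that differ from ASCII only in
presentation width and can be folded away with NFKC normalization, and
second, that ISO 2022-JP has to switch character sets in-band with
escape sequences. It assumes a Python build with the standard
"iso2022_jp" codec available.

```python
import unicodedata

# Fullwidth Latin letters are distinct code points from their ASCII twins,
# so a file name typed with them does not compare equal to the ASCII name.
wide = "\uff32\uff25\uff21\uff24\uff2d\uff25"   # fullwidth "README"
narrow = "README"
same_before = (wide == narrow)

# NFKC compatibility normalization folds the width distinction away.
folded = unicodedata.normalize("NFKC", wide)
same_after = (folded == narrow)

# The East Asian Width property records the presentation difference:
# 'F' (fullwidth) for U+FF32 versus 'Na' (narrow) for ASCII 'R'.
widths = (unicodedata.east_asian_width(wide[0]),
          unicodedata.east_asian_width(narrow[0]))

# ISO 2022-JP is stateful: switching between ASCII and JIS X 0208 is
# signalled in-band by escape sequences embedded in the byte stream.
text = "A\u3042B"                       # 'A', HIRAGANA LETTER A, 'B'
encoded = text.encode("iso2022_jp")
has_switch_to_jis = b"\x1b$B" in encoded    # ESC $ B selects JIS X 0208
has_switch_to_ascii = b"\x1b(B" in encoded  # ESC ( B returns to ASCII
round_trip = encoded.decode("iso2022_jp")
```

A search tool working on the ISO 2022-JP bytes has to track the escape
state before it can compare anything, whereas the Unicode string can be
normalized once and then compared code point by code point.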
Summary:
If you design a Unicode system such that ISO 2022 and EUC users feel
at home immediately, then chances are that you did something wrong. :-)
Markus
-- Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK Email: mkuhn at acm.org, WWW: <http://www.cl.cam.ac.uk/~mgk25/>