On Sun, 25 Nov 2001, Philipp Reichmuth wrote:
AA> On the contrary I have seen thousands of applications and web
AA> pages using Unicode for Chinese and Japanese, in spite of these
AA> language scripts requiring large use of the Unicode encoding
AA> space.
As a Korean, whose script takes up an unnecessarily huge chunk of
code space with precomposed syllables in the BMP and has many features in
common with Brahmi-derived scripts, would you reconsider your 'conspiracy
theory' if I told you that I strongly believe ISCII and Unicode
have done a much better job with Indic scripts than the Korean standards
body did with the Korean script?
Assigning precomposed characters or presentation forms (like half-forms)
their own code points may have helped a few more web pages get put up
with the technology available in the mid-90s than would otherwise have
been possible. However, as others including Philipp have told you many
times, it's buying the short-term advantage of getting away with a quick
and dirty trick (proprietary fonts and incompatible encoding varieties)
at the expense of being flexible, extensible, amenable to natural language
processing, suitable for DB applications, and many other advantages.
PR> In theory, it would have been possible to do this for Chinese or
PR> especially Korean as well, but there it would be even more complicated
PR> to implement a font, and data would have become a lot larger as
Not only in theory but also in practice, Korean has been moving
in that direction, using the conjoining jamos at U+1100, because that's
the *only* viable way to represent the Korean script without any artificial
restriction imposed by the state of the art of the 1990s. Perhaps a few
decades from now people will consider the 1990s a peculiar period in terms
of Korean script representation on computers. (In the days before
KS C 5601-1987, 'jamos' were used to represent syllables, with SI and SO
to toggle between US-ASCII and Korean.)
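
To make the two representations concrete, here is a minimal sketch in
Python using the standard unicodedata module (my own illustration, not
anything prescribed by the standard): the same word stored as precomposed
syllables from the U+AC00 block and as conjoining jamos from the U+1100
block, with normalization converting between the two.

    import unicodedata

    precomposed = "\uD55C\uAE00"   # "han-geul" as two precomposed syllables
    jamos = unicodedata.normalize("NFD", precomposed)  # conjoining-jamo form

    print([hex(ord(c)) for c in precomposed])
    # ['0xd55c', '0xae00']
    print([hex(ord(c)) for c in jamos])
    # ['0x1112', '0x1161', '0x11ab', '0x1100', '0x1173', '0x11af']

    # NFC recombines the jamo sequence into the precomposed syllables.
    assert unicodedata.normalize("NFC", jamos) == precomposed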
PR> to implement a font, and data would have become a lot larger as
Designing good-looking fonts for Korean certainly takes a lot of
effort, but with the advent and wide deployment of OT and other smart
font/rendering technologies, that difficulty has less and less to do with
the way the Korean script is represented in storage. Data size is certainly
a concern, but I think it'll become less of an issue compared with the
significant gains of going that way as time goes by (partly due to things
like GMR and CMR :-), among other things).
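
As a rough, purely illustrative sketch of that data-size trade-off
(again Python with unicodedata; the exact ratio depends on the text),
one can compare the two forms directly:

    import unicodedata

    text_nfc = "\uD55C\uAD6D\uC5B4"                    # "han-gug-eo", precomposed
    text_nfd = unicodedata.normalize("NFD", text_nfc)  # conjoining-jamo form

    for label, s in (("precomposed (NFC)", text_nfc),
                     ("conjoining jamos (NFD)", text_nfd)):
        print("%-22s %d code points, %d bytes in UTF-8, %d bytes in UTF-16"
              % (label, len(s), len(s.encode("utf-8")),
                 len(s.encode("utf-16-le"))))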
PR> compared to unicode if one had to encode each character by radicals
Compared to Unicode?? Encode each character by radicals?? For Chinese
characters, you're right, because radical-based encoding is not a part
of Unicode/10646 (yet) and might never be; but for Korean, Unicode/10646
does have provisions for representing the Korean script using *consonants
and vowels* instead of syllables. That's what I was talking about above.
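
The relationship between the precomposed syllables and those
consonant/vowel code points is purely arithmetic. The following sketch
applies the Hangul decomposition arithmetic described in the Unicode
standard (the constants are the usual syllable/jamo base values; the
function name is my own):

    S_BASE, L_BASE, V_BASE, T_BASE = 0xAC00, 0x1100, 0x1161, 0x11A7
    V_COUNT, T_COUNT = 21, 28

    def decompose_syllable(ch):
        """Split one precomposed Hangul syllable into its conjoining jamos."""
        s_index = ord(ch) - S_BASE
        if not 0 <= s_index < 11172:          # not in the precomposed block
            return [ch]
        lead  = L_BASE + s_index // (V_COUNT * T_COUNT)
        vowel = V_BASE + (s_index % (V_COUNT * T_COUNT)) // T_COUNT
        trail = T_BASE + s_index % T_COUNT
        jamos = [chr(lead), chr(vowel)]
        if s_index % T_COUNT:                 # trailing consonant is optional
            jamos.append(chr(trail))
        return jamos

    print([hex(ord(j)) for j in decompose_syllable("\uAC00")])
    # ['0x1100', '0x1161']
    print([hex(ord(j)) for j in decompose_syllable("\uD55C")])
    # ['0x1112', '0x1161', '0x11ab']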
Jungshik Shin