2011/9/12 Christoph Päper <christoph.paeper_at_crissov.de>
> Delex,
>
> you are obviously confusing character sets, scripts, writing systems,
> orthographies, languages, peoples and names thereof (which may vary across
> languages and applications).
>
> NB: Some might argue that Unicode already distinguishes Indic scripts on a
> finer level than necessary, since elsewhere many would be seen as hands or
> typefaces of a single script, hence they would unify encoding and leave the
> looks to fonts completely.
>
This was the ISCII approach: unifying many scripts under a common
encoding, but still with many exceptions for each of them. It would have been
much more complicated to support the various conventions used to determine
where to split vowels, where and when to place them, where to generate
half-forms or halant forms... And it would have required the use of a
contextual script specifier in a format control (as in ISCII)...
Really, it was the wrong way to go: ISCII was notoriously difficult to
implement without first disambiguating the characters as if they belonged to
separate codepages. The codepage switching used in ISCII was similar to the
ISO 2022 switching used in East Asian charsets and other legacy 8-bit
encodings, not really an approach suited to the UCS.
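
To make the implementation burden concrete, here is a minimal sketch of a stateful decoder in the ISCII style. It is a toy model, not a real ISCII decoder: the introducer byte (0xEF), the script selector bytes (0x42, 0x43), the shared letter slot (0xB3 for KA) and the names ATR, SCRIPT_BASE, SLOT_OFFSET and decode are assumptions made for illustration and should be checked against the ISCII standard before being relied on.

```python
# Toy model of stateful, ISCII-style script switching (NOT a real ISCII decoder).
# All byte values below are assumptions for illustration only.

ATR = 0xEF  # assumed "attribute" introducer byte that announces a script switch

# Assumed script selector bytes, mapped to the start of the Unicode block.
SCRIPT_BASE = {
    0x42: 0x0900,  # Devanagari block
    0x43: 0x0980,  # Bengali block
}

# One shared letter slot and its offset inside each Unicode block.
# A real table would need an entry for every slot, plus per-script exceptions.
SLOT_OFFSET = {0xB3: 0x15}  # KA: U+0915 (Devanagari), U+0995 (Bengali)


def decode(data: bytes, default_script: int = 0x42) -> str:
    """Decode bytes whose meaning depends on the most recent script switch."""
    script = default_script
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        # A switch sequence changes decoder state and produces no character.
        if b == ATR and i + 1 < len(data) and data[i + 1] in SCRIPT_BASE:
            script = data[i + 1]
            i += 2
            continue
        if b < 0x80:
            out.append(chr(b))  # ASCII passes through unchanged
        elif b in SLOT_OFFSET:
            out.append(chr(SCRIPT_BASE[script] + SLOT_OFFSET[b]))
        else:
            out.append("\ufffd")  # slot not covered by this toy table
        i += 1
    return "".join(out)
```

In this sketch the same byte 0xB3 decodes to a Devanagari or a Bengali letter depending on the preceding switch sequence, so no byte can be interpreted without scanning backwards for the decoder state. With a separate block per script in the UCS, each code point identifies its script on its own.
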
Well, didn't the ISCII standard name the script "Bengali"? It also gave
the name "Assamese", but was that a synonym, or did it require a separate
codepage switching code? It may be interesting to reread the ISCII standard,
from which the UCS encoding of the Indian scripts was derived...
Received on Mon Sep 12 2011 - 11:18:02 CDT