From: Otto Stolz (Otto.Stolz@uni-konstanz.de)
Date: Fri Apr 07 2006 - 02:07:30 CST
Hello Tay, William,
you have asked:
> Can accented characters be decomposed in other encodings, e.g. ISO
> 8859-1, as well?
The title of the ISO 8859 series contains the term "single-byte coded
graphic character sets". The use of control functions for the coded
representation of composite characters is prohibited by ISO 8859,
and there are no combining, or non-spacing (cf. infra), characters
defined.
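
To illustrate the point with ISO 8859-1: a minimal sketch in Python, assuming the standard "iso8859_1" codec and the unicodedata module. The helper name encodable_latin1 is made up for this example. The precomposed letter has a code position in ISO 8859-1, while the decomposed sequence (base letter plus combining acute) cannot be represented, because the combining mark has none.

```python
import unicodedata

precomposed = "\u00e9"  # e with acute as a single code point
decomposed = unicodedata.normalize("NFD", precomposed)  # "e" followed by U+0301

def encodable_latin1(text):
    """Return True if every character of text has a code position in ISO 8859-1."""
    try:
        text.encode("iso8859_1")
        return True
    except UnicodeEncodeError:
        return False

print(encodable_latin1(precomposed))  # True
print(encodable_latin1(decomposed))   # False: U+0301 is not in ISO 8859-1
```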
A probable exception to this rule is ISO 8859-6 "Latin/Arabic
alphabet". In my 1987 copy (there may be a newer edition;
I have not checked), the clause prohibiting composition
of characters is missing, and the standard defines 8 Arabic marks
that normally combine (such as Fatha, Damma, Kasra). However,
the 1987 version of that standard is rather vague about the
composing/rendering issue.
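
The 8 marks can be listed with a short Python sketch. It assumes Python's "iso8859_6" codec, which reflects the current Unicode mapping of that code page and not necessarily the wording of the 1987 edition. Unicode classifies these characters as non-spacing marks (general category Mn).

```python
import unicodedata

marks = []
for byte in range(0xA0, 0x100):
    # Undefined code positions decode to an empty string with errors="ignore".
    ch = bytes([byte]).decode("iso8859_6", errors="ignore")
    if ch and unicodedata.category(ch) == "Mn":
        marks.append((byte, unicodedata.name(ch)))

for byte, name in marks:
    print(f"0x{byte:02X}  {name}")
```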
ISO 6937 took a different approach: it covers a large character
repertoire through heavy use of composition. Quote from ISO 6937/2-1983:
> Each accented letter or umlaut is represented by a sequence
> of bit combinations consisting of the coded representation
> of the relevant non-spacing diacritical mark [...], followed
> by the coded representation of the relevant basic Latin letter
> [...]
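
Note the order: in ISO 6937 the diacritical mark precedes the base letter, whereas a Unicode combining mark follows it. A rough Python sketch of the conversion follows. The function name and the table are made up for illustration; the table is a small subset of the non-spacing marks, with byte values as I understand the ISO 6937 code table, so treat them as an assumption rather than a full or authoritative codec.

```python
import unicodedata

# Assumed subset: ISO 6937 non-spacing diacritic byte -> Unicode combining mark.
ISO6937_MARKS = {
    0xC1: "\u0300",  # grave accent
    0xC2: "\u0301",  # acute accent
    0xC3: "\u0302",  # circumflex accent
    0xC4: "\u0303",  # tilde
    0xC8: "\u0308",  # diaeresis (umlaut)
}

def decode_iso6937_sketch(data: bytes) -> str:
    """Decode ASCII plus the marks above, moving each mark behind its base letter."""
    out = []
    pending = None  # combining mark waiting for its base letter
    for b in data:
        if b in ISO6937_MARKS:
            if pending is not None:
                raise ValueError("two diacritical marks in a row")
            pending = ISO6937_MARKS[b]
        elif b < 0x80:
            out.append(chr(b))
            if pending is not None:
                out.append(pending)  # Unicode order: base letter, then mark
                pending = None
        else:
            raise ValueError(f"byte 0x{b:02X} is not covered by this sketch")
    if pending is not None:
        raise ValueError("diacritical mark without a base letter")
    # Compose to the precomposed form where one exists.
    return unicodedata.normalize("NFC", "".join(out))

print(decode_iso6937_sketch(b"caf\xc2e"))
```

The final NFC step is optional; without it the result stays in decomposed form.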
Best wishes,
Otto Stolz