Specification of Encoding of Plain Text from Richard Wordingham on 2017-01-09 (Unicode Mail List Archive)

From: Richard Wordingham <richard.wordingham_at_ntlworld.com>
Date: Mon, 9 Jan 2017 22:24:14 +0000

Where, if anywhere, is the encoding of plain text specified? I am
particularly concerned with the arrangement of the code sequences for
non-spacing abstract characters once one has determined an encoding for
the abstract characters.

For example, a naive reading of TUS 9.0 Section 16.4 Subsection
"Ordering of Syllable Components" would lead one to believe that the
word _khnyom_ 'I' shall be encoded as <U+1781 KHMER LETTER KHA,
U+17D2 KHMER SIGN COENG, U+1789 KHMER LETTER NYO, U+17BB KHMER VOWEL
SIGN U, U+17C6 KHMER SIGN NIKAHIT>. However, on further investigation,
I cannot find any text that says that <U+1781, U+17C6, U+17D2, U+1789,
U+17BB> would not be compliant with the Unicode standard. Have I
missed anything?
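
As a rough illustration of the question, the two spellings can be
compared mechanically. The sketch below assumes Python's standard
unicodedata module, whose Unicode data version depends on the
interpreter; the names SEQ_A, SEQ_B, describe and
canonically_equivalent are made up for this example. It lists each
character's canonical combining class and checks whether the two
sequences are canonically equivalent. This says nothing about
conformance; it only shows whether normalization would treat the two
orders as the same text.

```python
import unicodedata

# The two candidate spellings of the word, as code point sequences.
# SEQ_A is the order a naive reading of TUS 9.0 Section 16.4 suggests;
# SEQ_B is the alternative order raised in this message.
SEQ_A = "\u1781\u17D2\u1789\u17BB\u17C6"
SEQ_B = "\u1781\u17C6\u17D2\u1789\u17BB"


def describe(text):
    """Return (code point, name, canonical combining class) per character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.combining(ch))
        for ch in text
    ]


def canonically_equivalent(a, b):
    """True if the two strings have the same canonical decomposition (NFD)."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


if __name__ == "__main__":
    for label, seq in (("SEQ_A", SEQ_A), ("SEQ_B", SEQ_B)):
        print(label)
        for code_point, name, ccc in describe(seq):
            print(f"  {code_point}  ccc={ccc:<3}  {name}")
    print("canonically equivalent:", canonically_equivalent(SEQ_A, SEQ_B))
```

If the check reports that the sequences are not equivalent, then
normalization cannot be what selects one order over the other, and any
requirement would have to come from the text of the standard itself.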

One might hope that the subsection about 'logical order' in TUS 9.0
Section 2.2 Unicode Design Principles would help, but:

1) Chapter 3 'Conformance' says nothing about logical order; and
2) The subsection about 'logical order' seems to assume that there
exists a common practice; it does not actually place any requirement
on this common practice.

Richard.
Received on Mon Jan 09 2017 - 16:25:03 CST
