David Hopwood wrote:
> Below '#' is used to quote from the Unicode 3.2 standard as proposed
> in PDUTR #28, and '>' is used to quote my suggested changes.
I second David's thorough and clearly presented contribution.
However, I have to suggest one minor improvement:
> Conformance clauses
...
> This is what I think clauses C5 and C10 should be:
...
> > C10 A process shall make no change in a valid code sequence other
> > than the possible replacement of character sequences by their
> > canonical-equivalent sequences, if that process purports not to
> > modify the interpretation of that code sequence.
...
> > - Changing the bit or byte ordering when transforming between different
> > machine architectures does not modify the interpretation of the text.
I consider the bit ordering a hardware issue, invisible to the programmer
or the end-user; hence, I'd not mention it in this note.
W.r.t. the byte ordering, this note applies only to UTF-16 and UTF-32
with a BOM.
It does not apply to UTF-8, as this format implies a particular byte
ordering.
Neither does it apply to UTF-16LE, UTF-16BE, UTF-32LE, or UTF-32BE; rather,
swapping the byte order in any one of these formats amounts to transforming
to a different UTF, viz. UTF-16BE, UTF-16LE, UTF-32BE, and UTF-32LE,
respectively.
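
To illustrate the distinction, here is a minimal Python sketch. It is only an illustration and not part of the proposed wording. The helper name swap_byte_pairs and the sample string are made up for this example; the codec names are the ones shipped with Python. With a BOM, the byte-swapped string still decodes to the same text; without a BOM, the swapped bytes are simply the other UTF.

```python
def swap_byte_pairs(data: bytes) -> bytes:
    """Swap the two bytes of every 16-bit code unit (hypothetical helper)."""
    if len(data) % 2:
        raise ValueError("UTF-16 data must have an even number of bytes")
    swapped = bytearray(data)
    # Even-indexed bytes take the odd-indexed ones, and vice versa.
    swapped[0::2], swapped[1::2] = data[1::2], data[0::2]
    return bytes(swapped)


# Sample text (an assumption): an ASCII letter, a BMP character,
# and a supplementary character that needs a surrogate pair.
text = "A\u20ac\U0001d11e"

# Case 1: UTF-16 with a BOM. Swapping the bytes also swaps the BOM
# (FE FF becomes FF FE), so a BOM-aware decoder recovers the same text.
with_bom_be = b"\xfe\xff" + text.encode("utf-16-be")
with_bom_le = swap_byte_pairs(with_bom_be)
assert with_bom_le[:2] == b"\xff\xfe"
assert with_bom_be.decode("utf-16") == text
assert with_bom_le.decode("utf-16") == text

# Case 2: UTF-16BE without a BOM. Swapping the bytes yields UTF-16LE,
# i.e. a different UTF; read as UTF-16BE, the result is different text.
bare_be = text.encode("utf-16-be")
bare_le = swap_byte_pairs(bare_be)
assert bare_le == text.encode("utf-16-le")
misread = bare_le.decode("utf-16-be")
assert misread != text
```

The same reasoning carries over to UTF-32 with four-byte code units.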
> > - Transforming to a different Unicode Transformation Format does not
> > modify the interpretation of the text.
Hence, I propose the following wording for the last two notes on the
proposed C10 clause:
| - Changing the byte ordering of a string encoded in either UTF-16
| or UTF-32, when a Byte Order Mark is present, does not modify the
| interpretation of the text.
|
| - Transforming to a different Unicode Transformation Format does not
| modify the interpretation of the text. This includes transformations
| between Unicode Transformation Formats that only differ by their
| respective byte ordering, such as a transformation from UTF-16BE
| to UTF-16LE (irrespective of whether the byte ordering is explicitly
| specified or is implied by the target environment the string is
| ported to).
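
The second note can be checked mechanically. The following Python sketch is an illustration under assumptions, not a normative test: the function name transcode and the sample string are invented here. It converts between every pair of five UTFs and confirms that the sequence of code points, and hence the interpretation of the text, is unchanged.

```python
from itertools import permutations

# Codec names as provided by Python's standard library.
UTFS = ["utf-8", "utf-16-be", "utf-16-le", "utf-32-be", "utf-32-le"]


def transcode(data: bytes, source: str, target: str) -> bytes:
    """Re-encode data from one UTF to another (hypothetical helper)."""
    return data.decode(source).encode(target)


# Sample text (an assumption): ASCII, BMP, and supplementary characters.
text = "A\u20ac\U0001d11e"
code_points = [ord(ch) for ch in text]

# Every ordered pair of distinct UTFs, including pairs that differ
# only in byte ordering such as UTF-16BE and UTF-16LE.
for source, target in permutations(UTFS, 2):
    converted = transcode(text.encode(source), source, target)
    assert [ord(ch) for ch in converted.decode(target)] == code_points

# UTF-8 is defined as a sequence of bytes, so there is nothing to swap:
# U+20AC is always E2 82 AC.
assert "\u20ac".encode("utf-8") == b"\xe2\x82\xac"
```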
I hope I have made my suggestion clear; improvements to my wording are
certainly possible, as I am not a native speaker of English.
Best wishes,
Otto Stolz
This archive was generated by hypermail 2.1.2 : Thu Feb 21 2002 - 04:14:33 EST