From: Kent Karlsson (kent.karlsson14@comhem.se)
Date: Wed Apr 12 2006 - 14:43:17 CST
Walter Keutgen wrote:
> reading the *draft* standard of which you kindly provided the
ISO has a policy of only making a few (IT) standards freely available.
For the others, only the drafts (up to a point) are freely available.
> link, I can only conclude that Otto's reading is correct.
No, you've fallen for the same misleading explanation as Otto.
Please read Ken's excellent response, which is much more detailed than mine.
> See the following quote (copied and pasted):
...
> diacritical MARKS, which are 'no characters' and have
> an encoded representation that may never stand alone, but
> must be followed by a base letter or the space, as
> restricted in the 'repertoire'.
>
> Table 4 defines the character REPERTOIRE
Indeed.
> i.e. the valid combinations.
...of lead byte and tail byte (as well as valid single byte codes).
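To make the lead byte / tail byte reading concrete, here is a minimal
sketch of a decoder that accepts a two-byte combination only if it is
listed in a repertoire table. Everything in it is an assumption for
illustration: the name decode_6937_sketch, the lead-byte range
0xC1..0xCF, the handful of table entries (a tiny stand-in for table 4,
to be checked against the standard), and the shortcut of treating
bytes below 0x80 as ASCII.

```python
# Illustrative subset only; the normative list is table 4 of the standard.
# Keys are (lead byte, tail byte); values are the characters they denote.
REPERTOIRE = {
    (0xC1, ord("a")): "\u00E0",  # grave + a
    (0xC2, ord("e")): "\u00E9",  # acute + e
    (0xC8, ord("u")): "\u00FC",  # diaeresis + u
    (0xCB, ord("c")): "\u00E7",  # cedilla + c
    (0xC2, 0x20): "\u00B4",      # acute + space: the accent on its own
}

# Assumed range of lead bytes (the diacritical marks).
LEAD_BYTES = range(0xC1, 0xD0)


def decode_6937_sketch(data: bytes) -> str:
    """Decode bytes, allowing only the combinations listed in REPERTOIRE."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b in LEAD_BYTES:
            # A lead byte may never stand alone.
            if i + 1 >= len(data):
                raise ValueError("lead byte at end of input")
            pair = (b, data[i + 1])
            # The pair must be one of the combinations in the repertoire.
            if pair not in REPERTOIRE:
                raise ValueError(
                    "combination 0x%02X 0x%02X not in repertoire" % pair
                )
            out.append(REPERTOIRE[pair])
            i += 2
        elif b < 0x80:
            # Simplification: treat the low half as ASCII.
            out.append(chr(b))
            i += 1
        else:
            raise ValueError("single-byte code 0x%02X not covered here" % b)
    return "".join(out)
```

With this table, b"caf\xc2e" decodes to "café", while a cedilla lead
byte followed by "x" is rejected because that pair is not listed.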
> But there are contradictions, at least from the usability
> point of view:
>
> In Annex D:
>
> "NOTE 19
> "For spelling the Welsh language correctly, some more letters
...
I'm not sure why they did it that way, but the Welsh letters can be seen
as a "blessed optional extension".
> In 7 bit encoding, escape sequences are necessary, which will
> separate the 'lead byte' from the 'base letter'.
> In my opinion this is a strange property for a precomposed encoding.
No, but using the 7-bit variety *is* strange and cumbersome, and
as far as I know never used.
> The letter sequence 'lead', as in 'lead byte', does not appear in the
> text.
No, but that does not change the encoding technically in any way.
> "4.15 repertoire: A specified set of characters that are
> represented by one or more bit combinations of a coded
> "character set.
>
> Why 'or more bit combinations'?
Usually a repertoire has more than one element...
However, reading it closer to the way you are reading it:
It is not uncommon to have the same character represented
in several different ways (bitwise). As long as one does not
mix the 7- and 8-bit byte based versions of 6937, it does
not apply to 6937.
> The standards begins with a clear, not clumsy, combining
It is highly misleading, and therefore clumsy.
...
> sub-application. Anyway the standard seems however not to be
> released.
Yes it is, published in 2001:
http://www.iso.org/iso/en/CatalogueDetailPage.CatalogueDetail?CSNUMBER=31393&ICS1=35&ICS2=40&ICS3=
It is very unlikely to be revised (just reconfirmed), since all ISO
efforts on character standardisation are focused on ISO/IEC 10646.
> 'Annex C' is rather your opinion, but is marked 'informative'.
Annex C is just a summary of table 4, and as the summary may be
faulty it is just informative. But table 4 is normative. (Besides, I
never mentioned Annex C in my earlier posts on this thread.)
/kent k