Thank you all very much for your kind answers!
My goodness, I should have referenced the thread on the POSIX
mailing list myself, yet I guess what distinguishes the expert is
that he knows about evil character sets without such hints…
Reading your messages, it seems safe to request a clarification of
a POSIX wording (Base Definitions, 6.2 Character Encoding; [1]),
from

  Likewise, the byte values used to encode <period> and <slash>
  shall not occur as part of any other character in any locale.

to

  Likewise, the byte values used to encode <period>, <slash>,
  <newline> and <carriage-return> shall not occur as part of any
  other character in any locale.
[1] <http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap06.html#tag_06>
Of course the ISO C and POSIX facilities are insufficient for
dealing with text portably. (But this theoretical change would
turn many decades-old POSIX programs which test characters against
'\n' and '\r' into functioning software again. By definition, that is.)
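To illustrate the parenthetical above, here is a minimal Python
sketch (not from the thread) of what goes wrong when a program tests
raw bytes against '\n' in an encoding where that byte value also
occurs inside another character. UTF-16BE serves purely as a
stand-in, since U+010A encodes there as the bytes 0x01 0x0A; it is
not an encoding a POSIX locale could use, and the helper name
split_lines_bytewise is hypothetical.

```python
def split_lines_bytewise(data: bytes) -> list[bytes]:
    """Split on the byte 0x0A, the way a program comparing bytes to '\\n' would."""
    return data.split(b"\n")


# U+010A (LATIN CAPITAL LETTER C WITH DOT ABOVE) is 0x01 0x0A in UTF-16BE,
# so its second byte is indistinguishable from an encoded newline byte.
text = "a\u010Ab"
utf16 = text.encode("utf-16-be")
utf8 = text.encode("utf-8")

# The text contains no newline, yet the byte-wise split finds one in UTF-16BE.
utf16_pieces = split_lines_bytewise(utf16)

# In UTF-8 every byte of a multibyte sequence is >= 0x80, so the split is safe.
utf8_pieces = split_lines_bytewise(utf8)
```

The proposed wording would guarantee, for every conforming locale,
the behaviour that the UTF-8 case shows here.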
P.S.: Wow! I now have an email account near the wild Rocky
Mountains! I reckon that's a good place to live. Yay!
--steffen
attached mail follows:
Hello character plus experts,
I'm wondering whether there are any multibyte character sets known
which use the numerical values of ASCII control characters that
are vital to Unix/POSIX (plus) as part of multibyte sequences?
In particular U+000A and U+000D?
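One rough way to probe this question empirically is to scan the
encodings a given runtime ships. The Python sketch below (an
illustration added here, not part of the original mail) lists BMP
characters whose multibyte encoding contains a given byte value. The
function name chars_containing_byte is hypothetical, the codec names
are Python's, and a codec table is not necessarily identical to any
particular system locale's charmap.

```python
def chars_containing_byte(codec: str, byte_value: int) -> list[str]:
    """Return BMP characters whose multibyte encoding in `codec` contains byte_value."""
    hits = []
    for cp in range(0x10000):
        try:
            encoded = chr(cp).encode(codec)
        except UnicodeEncodeError:
            continue  # not representable in this codec (or a lone surrogate)
        # Single-byte encodings are skipped: only multibyte sequences matter here.
        if len(encoded) > 1 and byte_value in encoded:
            hits.append(chr(cp))
    return hits


# Count the offending characters for LF (0x0A) and CR (0x0D) in a few codecs.
for codec in ("shift_jis", "euc_jp", "big5", "gb18030"):
    for byte_value in (0x0A, 0x0D):
        print(codec, hex(byte_value), len(chars_containing_byte(codec, byte_value)))
```

Shift_JIS is a handy sanity check for the scan itself: U+30BD
encodes as 0x83 0x5C there, so asking for 0x5C (the backslash value)
should report it.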
Thank you very much in advance (and don't forget to have a nice
weekend, will ya?)
--steffen
Received on Sat Aug 31 2013 - 09:39:56 CDT