"Dreiheller, Albrecht" <albrecht.dreiheller_at_siemens.com> wrote:
|In this context, it may be useful to know that some Chinese multi-byte
|encodings contain code points whose trail byte is 0x5C, the same value as
|the ASCII backslash "\".
|This can cause problems in C-like string literals, where \ acts as an escape character.
|
|Examples:
|
|in BIG5 (Win CP 950) Traditional Chinese
|U+03B1 maps to A3 5C
|U+4E48 maps to A4 5C
|U+4FDF maps to AB 5C
|
|in GBK (Win CP 936) Simplified Chinese
|U+2010 maps to A9 5C
|U+2558 maps to A8 5C
|U+4E57 maps to 81 5C
Thank you – well of course it is, for every very hungry caterpillar.
--steffen
attached mail follows:
From: Steffen Daode Nurpmeso, Saturday, August 31, 2013 4:37 PM
> Likewise, the byte values used to encode <period>, <slash>,
> <newline> and <carriage-return> shall not occur as part of any
> other character in any locale.
In this context, it may be useful to know that some Chinese multi-byte
encodings contain code points whose trail byte is 0x5C, the same value as
the ASCII backslash "\".
This can cause problems in C-like string literals, where \ acts as an escape character.

Examples:

in BIG5 (Win CP 950) Traditional Chinese
U+03B1 maps to A3 5C
U+4E48 maps to A4 5C
U+4FDF maps to AB 5C

in GBK (Win CP 936) Simplified Chinese
U+2010 maps to A9 5C
U+2558 maps to A8 5C
U+4E57 maps to 81 5C
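
To make the problem concrete, here is a minimal Python sketch. It assumes that Python's built-in "big5" and "gbk" codecs are close enough to the Windows code pages named above, which may not hold for every code point, so the loop at the end only prints what the local codec reports instead of asserting the table. The helper names (trail_is_backslash, naive_literal_end, decoded_literal_end) are made up for this illustration. The naive scanner works on raw bytes and treats every 0x5C as an escape, which is the failure mode described above; the second scanner decodes first and then looks at characters.

```python
def trail_is_backslash(ch: str, encoding: str) -> bool:
    """True if ch encodes to more than one byte and the last byte is 0x5C."""
    raw = ch.encode(encoding)
    return len(raw) > 1 and raw[-1] == 0x5C


def naive_literal_end(data: bytes) -> int:
    """Byte-wise scan for the closing quote of a "..." literal.

    Every 0x5C byte is treated as an escape for the following byte,
    which is wrong when 0x5C is the trail byte of a multi-byte character.
    Returns the index of the closing quote, or -1 if none is found.
    """
    i = 1  # skip the opening quote
    while i < len(data):
        if data[i] == 0x5C:
            i += 2  # skip the "escaped" byte
        elif data[i] == 0x22:
            return i
        else:
            i += 1
    return -1


def decoded_literal_end(data: bytes, encoding: str) -> int:
    """Same scan, but on decoded characters instead of raw bytes."""
    text = data.decode(encoding)
    i = 1  # skip the opening quote
    while i < len(text):
        if text[i] == "\\":
            i += 2
        elif text[i] == '"':
            return i
        else:
            i += 1
    return -1


if __name__ == "__main__":
    # Code points taken from the lists above; output depends on the local codec.
    samples = {
        "big5": [0x03B1, 0x4E48, 0x4FDF],
        "gbk": [0x2010, 0x2558, 0x4E57],
    }
    for enc, code_points in samples.items():
        for cp in code_points:
            raw = chr(cp).encode(enc)
            print(f"{enc}: U+{cp:04X} -> {raw.hex(' ').upper()} "
                  f"trail 0x5C: {trail_is_backslash(chr(cp), enc)}")

    literal = '"\u4e48"'.encode("big5")
    print("naive scan:  ", naive_literal_end(literal))
    print("decoded scan:", decoded_literal_end(literal, "big5"))
```

The point of the comparison is that any tool which interprets backslashes before it knows the encoding can swallow the closing quote, while decoding first avoids the ambiguity.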
Received on Thu Sep 05 2013 - 10:13:24 CDT