"Dreiheller, Albrecht" <albrecht.dreiheller_at_siemens.com> wrote:
 |In this context, it might be useful to know that there are some codepoints
 |in some Chinese multi-byte encodings, which contain a byte looking like
 |a Backslash "\" 0x5C as trail byte.
 |This can cause problems in C-like string literals where \ acts as a meta-character.
 |
 |Examples:
 |
 |in BIG5 (Win CP 950) Traditional Chinese 
 |U+03B1 maps to A3 5C
 |U+4E48 maps to A4 5C
 |U+4FDF maps to AB 5C
 |
 |in GBK  (Win CP 936) Simplified Chinese
 |U+2010 maps to A9 5C
 |U+2558 maps to A8 5C
 |U+4E57 maps to 81 5C
Thank you – well of course it is, for every very hungry caterpillar.
--steffen
attached mail follows:
From: Steffen Daode Nurpmeso, Saturday, August 31, 2013 4:37 PM
>  Likewise, the byte values used to encode <period>, <slash>,
>  <newline> and <carriage-return> shall not occur as part of any
>  other character in any locale.
In this context, it might be useful to know that there are some codepoints
in some Chinese multi-byte encodings, which contain a byte looking like
a Backslash "\" 0x5C as trail byte.
This can cause problems in C-like string literals where \ acts as a meta-character.

Examples:

in BIG5 (Win CP 950) Traditional Chinese
U+03B1 maps to A3 5C
U+4E48 maps to A4 5C
U+4FDF maps to AB 5C

in GBK (Win CP 936) Simplified Chinese
U+2010 maps to A9 5C
U+2558 maps to A8 5C
U+4E57 maps to 81 5C
Received on Thu Sep 05 2013 - 10:13:24 CDT