From: Rick Cameron (Rick.Cameron@businessobjects.com)
Date: Thu Aug 18 2005 - 15:44:23 CDT
By its nature, UCS-2 text will not contain any characters with scalar value greater than U+FFFF. UCS-2 is a strict subset of UTF-16, so using the converter for UTF-16 to UTF-8 will work.
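As a rough illustration of the point above, here is a minimal Python sketch that feeds UCS-2 bytes straight to a UTF-16 decoder. The function name `ucs2_to_utf8` and its `byteorder` parameter are made up for this example, and the input is assumed to be BOM-less with a known byte order.

```python
def ucs2_to_utf8(data: bytes, byteorder: str = "little") -> bytes:
    """Convert BOM-less UCS-2 bytes to UTF-8 by treating them as UTF-16.

    Hypothetical helper: the name and the byteorder argument are
    assumptions made for this sketch, not part of any library.
    """
    if len(data) % 2:
        raise ValueError("UCS-2 data must have an even number of bytes")
    # UCS-2 is a subset of UTF-16, so the matching UTF-16 codec decodes it.
    codec = "utf-16-le" if byteorder == "little" else "utf-16-be"
    return data.decode(codec).encode("utf-8")


# "A" (U+0041) followed by the euro sign (U+20AC), little-endian UCS-2.
sample = b"\x41\x00\xac\x20"
utf8_bytes = ucs2_to_utf8(sample)
```

Since genuine UCS-2 text has no surrogate code units, the decoder never has to handle a surrogate pair here.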
-----Original Message-----
From: unicode-bounce@unicode.org [mailto:unicode-bounce@unicode.org] On Behalf Of Samuel Thibault
Sent: Thursday, 18 August 2005 13:11
To: Magda Danish (Unicode)
Cc: unicode@unicode.org
Subject: Re: FW: Subj: Converting from UCS-2 to UTF-8
Magda Danish (Unicode) wrote on Thu 18 Aug 2005 12:42:21 -0700:
> But it's not clear to me that I can use any of these programs for UCS-2. I am aware that UTF-16 and UCS-2 are almost identical, but it's the "almost" that worries me. Can you confirm that the converter from UTF-16 to UTF-8 will work for converting from UCS-2 to UTF-8 without any loss or corruption of data?
It will only work if the text doesn't contain any Unicode characters
at or above U+10000.
But converting from UCS-2 to UCS-4 is really easy: just append two
\0 bytes after each 2-byte character on a little-endian machine, or
insert them before each 2-byte character on a big-endian machine. Then,
since UCS-4 == UTF-32, you can use the UTF-32 to UTF-8 converter.
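The padding step described above might be sketched in Python as follows. The helper name `ucs2_to_ucs4` and its `byteorder` parameter are invented for this example, and the byte order is taken to be that of the data itself, which is assumed to match the machine it came from.

```python
def ucs2_to_ucs4(data: bytes, byteorder: str = "little") -> bytes:
    """Widen each 2-byte UCS-2 unit to 4 bytes by adding two zero bytes.

    Hypothetical helper for illustration only.
    """
    if len(data) % 2:
        raise ValueError("UCS-2 data must have an even number of bytes")
    pad = b"\x00\x00"
    out = bytearray()
    for i in range(0, len(data), 2):
        unit = data[i:i + 2]
        if byteorder == "little":
            out += unit + pad   # low-order bytes first, zeros after
        else:
            out += pad + unit   # zeros first, low-order bytes last
    return bytes(out)


# "A" (U+0041) followed by the euro sign (U+20AC), little-endian UCS-2.
ucs4 = ucs2_to_ucs4(b"\x41\x00\xac\x20")
# The widened bytes can then go through a UTF-32 to UTF-8 conversion.
utf8_bytes = ucs4.decode("utf-32-le").encode("utf-8")
```

If the input were really UTF-16 containing surrogate pairs, this padding would produce surrogate values in the 4-byte output, which are not valid UTF-32, so the shortcut only applies to genuine UCS-2.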
Regards,
Samuel