Re: UCS-2/4 & BOM

From: Hans Aberg (haberg@math.su.se)
Date: Thu Jun 02 2005 - 06:30:46 CDT

Next message: Antoine Leca: "Re: Ligatures fi and ffi"

Previous message: Andrew West: "Re: Ligatures fi and ffi"
In reply to: Theo Veenker: "UCS-2/4 & BOM"
Next in thread: Markus Scherer: "Re: UCS-2/4 & BOM"

At 12:04 +0200 2005/06/02, Theo Veenker wrote:
>If someone sends me a text file marked charset=ISO-10646-UCS-2
>or charset=ISO-10646-UCS-4, should an initial BOM in this file have
>the same meaning as a BOM in UTF-16/32?
>
>In other words are UCS-2 and UCS-4 character encoding schemes?
>Or do the UCS-2/4 only exist as CEFs UCS-2BE, UCS-2LE, UCS-4BE
>and UCS-4LE.

Those who use the BOM encode its Unicode code point (i.e., character
number, U+FEFF), so its encoded binary value will vary from encoding to
encoding. Its use is not required by any Unicode standard, though it is
mentioned.
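
To illustrate the point about the binary value varying, here is a small
sketch in Python. It assumes Python's built-in UTF-16/UTF-32 codecs as
stand-ins (Python has no codecs named UCS-2 or UCS-4; for a BMP code point
such as U+FEFF the code unit values are the same). The helper name
bom_bytes is made up for this example.

```python
# U+FEFF is the code point used as the byte order mark (BOM).
BOM = "\ufeff"

def bom_bytes(encoding: str) -> bytes:
    """Encode the single code point U+FEFF in the given encoding.

    Hypothetical helper for illustration; it just calls str.encode().
    """
    return BOM.encode(encoding)

# The explicit-endian codecs are used so that Python does not prepend
# a BOM of its own; we only see the encoding of U+FEFF itself.
for name in ("utf-8", "utf-16-be", "utf-16-le", "utf-32-be", "utf-32-le"):
    print(f"{name:10} {bom_bytes(name).hex()}")
```

The same code point comes out as a different byte sequence in each case,
and a reader can use that initial sequence to guess the byte order.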

-- 
   Hans Aberg

This archive was generated by hypermail 2.1.5 : Thu Jun 02 2005 - 06:32:47 CDT