Hi Folks,
In the book "Fonts & Encodings" it says (I think) that endianness is relevant only when storing data on disk.
Why is endianness not relevant when data is in memory?
On page 62 it says:
... when we store ... data on disk, we write
not 32-bit (or 16-bit) numbers but series of
four (or two) bytes. And according to the
type of processor (Intel or RISC), the most
significant byte will be written either first
(the "big-endian" system) or last (the
"little-endian" system). Therefore we have
both a UTF-32BE and a UTF-32LE, a UTF-16BE
and a UTF-16LE.
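To make sure I follow the disk part, here is a small Python sketch of my own (not from the book) showing the same code point serialized in the two byte orders:

    ch = "\u00E9"                        # U+00E9, LATIN SMALL LETTER E WITH ACUTE
    print(ch.encode("utf-32-be").hex())  # 000000e9 -- most significant byte first
    print(ch.encode("utf-32-le").hex())  # e9000000 -- least significant byte first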
Then, on page 63 it says:
... UTF-16 or UTF-32 ... if we specify one of
these, either we are in memory, in which case
the issue of representation as a sequence of
bytes does not arise, or we are using a method
that enables us to detect the endianness of the
document.
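I assume the detection method it has in mind is the byte order mark (U+FEFF). Here is a rough Python sketch of how I picture that check for UTF-32; the function name is just my own:

    def detect_utf32_endianness(data: bytes) -> str:
        # U+FEFF at the start of the stream reveals the byte order.
        if data[:4] == b"\x00\x00\xfe\xff":
            return "big-endian"
        if data[:4] == b"\xff\xfe\x00\x00":
            return "little-endian"
        return "unknown (no UTF-32 BOM)"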
When data is in memory, isn't it important to know whether the most significant byte is first or last?
Does this mean that when exchanging Unicode data across the Internet, endianness is not relevant?
Are these stated correctly:
When Unicode data is in a file we would say, for example, "The file contains UTF-32BE data."
When Unicode data is in memory we would say, "There is UTF-32 data in memory."
When Unicode data is sent across the Internet we would say, "The UTF-32 data was sent across the Internet."
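For what it's worth, Python seems to reflect that distinction: the unlabeled codec writes a BOM so the byte order can be detected later, while the labeled codecs do not (my own sketch):

    bom = (b"\xff\xfe\x00\x00", b"\x00\x00\xfe\xff")
    print("A".encode("utf-32").startswith(bom))   # True  -- BOM present, order detectable
    print("A".encode("utf-32-be")[:4].hex())      # 00000041 -- no BOM, order fixed by the label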
/Roger