From: Hans Aberg (haberg@math.su.se)
Date: Wed Apr 15 2009 - 03:00:48 CDT
On 12 Apr 2009, at 23:05, Philippe Verdy wrote:
> Actually there's a difference between the two encoding schemes:
> - UTF-8 assumes that "bytes" can contain at least 8 significant bits
> and it assigns specific meaning to the 8th bit, but does not assume
> anything about possible extra bits that may be left unused beyond the
> 8 lowest bits in the same addressable unit of memory (a byte is not
> necessarily 8 bits wide; think of it as if we had used the term "code
> unit" for "byte"; in fact two bytes may also not be separated by one
> increment of addressable memory, because bit-addressable memory also
> exists, even if, today, most systems have adopted alignment
> constraints for blocks of successive bits making up a single byte).
> - ASCII just assumes that "bytes" can contain at least 7 significant
> bits (but it indicates absolutely nothing about code units that are
> not in the range 0 to 127).
In fact, computer standards that require exactly 8 bits may use the
word "octet":
http://en.wikipedia.org/wiki/Octet_(computing)
The link says the word is commonly used in France, among other places.
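To make the 8th-bit point above concrete, here is a minimal sketch (in
C; my own choice, since nothing in the thread specifies a language) of
how each octet value is classified structurally under UTF-8, versus
ASCII, which only defines values 0 to 127:

#include <stdio.h>

/* Classify a single octet (exactly 8 bits) the way UTF-8 interprets it.
 * ASCII only defines values 0..127, so the high bit is always 0; UTF-8
 * gives the high bit(s) a structural meaning. This is a purely
 * structural classification; it ignores overlong and out-of-range
 * sequences. */
static const char *classify(unsigned char b)
{
    if (b < 0x80) return "ASCII range / single-byte UTF-8 (0xxxxxxx)";
    if (b < 0xC0) return "UTF-8 continuation byte (10xxxxxx)";
    if (b < 0xE0) return "lead byte of a 2-byte sequence (110xxxxx)";
    if (b < 0xF0) return "lead byte of a 3-byte sequence (1110xxxx)";
    if (b < 0xF8) return "lead byte of a 4-byte sequence (11110xxx)";
    return "not valid in UTF-8";
}

int main(void)
{
    /* "e" with acute accent, U+00E9, is the two octets 0xC3 0xA9 in UTF-8. */
    unsigned char sample[] = { 'A', 0xC3, 0xA9 };
    for (size_t i = 0; i < sizeof sample; i++)
        printf("0x%02X: %s\n", sample[i], classify(sample[i]));
    return 0;
}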
Hans