My reading of the UTF-8 definition in Appendix A of the Unicode
specification is that UTF-8 is defined as a sequence of bytes in a
particular order. A C program that writes correct UTF-8 data should
therefore produce the same output on big-endian and little-endian
architectures. Is this accurate? Or can the bytes get swapped anywhere
on different platforms?
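
For concreteness, here is a minimal sketch of the kind of program I have
in mind (the utf8_encode function and the Euro-sign example are just my
own illustration, not anything from the spec). Since each byte is
computed with shifts and masks and written individually, the order of
bytes in the output is fixed by the UTF-8 definition rather than by the
host's byte order:

#include <stdio.h>

/* Encode one code point (U+0000..U+10FFFF) into buf; return byte count. */
static int utf8_encode(unsigned long cp, unsigned char buf[4])
{
    if (cp < 0x80) {
        buf[0] = (unsigned char)cp;
        return 1;
    } else if (cp < 0x800) {
        buf[0] = 0xC0 | (unsigned char)(cp >> 6);
        buf[1] = 0x80 | (unsigned char)(cp & 0x3F);
        return 2;
    } else if (cp < 0x10000) {
        buf[0] = 0xE0 | (unsigned char)(cp >> 12);
        buf[1] = 0x80 | (unsigned char)((cp >> 6) & 0x3F);
        buf[2] = 0x80 | (unsigned char)(cp & 0x3F);
        return 3;
    } else {
        buf[0] = 0xF0 | (unsigned char)(cp >> 18);
        buf[1] = 0x80 | (unsigned char)((cp >> 12) & 0x3F);
        buf[2] = 0x80 | (unsigned char)((cp >> 6) & 0x3F);
        buf[3] = 0x80 | (unsigned char)(cp & 0x3F);
        return 4;
    }
}

int main(void)
{
    unsigned char buf[4];
    int n = utf8_encode(0x20AC, buf);   /* U+20AC EURO SIGN */
    fwrite(buf, 1, (size_t)n, stdout);  /* E2 82 AC on any host, I believe */
    return 0;
}

My expectation is that this writes the bytes E2 82 AC regardless of the
machine's native byte order, but I'd like to confirm that nothing in the
platform or the C library can reorder them.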
+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | elharo@metalab.unc.edu | Writer/Programmer |
+-----------------------+------------------------+-------------------+
|               Java I/O (O'Reilly & Associates, 1999)               |
|            http://metalab.unc.edu/javafaq/books/javaio/            |
|   http://www.amazon.com/exec/obidos/ISBN=1565924851/cafeaulaitA/   |
+----------------------------------+---------------------------------+
|  Read Cafe au Lait for Java news:  http://metalab.unc.edu/javafaq/ |
|  Read Cafe con Leche for XML news: http://metalab.unc.edu/xml/     |
+----------------------------------+---------------------------------+