Re: UTF-8 endianness

From: G. Adam Stanislav (adam@whizkidtech.net)
Date: Tue May 18 1999 - 18:01:21 EDT


At 08:22 18-05-1999 -0700, Elliotte Rusty Harold wrote:
>It's my reading of the UTF-8 spec in Appendix A of the Unicode
>specification, that UTF-8 is defined as a sequence of bytes in a particular
>order. A C program that writes correct UTF-8 data produces the same output
>on big and little endian architectures, for example. Is this accurate?

Yes, that is accurate, and it is one of the beauties of UTF-8: the encoding
is defined byte by byte, so it is unambiguous and the output looks the same
on any system, big endian or little endian.
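
To illustrate the point, here is a minimal sketch of an encoder. It is not
taken from libutf-8; the name utf8_encode and its signature are made up for
this example. It only handles code points up to U+10FFFF (RFC 2279 also
defines five- and six-byte forms, omitted here) and does no other
validation. Each output byte is computed from the numeric value with shifts
and masks, and the bytes are stored one at a time in the order the spec
gives, so the in-memory byte order of the host integer never enters into it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (an assumption for this example, not libutf-8 code).
 * Encodes one code point into out[] and returns the number of bytes
 * written, or 0 if cp is above U+10FFFF. */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    if (cp < 0x80) {
        /* 0xxxxxxx */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        /* 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        /* 1110xxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        /* 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0; /* out of the range this sketch supports */
}
```

Shifts and masks operate on the value, not on its storage layout, so
writing the resulting buffer with fwrite() should give identical files on
big and little endian machines. Contrast that with writing a 16-bit or
32-bit integer straight from memory, where the byte order does depend on
the host.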

You're welcome to examine the code of my libutf-8 if you wish. You can find
it at http://www.whizkidtech.net/i18n/ . You may also want to read RFC 2279.

Adam

---
Want to design your own web counter?
Get GCL 2.10 from http://www.whizkidtech.net/gcl/


