RE: UTF-8 text samples

From: Murray Sargent (murrays@microsoft.com)
Date: Thu Oct 15 1998 - 15:41:38 EDT


Donald's UTF-8 file should begin with a UTF-8 BOM in order to identify it as
a UTF-8 encoded file. The starting bytes should be 0xEF 0xBB 0xBF. These
bytes are discarded when reading the file in and added when writing the file
out.
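
For illustration, here is a minimal sketch of that read/write convention in
Python. The function names read_utf8 and write_utf8 are hypothetical, chosen
for this example only; they are not part of any existing tool.

```python
import codecs

# The three-byte UTF-8 signature: 0xEF 0xBB 0xBF
UTF8_BOM = codecs.BOM_UTF8


def read_utf8(path):
    """Read a UTF-8 file, discarding a leading BOM if one is present."""
    with open(path, "rb") as f:
        data = f.read()
    if data.startswith(UTF8_BOM):
        data = data[len(UTF8_BOM):]
    return data.decode("utf-8")


def write_utf8(path, text):
    """Write text as UTF-8, adding the BOM at the start of the file."""
    with open(path, "wb") as f:
        f.write(UTF8_BOM)
        f.write(text.encode("utf-8"))
```

Python's built-in "utf-8-sig" codec behaves the same way, so
open(path, encoding="utf-8-sig") is probably the simpler choice in practice;
the explicit version above just makes the byte handling visible.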

Thanks
Murray

> -----Original Message-----
> From: Donald Page [SMTP:donaldp@sco.com]
> Sent: Thursday, October 15, 1998 10:25 AM
> To: Unicode List
> Subject: Re: UTF-8 text samples
>
> The above attachment should contain all of the Minimum European Subset
> encoded as UTF-8. I created it for my own testing, but feel free to use
> it.
>
> Donald
>
> On Thu, 15 Oct 1998, Frank da Cruz wrote:
>
> > Can anybody tell me where to find some UTF-8 text samples? Preferably
> > containing mainly characters from the U+0000 through U+27FF range.
> >
> > Thanks!
> >
> > - Frank
> > << File: >>


