From: Hans Aberg (haberg@math.su.se)
Date: Thu Jan 20 2005 - 18:52:37 CST
On 2005/01/20 20:10, Addison Phillips [wM] at aphillips@webmethods.com
wrote:
>> The BOM in UTF-8 is not the 0xFEFF UTF-8 encoded number, but 0xFEFF
>> appearing as though in UTF-16. 0xFEFF is Unicode number, and could be
>> still translated into UTF-8. So the BOM in UTF-8 is a really strange
>> animal.
>
> I hesitate to feed the thread, but what the heck.
>
> This is confusingly written, but I believe it is wrong.
Yes, I misunderstood that one.
> The Unicode scalar value (for the BOM character) is U+FEFF. In UTF-8 this is
> encoded as the byte sequence:
>
> 0xEF 0xBB 0xBF
>
> This is the byte sequence that Notepad writes at the start of UTF-8 files
> saved from that editor.
So they say.
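
The byte sequence quoted above can be checked by hand. A minimal Python sketch follows, assuming a hypothetical helper named utf8_encode_3byte (not from any library) that applies the three-byte UTF-8 pattern 1110xxxx 10xxxxxx 10xxxxxx, and compares the result with the standard library's own encoder and its codecs.BOM_UTF8 constant:

```python
import codecs


def utf8_encode_3byte(code_point: int) -> bytes:
    """Hypothetical helper: encode a code point in U+0800..U+FFFF by hand.

    UTF-8 uses the pattern 1110xxxx 10xxxxxx 10xxxxxx for this range,
    so the 16 bits of the code point are split 4 + 6 + 6.
    """
    if not 0x0800 <= code_point <= 0xFFFF:
        raise ValueError("code point is outside the three-byte UTF-8 range")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate code points are not valid scalar values")
    return bytes([
        0xE0 | (code_point >> 12),          # top 4 bits
        0x80 | ((code_point >> 6) & 0x3F),  # middle 6 bits
        0x80 | (code_point & 0x3F),         # low 6 bits
    ])


# U+FEFF encoded by hand, by the built-in encoder, and as a library constant.
by_hand = utf8_encode_3byte(0xFEFF)
built_in = "\ufeff".encode("utf-8")

# For comparison: the same character in big-endian UTF-16 is the bytes FE FF.
utf16_be = "\ufeff".encode("utf-16-be")

# The "utf-8-sig" codec drops a leading BOM when decoding.
stripped = (codecs.BOM_UTF8 + b"hello").decode("utf-8-sig")

print(" ".join(f"0x{b:02X}" for b in by_hand))
```

The hand encoding agrees with both the built-in encoder and codecs.BOM_UTF8, so the BOM in UTF-8 is simply U+FEFF put through the ordinary UTF-8 transformation, not the UTF-16 bytes carried over unchanged.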
> Given all the misinformation on this thread, I direct your attention to the
> FAQ:
>
> http://www.unicode.org/faq/utf_bom.html#BOM
Thanks for the pointer.
Hans Aberg