RE: UTF-8 signature in web and email

From: Marco Cimarosti (marco.cimarosti@essetre.it)
Date: Wed May 16 2001 - 05:15:26 EDT

Next message: Michael (michka) Kaplan: "Re: UTF-8 signature in web and email"
Previous message: Marco Cimarosti: "RE: Ancient writing found in Turkmenistan"
Maybe in reply to: Roozbeh Pournader: "UTF-8 signature in web and email"
Next in thread: Mark Davis: "Re: UTF-8 signature in web and email"

Keld Jørn Simonsen wrote:
> For UTF-8 there is no need to have a BOM, as there is only one
> way of serializing octets in UTF-8. There is no little-endian
> or big-endian. A BOM is superfluous and will be ignored.

Not so. In plain text, it is a useful signature to distinguish UTF-8 from
other things. See the 3rd question in
<http://www.unicode.org/unicode/faq/utf_bom.html>.

The three-byte sequence EF BB BF is unlikely to be confused with a meaningful
sequence in existing encodings. The only (unlikely) example I know of is a
couple of Hangul syllables in UTF-16.
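
To make the signature idea concrete, here is a minimal Python sketch. The
helper name has_utf8_signature is made up for illustration and is not taken
from the FAQ; it only checks for the three leading bytes and makes no attempt
at real encoding detection.

```python
import codecs


def has_utf8_signature(data: bytes) -> bool:
    """Return True if data starts with the UTF-8 signature EF BB BF."""
    return data.startswith(codecs.BOM_UTF8)


# A plain-text sample that begins with the signature.
sample = codecs.BOM_UTF8 + "Jørn".encode("utf-8")

print(has_utf8_signature(sample))        # signature present
print(has_utf8_signature(b"plain text")) # no signature

# Misread as Latin-1, the three bytes show up as three visible characters.
print(sample[:3].decode("latin-1"))

# The "utf-8-sig" codec drops the signature while decoding.
print(sample.decode("utf-8-sig"))
```

A reader that finds the signature can decode with "utf-8-sig"; one that does
not still has to rely on a label or a guess.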

However, as we are talking about text whose encoding is already identified
(e-mail, web), it is in fact quite superfluous to have a signature at all.

But then it is also superfluous for the other UTFs: what's the purpose of
using an endianness-ambiguous MIME specification (e.g. "UTF-16") and a BOM
to disambiguate it? Isn't it simpler to use an unambiguous specification in
the first place (e.g. "UTF-16BE" or "UTF-16LE")?
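
The difference between the ambiguous and the unambiguous labels can be
sketched with Python's standard codecs. This is an illustration only: it
assumes Python's codec names "utf-16", "utf-16-be" and "utf-16-le" behave like
the corresponding MIME labels, which is a simplification.

```python
import codecs

text = "abc"

# Unambiguous labels: the byte order is fixed by the label, no BOM is written.
be = text.encode("utf-16-be")
le = text.encode("utf-16-le")
print(be.hex())  # 006100620063
print(le.hex())  # 610062006300

# Ambiguous label: the encoder prepends a BOM so the reader can tell the order.
generic = text.encode("utf-16")
print(generic[:2] in (codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE))

# The reader of "utf-16" consumes the BOM to pick the byte order.
print(generic.decode("utf-16"))

# With an explicit label, a leading BOM is not consumed: it stays in the
# text as the character U+FEFF.
with_bom = codecs.BOM_UTF16_BE + be
print(with_bom.decode("utf-16-be") == "\ufeff" + text)
```

With "utf-16-be" or "utf-16-le" the label alone is enough to decode the bytes,
which is the point made above.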

BTW, I understand that BOM is just a nickname now: the character has been
renamed as "ZERO WIDTH NO-BREAK SPACE".
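
The character name can be checked against a copy of the Unicode character
database. A small sketch, assuming the unicodedata module shipped with Python
reflects the published names:

```python
import unicodedata

bom = "\ufeff"

# Formal character name of U+FEFF in the character database.
print(unicodedata.name(bom))      # ZERO WIDTH NO-BREAK SPACE

# Its UTF-8 encoding is the three-byte signature discussed above.
print(bom.encode("utf-8").hex())  # efbbbf
```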

_ Marco

This archive was generated by hypermail 2.1.2 : Fri Jul 06 2001 - 00:18:17 EDT