Re: UTF-8 BOM (Re: Charset declaration in HTML)

From: David Starner <prosfilaes_at_gmail.com>
Date: Thu, 12 Jul 2012 04:52:43 -0700

On Thu, Jul 12, 2012 at 4:06 AM, Leif Halvard Silli
<xn--mlform-iua_at_xn--mlform-iua.no> wrote:
> I guess you get the same problem with UTF-16 files also, then?

A UTF-16 file isn't a text file in the Unix world; it's a binary file.
UTF-8 is the only standard Unicode encoding that acts like text to a
Unix system, basically because it was designed to. I'm not personally
stressed about UTF-8 BOMs. Unix software won't produce them, and text
from other operating systems has always taken some massaging anyway,
to get it out of CP1252 and get the newlines right. Removing the BOM
is easy and non-destructive, and leaving it in just means you have a
few BOMs stuck in your text, which is usually no big deal.
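
To illustrate how little work the removal is, here is a minimal sketch
in Python. The language choice and the helper name strip_utf8_bom are
assumptions made for illustration, not anything from the thread; the
only fact relied on is that the UTF-8 BOM is the three bytes EF BB BF
at the very start of the data.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if one is present.

    strip_utf8_bom is a hypothetical helper name used for illustration.
    Only a BOM at the very start is removed; the rest is left untouched.
    """
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


# Sample input: a BOM followed by ordinary UTF-8 text.
with_bom = codecs.BOM_UTF8 + "héllo\n".encode("utf-8")
without_bom = strip_utf8_bom(with_bom)

# Stripping is idempotent: a second pass changes nothing.
assert strip_utf8_bom(without_bom) == without_bom
```

If the goal is decoded text rather than bytes, Python's "utf-8-sig"
codec should do the same job while decoding: it skips one leading BOM
if there is one and otherwise behaves like plain UTF-8.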

-- 
Kie ekzistas vivo, ekzistas espero.
Received on Thu Jul 12 2012 - 06:54:08 CDT
