Re: Charset declaration in HTML

From: Philippe Verdy <verdy_p_at_wanadoo.fr>
Date: Wed, 11 Jul 2012 14:15:39 +0200

2012/7/11 Jean-François Colson <jf_at_colson.eu>

> If your document only contains
>
> <?php
> header("location:http://unicode.org");
> ?>
>
> but you save it with a BOM, the BOM will be sent and you’ll get an error
> message like
>
> Warning: Cannot modify header information - headers already sent by
> (output started at /customers/0/1/f/colson.eu/httpd.www/test.php:1) in
> /customers/0/1/f/colson.eu/httpd.www/test.php on line 2
>
> (tested with Firefox 13.0 and Google Chrome 20.0.1132.47 on Ubuntu 12.04.)
>

This is most probably a bug that should be reported to PHP: a leading BOM in
a PHP source file should be silently ignored. But the file is probably not
being read as UTF-8 in the first place. Did you try to indicate the source
encoding with a command-line flag when starting PHP? The BOM should make it
unnecessary to pass the source encoding as a flag. Another way would be to
start PHP from a console environment where UTF-8 is in the initial locale.

Note that the behavior of PHP highly depends on which SAPI was preselected
when it was compiled.
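
A script can at least inspect its runtime environment before relying on it.
The following is only a sketch: describeRuntime() is a hypothetical helper
name, and the three values it reports are just examples of what may differ
from one SAPI or configuration to another.

```php
<?php
// describeRuntime() is a hypothetical helper, not part of PHP itself.
function describeRuntime(): array
{
    return [
        'sapi'            => PHP_SAPI,                   // same value as php_sapi_name()
        'headers_sent'    => headers_sent(),             // true once headers have already been sent
        'default_charset' => ini_get('default_charset'), // charset PHP announces by default
    ];
}

print_r(describeRuntime());
```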

Similar considerations apply to the encoding of source files in
other languages (including C, C++, Java, C#, shell scripts...) if they
allow various encodings for the source files: either recognize the
leading UTF-8-encoded BOM automatically as meaning UTF-8, or allow an
external flag or parameter to specify that UTF-8 is used in the sources
(and in that case also recognize BOMs and filter them out, at least at
the leading position of any source file).
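
A minimal sketch of that filtering, written in PHP only for illustration.
The names UTF8_BOM, hasLeadingBom() and stripLeadingBom() are hypothetical,
and this is not how the PHP engine itself reads source files; it only shows
that recognizing the three bytes EF BB BF at the very start of a file is
cheap.

```php
<?php
// Hypothetical helpers: detect and drop a leading UTF-8 BOM in source text.
const UTF8_BOM = "\xEF\xBB\xBF"; // U+FEFF encoded as UTF-8

// Returns true if the text starts with the UTF-8-encoded BOM.
function hasLeadingBom(string $source): bool
{
    return strncmp($source, UTF8_BOM, 3) === 0;
}

// Returns the text without a leading BOM; U+FEFF elsewhere is left untouched.
function stripLeadingBom(string $source): string
{
    return hasLeadingBom($source) ? substr($source, 3) : $source;
}

// Usage: a source text saved with a BOM in front of the opening tag.
$saved = UTF8_BOM . '<?php header("location:http://unicode.org"); ?>';
var_dump(hasLeadingBom($saved));
echo stripLeadingBom($saved), "\n";
```

A tool like this could be run over the files before deployment, so that no
byte precedes the opening tag and nothing is output before header() is
called.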

Anyway, I've never used any PHP source files that needed something other than
7-bit ASCII in them. Translatable items were kept only in external resource
files, including for the static HTML part generated by the PHP scripts. For
some very limited uses, such as encoding some constant separators, I
usually define a symbolic constant in numeric entity form. The
main reason for that is exactly that I want scripts that will
run correctly independently of the SAPI and of the host system, or that
will auto-adapt to this runtime environment (and possibly to different
compilation settings of the PHP engine itself, which has lots of options set
at compile time or in INI files, or processed and filtered out by the SAPI
interface).
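
As an illustration of such a constant, here is a sketch that assumes
"numeric entity form" means an HTML numeric character reference. The names
SEPARATOR_ENTITY and SEPARATOR and the choice of U+2022 are hypothetical; the
point is only that the source file itself stays pure 7-bit ASCII.

```php
<?php
// Hypothetical constant names; the file contains no byte above 0x7F.
const SEPARATOR_ENTITY = '&#8226;'; // U+2022 BULLET as a numeric character reference

// For HTML output the reference can be emitted as-is.
// For non-HTML output, decode it once into UTF-8 bytes at run time.
define('SEPARATOR', html_entity_decode(SEPARATOR_ENTITY, ENT_QUOTES | ENT_HTML5, 'UTF-8'));

// Usage in generated HTML: the browser resolves the reference itself.
echo implode(' ' . SEPARATOR_ENTITY . ' ', ['Home', 'Docs']), "\n";
```
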
Received on Wed Jul 11 2012 - 07:17:44 CDT
