"Leif H Silli" <xn--mlform-iua_at_xn--mlform-iua.no> wrote:
 |We now have some data that indicates that what Unicode says about the UTF-8 
 |BOM is worded in a way that is possible to misunderstand. I support you in 
Yeah! Yeah! Yeah! That is good to read, black on #FCFCF9.
 |Steven replied:
 |
 |>>In XML 1.0 the BOM is in fact described as a signature regardless of 
 |>> which Unicode encoding it is used with:
 |>>
 |>>  |http://www.w3.org/TR/xml/#charencoding
 |>
 |> Yes, simply spelled out and clarified like that, everybody
 |> knows what they are dealing with.
 |>
 |> And btw., my local copy of XML 1.1 (Second Edition, thus current)
 |> doesn't include this paragraph (in the referenced 4.3.3):
 |>
 |>   |If the replacement text of an external entity is to begin with
 |>   |the character U+FEFF, and no text declaration is present, then
 |>   |a Byte Order Mark MUST be present, whether the entity is encoded
 |>   |in UTF-8 or UTF-16.
 |
 |I think you must reread. I find the same "signature" sentence in XML 1.1:
 |
 |http://www.w3.org/TR/xml11/#charencoding
 | 
 |> But I don't see the big picture of all those markup standards; I
 |> just have them in case my own work raises some questions.
 |
 |We now have some data that indicates that what Unicode says about the UTF-8 
 |BOM is worded in a way that is possible to misunderstand. I support you in 
 |that Unicode should be more explicit about the fact that
 |
 |* it is neutral about the BOM in UTF-8 (currently it is possible to read it 
 |as if Unicode advises against the BOM)
 |
 |* The BOM is an encoding signature - for both UTF-8 and UTF-16.
 |--
 |leif halvard silli 
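
To make the "signature" reading above concrete, here is a minimal sketch in Python of treating a leading BOM purely as an encoding signature for UTF-8 and UTF-16. The function names sniff_signature and decode_with_signature are hypothetical, chosen only for this illustration, and the fallback to UTF-8 when no BOM is found is an assumption rather than anything Unicode or XML prescribes. UTF-32 is deliberately left out (its little-endian BOM begins with the same two bytes as the UTF-16 one), so this is not a complete detector.

```python
import codecs

# BOM byte sequences mapped to the encoding they signal.
# Only UTF-8 and UTF-16 are covered, matching the thread; UTF-32 is ignored.
_SIGNATURES = (
    (codecs.BOM_UTF8, "utf-8"),         # EF BB BF
    (codecs.BOM_UTF16_BE, "utf-16-be"),  # FE FF
    (codecs.BOM_UTF16_LE, "utf-16-le"),  # FF FE
)


def sniff_signature(data: bytes):
    """Return (encoding, bom_length), or (None, 0) if no BOM is present."""
    for bom, name in _SIGNATURES:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


def decode_with_signature(data: bytes, default: str = "utf-8") -> str:
    """Decode data, using a leading BOM as the signature and stripping it.

    The default encoding is an assumption made for this sketch.
    """
    encoding, skip = sniff_signature(data)
    return data[skip:].decode(encoding or default)
```

In this sketch a UTF-8 BOM is neither required nor rejected: input with or without it decodes to the same text, which is the "neutral" behaviour the first point asks Unicode to state more explicitly.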
Received on Mon Jul 30 2012 - 06:00:42 CDT