Re: HTML5 encodings

From: Christoph Päper (christoph.paeper@crissov.de)
Date: Tue Dec 22 2009 - 02:34:06 CST

Next message: Otto Stolz: "Re: Medievalist ligature character in the PUA"

Previous message: Peter Krefting: "Re: HTML5 encodings (was: Re: BOCU patent)"
In reply to: verdy_p: "RE: HTML5 encodings (was: Re: BOCU patent)"
Next in thread: Doug Ewell: "Re: HTML5 encodings (was: Re: BOCU patent)"

verdy_p:
>
> The question of charsets is really the least complex one to handle
> in a browser,

Using one correctly is not a problem; choosing the right one is.

On the Web there are a number of different places to declare an
encoding, contradictory defaults, byte-identical standards, supersets
and subsets, inflexible server software, differing database backends,
misleading coding software, bad advice, unskilled developers, copy
and paste solutions, workarounds, hacks for minority scripts, old
unmaintained content, mashups of applications, heuristics close to
magic and people as stupid as ever.
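
To make the "different places" point concrete, here is a minimal
sketch in Python that collects what each source claims for one
response. The helper name declared_encodings, the sample page and the
simplified regular expressions are all assumptions for illustration;
this is not the sniffing algorithm of any browser or of HTML5.

```python
import codecs
import re

def declared_encodings(body: bytes, content_type: str = "") -> dict:
    """Collect the encoding each source claims (hypothetical helper)."""
    found = {}
    # 1. A byte order mark at the very start of the body.
    if body.startswith(codecs.BOM_UTF8):
        found["bom"] = "utf-8"
    elif body.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        found["bom"] = "utf-16"
    # 2. The charset parameter of the HTTP Content-Type header.
    m = re.search(r'charset\s*=\s*"?([\w.:-]+)', content_type, re.I)
    if m:
        found["http"] = m.group(1).lower()
    # 3. A <meta> declaration near the top of the document
    #    (simplified pattern, assumed good enough for this sample).
    m = re.search(rb'<meta[^>]*charset\s*=\s*["\']?([\w.:-]+)',
                  body[:1024], re.I)
    if m:
        found["meta"] = m.group(1).decode("ascii").lower()
    return found

# Sample page: the server says UTF-8, the markup says ISO-8859-1.
page = (b'<html><head><meta charset="iso-8859-1"></head>'
        b'<body>\xc3\xa4</body></html>')
claims = declared_encodings(page, "text/html; charset=UTF-8")
print(claims)  # {'http': 'utf-8', 'meta': 'iso-8859-1'}

# The same two bytes read differently depending on who is believed.
print(b"\xc3\xa4".decode("utf-8"))       # one character: ä
print(b"\xc3\xa4".decode("iso-8859-1"))  # two characters: Ã¤
```

Each declaration is perfectly usable on its own; the trouble only
starts when they disagree and something has to pick a winner.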

Ever wondered why browsers still provide the possibility to select an
encoding manually?

> by violating rules that were validated and tested for XML and past
> versions, the new prohibition will just create more problems than
> what it will solve, because it simply violates the intended target
> which was "compatibility with legacy applications"

Do you know any content currently on the Web encoded in a way
prohibited by HTML5?

If encodings are currently unused or unsupported and offer only
insignificant advantages but potentially significant disadvantages
(esp. security-wise), it can be a sound choice not to use them at
all.

> because C1 controls were in theory forbidden in HTML and XML...
> except the NEXT LINE control inherited from EBCDIC and mapped at
> 0x85 in ISO-8859-1 and part of compressible whitespaces and of line
> separators,

Where do you get that from?

> Is HTML5 already a dead standard,

Quite the contrary, XHTML2 is dead. As much as I would love a lean,
modular, systematic, well-designed text and application markup
language for the Web (which XHTML1 is not and XHTML2 would not have
been either) in theory, the pragmatic course taken by WHATWG is likely
to succeed in practice.

HTML5 has an "XML serialization" by the way.

> in fact the battle is not there: it is in the evolution of
> stylesheets, i.e. CSS3 where we should be more interested to have
> it support a better typography.

Typography (i.e. styling the 'inscription' itself) is not the main
focus of Level 3, though. Most of it is handled in just two modules:
CSS3 Text and CSS3 Fonts (Editor's Drafts).
<http://dev.w3.org/csswg/css3-text/>
<http://dev.w3.org/csswg/css3-fonts/>

> What I really hope is that browser will prefer violating the stupid
> HTML5 rules,

Do you know who established and who funds WHATWG? (Google is now a
browser maker, too.)

> Who suggested these violation rules? All seems to indicate Microsoft,

They were partly suggested because of MS, but not by MS.

> as it really looks inspired by existing standard violations found
> in IE,

Indeed, in the real world it is often vital to mimic even the
failures of the top dog to stay alive, and even after surpassing it
at some point there is hardly a path back. This has largely already
happened in the browser world (including the handling of character
encoding), but it is now publicly documented in HTML5. This actually
makes it easier for new players to catch up.
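
One encoding example of such mimicry, as I understand it: browsers
have long decoded pages labelled ISO-8859-1 as windows-1252, and the
HTML5 drafts write that behaviour down. The Python sketch below only
shows why the choice is visible to users. It relies on Python's own
codec tables and says nothing about how a given browser is built.

```python
# Byte 0x85 is a C1 control (NEXT LINE) in ISO-8859-1, but a printable
# character (HORIZONTAL ELLIPSIS) in the windows-1252 superset.
raw = b"\x85"

as_latin1 = raw.decode("iso-8859-1")
as_cp1252 = raw.decode("windows-1252")

print(hex(ord(as_latin1)))  # 0x85
print(hex(ord(as_cp1252)))  # 0x2026
```

A page that is labelled ISO-8859-1 but was typed on Windows only shows
its ellipses if the label is deliberately "violated".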

> The more I read the HTML5 proposal, the more I see problems in it.

It's an open and not a finished spec, you know.

> The violations adopted on purpose are really a big hint to alert
> others: don't use it, keep HTML4 or go directly to XHTML.

Ouch, you really have no idea what you are talking about here.
Besides, the encoding issue is really not that important, as
basically everyone can be assumed to be using UTF-8 now or soon.


This archive was generated by hypermail 2.1.5 : Tue Dec 22 2009 - 02:35:35 CST