From: karl williamson (public@khwilliamson.com)
Date: Fri Jan 29 2010 - 10:44:45 CST
Mark Davis ☕ wrote:
> FYI, they managed to use the larger image before most people saw it.
>
> Mark
>
>
>
> On Fri, Jan 29, 2010 at 07:06, Mark Davis ☕ <mark@macchiato.com> wrote:
>> It is encodings determined by a detection algorithm. The declarations
>> for encodings (and language) are far too unreliable to be depended on.
>> The detection algorithm itself is fairly complex, but quite fast and
>> compact.
>>
>> Mark
>>
>>
>>
>> On Thu, Jan 28, 2010 at 21:38, Simon Montagu <smontagu@smontagu.org> wrote:
>>> On 28/01/2010 10:50, Mark Davis ☕ wrote:
>>>> There's a blog on Unicode that people may find interesting:
>>>> http://googleblog.blogspot.com/2010/01/unicode-nearing-50-of-web.html
>>>>
>>>> (The graph on Unicode is too small; until they get that fixed, I have
>>>> the large one on http://www.macchiato.com/)
>>>>
>>>> Mark
>>> What exactly is this counting? Encodings declared internally in web-pages?
>>> Encodings declared in HTTP headers? Encodings determined by auto-detection?
>>> Some combination of the above?
>>>
>>> --
>>> Simon Montagu
>>> Mozilla internationalization
>>> סיימון מונטגיו
>>>
>>>
>
>
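Mark's note that the numbers come from a detection algorithm rather than
from the declared encodings suggests a byte-level heuristic along these
lines. This is only my guess at the rough shape, sketched in Python; the
real detector is surely more elaborate (byte-frequency statistics,
language models, and so on):

    def guess_encoding(raw: bytes) -> str:
        """Guess a page's encoding from its raw bytes, ignoring declarations."""
        if raw.startswith(b"\xef\xbb\xbf"):
            return "UTF-8"      # UTF-8 byte-order mark
        if all(b < 0x80 for b in raw):
            return "ASCII"      # pure 7-bit content
        try:
            raw.decode("utf-8")
            return "UTF-8"      # well-formed multibyte sequences
        except UnicodeDecodeError:
            return "legacy"     # would need frequency statistics to narrow down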
Since ASCII is a proper subset of UTF-8, this means effectively that 2/3
of the web is using UTF-8, up from about 57% in 2001. So the sum of the
two has a much shallower slope than the UTF-8 curve alone.
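To make the subset claim concrete: every ASCII byte sequence is already
well-formed UTF-8 and decodes to the same characters, so a pure-ASCII
page is valid UTF-8 "for free". A quick check in Python:

    ascii_bytes = "plain ASCII text".encode("ascii")
    assert ascii_bytes.decode("utf-8") == ascii_bytes.decode("ascii")
    # The converse fails: characters above U+007F need multibyte sequences.
    assert "café".encode("utf-8") != "café".encode("latin-1")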
Since the two are distinguished in the counts, I'm guessing that many
more web pages now have at least one non-ASCII character on them than
there used to be?
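If the crawl distinguishes the two along those lines, the tally might
look something like this (a hypothetical sketch of the bookkeeping, not
how Google actually counts):

    from collections import Counter

    def tally(pages):
        """Classify raw page bytes as pure ASCII, UTF-8, or something else."""
        counts = Counter()
        for raw in pages:
            if all(b < 0x80 for b in raw):
                counts["ASCII"] += 1       # no non-ASCII characters at all
            else:
                try:
                    raw.decode("utf-8")
                    counts["UTF-8"] += 1   # at least one non-ASCII character
                except UnicodeDecodeError:
                    counts["other"] += 1   # some legacy 8-bit encoding
        return counts

    # e.g. tally([b"hello", "caf\u00e9".encode("utf-8"), b"\xff\xfe"])
    # -> Counter({'ASCII': 1, 'UTF-8': 1, 'other': 1})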