RE: Codepage autodetection, Was FYI: Google blog on Unicode

From: Shawn Steele (Shawn.Steele@microsoft.com)
Date: Tue Feb 09 2010 - 11:14:13 CST


    It's way worse than just differences between ISO & Windows charsets.

    Lots of content (usually older) was posted with tags that didn't match its actual encoding. People created content in their local code page, then posted it without realizing that other systems might use a different one. That content often ended up tagged incorrectly because the client and server had different ANSI code pages, so it was mis-tagged or mis-advertised as 1252 (or something else) when it really wasn't.
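    A minimal sketch of that failure mode, in Python. The sample text and the 1251/1252 pairing are assumptions picked for illustration, not taken from any real site: bytes written under one ANSI code page are read back under a wrong 1252 tag.

```python
# Illustrative only: the sample text and the code page pairing are assumptions.
original = "Привет"                      # authored on a machine using code page 1251

raw_bytes = original.encode("cp1251")    # the bytes that actually get stored or sent
mislabeled = raw_bytes.decode("cp1252")  # a consumer trusting a wrong "1252" tag
correct = raw_bytes.decode("cp1251")     # a consumer given the right tag

print(mislabeled)
print(correct)
```

    The bytes are identical in both reads; only the tag differs. Under the wrong tag the text comes out as "Ïðèâåò", and no error is raised, which is part of why this kind of mis-tagging goes unnoticed.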

    The good news is that using UTF-8 pretty much "fixes" that problem, and Mark's chart shows UTF-8 is gaining ground. Hopefully in the future even more will be UTF-8 and this will become less of a problem.
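    One related property, sketched in Python under the assumption that a strict decode is an acceptable check: non-ASCII text in a legacy code page is rarely valid UTF-8, so UTF-8 content is much harder to confuse with something else. The helper name looks_like_utf8 and the sample text are hypothetical.

```python
def looks_like_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as strict UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


sample = "Привет"  # assumed sample text, for illustration only

utf8_ok = looks_like_utf8(sample.encode("utf-8"))     # genuine UTF-8 bytes
legacy_ok = looks_like_utf8(sample.encode("cp1251"))  # legacy code page bytes

print(utf8_ok, legacy_ok)
```

    Pure ASCII passes this check in either case, so it only helps once non-ASCII characters are involved, and it says nothing about which legacy code page a failing byte string was really in.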

    -Shawn


