From: Markus Scherer (markus.icu@gmail.com)
Date: Mon Feb 04 2008 - 11:47:42 CST
Most Unicode software and libraries use UTF-16 internally, which is easy to use.
Some use UTF-8 even internally, if they see a large majority of
high-volume text in ASCII.
UTF-32 as a string encoding is rare. (Some people call single-code
point integers "in UTF-32".)
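
To make the size trade-off between the three encoding forms concrete, here is a minimal sketch that counts code units for a few strings. The sample strings and the helper name code_units are illustrative assumptions, not taken from any particular library.

```python
def code_units(text):
    """Return (UTF-8 bytes, UTF-16 code units, UTF-32 code units) for text."""
    # The "-le" codecs are used so that no byte order mark is added to the output.
    utf8 = len(text.encode("utf-8"))
    utf16 = len(text.encode("utf-16-le")) // 2
    utf32 = len(text.encode("utf-32-le")) // 4
    return utf8, utf16, utf32


# Hypothetical samples: pure ASCII, one non-ASCII BMP character,
# and one supplementary character (U+1D11E MUSICAL SYMBOL G CLEF).
samples = {
    "ascii": "hello",
    "bmp": "h\u00e9llo",
    "supplementary": "a\U0001D11Eb",
}

for name, text in samples.items():
    print(name, code_units(text))
```

For ASCII text all three counts are equal, so UTF-8 uses one byte per character where UTF-32 uses four. The supplementary character takes four bytes in UTF-8, two code units (a surrogate pair) in UTF-16, and one code unit in UTF-32.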
Roll your own encoding form, and you can't use any existing libraries... Why?
markus
On Feb 4, 2008 5:49 AM, Hans Aberg <haberg@math.su.se> wrote:
> I think that 32-bit is probably best for internal use in programs for
> speed, avoiding alignment problems; the best way to actually know is
> to do some profiling. Externally, for distributed files, UTF-8 seems
> best, because most agree on how to sort out the bits in the bytes.
-- Opinions expressed here may not reflect my company's positions unless otherwise noted.