Software support costs (was: Nicest UTF

From: Carl W. Brown (cbrown@xnetinc.com)
Date: Fri Dec 10 2004 - 10:58:11 CST


    Philippe,

    > Also a broken opening tag for HTML/XML documents

    In addition to not having endian problems, UTF-8 is also useful when tracing
    intersystem communications data: XML and other tags are usually in the ASCII
    subset of UTF-8, so they stand out and make it easier to find the specific
    data you are looking for.
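
To illustrate the tracing point, here is a minimal sketch of how the same message might look in a simple trace dump under UTF-8 and under UTF-16. The helper `printable_trace` and the sample message are hypothetical, made up for this illustration; a real trace tool may render bytes differently.

```python
def printable_trace(data: bytes) -> str:
    """Render bytes as a simple trace dump might: printable ASCII as-is, everything else as '.'."""
    return "".join(chr(b) if 0x20 <= b < 0x7F else "." for b in data)


# Hypothetical sample message containing one non-ASCII character (U+00EB).
message = "<name>Zo\u00eb</name>"

utf8_trace = printable_trace(message.encode("utf-8"))
utf16_trace = printable_trace(message.encode("utf-16-le"))

print(utf8_trace)   # tags remain contiguous ASCII text
print(utf16_trace)  # every ASCII character is followed by a zero byte, shown as '.'
```

In the UTF-8 dump the tag can be found with a plain text search; in the UTF-16 dump the interleaved zero bytes break up the tag text.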

    However, within the program itself, UTF-8 presents a problem when looking for
    specific data in memory buffers. Because each character takes a variable
    number of bytes, the work is nasty, time-consuming, and error-prone. Mapping
    UTF-16 to code points is a snap as long as you do not have a lot of
    surrogates. If you do, then UTF-32 should probably be considered.
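
As a rough sketch of why UTF-16 maps easily to code points, the function below walks a list of 16-bit code units. Outside the surrogate range a unit is already the code point; only surrogate pairs need arithmetic. The name `utf16_to_code_points` is hypothetical, and passing unpaired surrogates through unchanged is an assumption made for this example, not a recommendation.

```python
def utf16_to_code_points(units):
    """Map a sequence of UTF-16 code units to Unicode code points."""
    code_points = []
    i = 0
    while i < len(units):
        unit = units[i]
        is_high = 0xD800 <= unit <= 0xDBFF
        has_low = i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF
        if is_high and has_low:
            # Surrogate pair: combine the two 10-bit halves and add the 0x10000 offset.
            low = units[i + 1]
            code_points.append(0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00))
            i += 2
        else:
            # BMP code unit (or an unpaired surrogate, passed through as-is here).
            code_points.append(unit)
            i += 1
    return code_points


# 'A', U+1D11E (encoded as the pair D834 DD1E), 'B'
sample_units = [0x0041, 0xD834, 0xDD1E, 0x0042]
print([hex(cp) for cp in utf16_to_code_points(sample_units)])
```

With UTF-32 the loop disappears entirely, since every unit is a code point, which is the trade-off to weigh when surrogates are common in the data.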

    From a support-cost standpoint, there are valid reasons to use a mix of UTF
    formats.

    Carl


