RE: unicode format

From: Carl W. Brown (cbrown@xnetinc.com)
Date: Mon Feb 23 2004 - 12:19:28 EST

    Mark,

    Markus did a good job of describing the advantages of each. The problem I see is that some applications cannot process a BOM or convert between little-endian and big-endian UTF-16.

    Are there any browsers that support Unicode but will not do endian flips for UTF-16? I usually use UTF-8 to send data between systems just to make sure.
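
For illustration, here is a minimal Python sketch of the BOM processing and endian flip mentioned above. The function names decode_utf16 and swap_utf16_byte_order are hypothetical, and the big-endian fallback for data without a BOM is only an assumption; a real application would take the byte order from its protocol or metadata.

```python
import codecs


def decode_utf16(data: bytes, default: str = "utf-16-be") -> str:
    """Decode UTF-16 bytes, using a leading BOM (if any) to pick the byte order."""
    if data.startswith(codecs.BOM_UTF16_BE):      # bytes FE FF
        return data[2:].decode("utf-16-be")
    if data.startswith(codecs.BOM_UTF16_LE):      # bytes FF FE
        return data[2:].decode("utf-16-le")
    # No BOM: fall back to the caller's assumption (hypothetical default).
    return data.decode(default)


def swap_utf16_byte_order(data: bytes) -> bytes:
    """Flip every 16-bit code unit between little-endian and big-endian."""
    if len(data) % 2:
        raise ValueError("UTF-16 data must contain an even number of bytes")
    swapped = bytearray(len(data))
    swapped[0::2] = data[1::2]   # second byte of each unit moves first
    swapped[1::2] = data[0::2]   # first byte of each unit moves second
    return bytes(swapped)
```

An application that lacks both steps will misread UTF-16 written in the other byte order. UTF-8 has a single byte order, so neither step is needed there.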

    Carl

    > -----Original Message-----
    > From: unicode-bounce@unicode.org [mailto:unicode-bounce@unicode.org]
    > On Behalf Of Mark Davis
    > Sent: Monday, February 23, 2004 7:17 AM
    > To: steve; John Cowan
    > Cc: unicode@unicode.org
    > Subject: Re: unicode format
    >
    >
    > It is important to distinguish two cases: (a) which UTF one should emit
    > in web pages, (b) which UTF one should use for internal processing.
    > There is a tech note about this at http://www.unicode.org/notes/tn12/
    >
    > Mark
    > __________________________________
    > http://www.macchiato.com
    > ► शिष्यादिच्छेत्पराजयम् ◄
    >
    > ----- Original Message -----
    > From: "John Cowan" <cowan@ccil.org>
    > To: "steve" <steve@appliedlanguage.com>
    > Cc: <unicode@unicode.org>
    > Sent: Mon, 2004 Feb 23 04:50
    > Subject: Re: unicode format
    >
    >
    > > steve scripsit:
    > >
    > > > Could someone please clarify the difference between UTF8 and UTF16?
    > > > If it is possible to encode everything in UTF8 and it is more
    > > > efficient, what is the need for UTF16?
    > >
    > > The short version is that in UTF-8, characters can occupy 1, 2, 3, or
    > > (very rarely) 4 bytes; in UTF-16, characters can occupy 2 or (very
    > > rarely) 4 bytes. Either encoding can be used with any textual content.
    > >
    > > UTF-8 is typically more compact than UTF-16 for English and other
    > > Latin-alphabet languages, slightly more compact for Greek, Cyrillic,
    > > Armenian, Hebrew, and Arabic alphabets, and almost 50% less compact
    > > for everything else.
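
The byte counts quoted above can be checked with a short Python sketch. The helper name encoded_sizes and the sample strings are hypothetical, chosen only to illustrate each case; UTF-16 is measured without a BOM.

```python
def encoded_sizes(text: str) -> tuple[int, int]:
    """Return (UTF-8 byte count, UTF-16 byte count without a BOM) for text."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


# One character from each size class: (UTF-8 bytes, UTF-16 bytes)
print(encoded_sizes("A"))           # (1, 2)  ASCII letter
print(encoded_sizes("\u00e9"))      # (2, 2)  Latin letter with accent
print(encoded_sizes("\u20ac"))      # (3, 2)  euro sign
print(encoded_sizes("\U0001D11E"))  # (4, 4)  musical G clef, outside the BMP

# Short phrases: ASCII spaces and punctuation tip two-byte scripts toward UTF-8.
print(encoded_sizes("hello"))                           # (5, 10)
print(encoded_sizes("\u03b1\u03b2\u03b3 \u03b4"))       # (9, 10)  Greek letters plus a space
print(encoded_sizes("\u4e2d\u6587"))                    # (6, 4)   two CJK ideographs
```

Real text mixes these classes, so the actual ratio depends on how much ASCII (spaces, digits, markup) a document contains.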
    > >
    > > --
    > > John Cowan jcowan@reutershealth.com http://www.ccil.org/~cowan
    > > O beautiful for patriot's dream that sees beyond the years
    > > Thine alabaster cities gleam undimmed by human tears!
    > > America! America! God mend thine every flaw,
    > > Confirm thy soul in self-control, thy liberty in law!
    > > -- one of the verses not usually taught in U.S. schools


