Re: Unicode conformant character encodings and us-ascii

From: Peter_Constable@sil.org
Date: Fri May 16 2003 - 15:48:51 EDT

Next message: Rick McGowan: "Open public review items..."

Previous message: Kenneth Whistler: "Re: Unicode conformant character encodings and us-ascii"
In reply to: Stefan Persson: "Re: Unicode conformant character encodings and us-ascii"
Next in thread: Philippe Verdy: "Re: Unicode conformant character encodings and us-ascii"

Stefan Persson wrote on 05/16/2003 01:24:35 PM:

> > These might be considered encoding forms, and they might be able to encode
> > the Unicode coded character set, but I don't think these should be called
> > "Unicode encoding forms". There are exactly three Unicode encoding forms:
> > UTF-8, UTF-16 and UTF-32.
>
> Are not BE and LE regarded as different encoding forms, making five
> encoding forms (UTF-8, UTF-16BE, UTF-16LE, UTF-32BE & UTF-32LE)?

No, you are thinking of character encoding *schemes*, of which there are
seven: add to your list "UTF-16" and "UTF-32".
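
To make the count concrete, here is a minimal Python sketch that serializes
one character under each of the seven encoding schemes. The codec names on
the right are Python's own spellings, not Unicode's, and the helper name
serialize is just something I made up for illustration. Note that Python's
"utf-16" and "utf-32" codecs write a BOM and use the machine's native byte
order, so those two results vary by platform.

```python
# The seven Unicode character encoding schemes, keyed by the names used in
# the Unicode Standard. The values are Python codec names (Python's
# spelling, an assumption of this sketch, not part of the standard).
SCHEMES = {
    "UTF-8": "utf-8",
    "UTF-16": "utf-16",        # BOM + native byte order in Python
    "UTF-16BE": "utf-16-be",
    "UTF-16LE": "utf-16-le",
    "UTF-32": "utf-32",        # BOM + native byte order in Python
    "UTF-32BE": "utf-32-be",
    "UTF-32LE": "utf-32-le",
}


def serialize(text):
    """Return a dict mapping each scheme name to the bytes it produces."""
    return {name: text.encode(codec) for name, codec in SCHEMES.items()}


if __name__ == "__main__":
    # U+00E9 LATIN SMALL LETTER E WITH ACUTE
    for name, data in serialize("\u00e9").items():
        print(f"{name:9}", " ".join(f"{b:02x}" for b in data))
```

The three encoding forms (UTF-8, UTF-16, UTF-32) say which code units
represent a code point; the seven schemes above additionally pin down how
those code units become bytes.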

I'll echo Addison's recommendation: read UTR#17, which explains the
differences between the five levels of Unicode's character encoding model:

abstract character repertoire
coded character set
character encoding form
character encoding scheme
transfer encoding syntax
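
As a rough illustration of how one character passes through these levels,
here is a small Python sketch. It is only my own walk-through, not anything
taken from UTR#17: the function names are hypothetical, UTF-16 and UTF-16BE
are picked arbitrarily as the encoding form and scheme, and base64 stands in
for the transfer encoding syntax.

```python
import base64


def utf16_code_units(cp):
    """Encoding form: map one code point to UTF-16 code units (16-bit ints)."""
    if not 0 <= cp <= 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not a Unicode scalar value")
    if cp < 0x10000:
        return [cp]
    v = cp - 0x10000
    # Supplementary code points become a high/low surrogate pair.
    return [0xD800 + (v >> 10), 0xDC00 + (v & 0x3FF)]


def utf16be_bytes(units):
    """Encoding scheme: serialize code units as bytes, big-endian, no BOM."""
    return b"".join(u.to_bytes(2, "big") for u in units)


def walk_levels(char):
    """Return what a single character looks like at each level."""
    cp = ord(char)                    # coded character set: an integer
    units = utf16_code_units(cp)      # character encoding form
    data = utf16be_bytes(units)       # character encoding scheme
    wire = base64.b64encode(data)     # transfer encoding syntax
    return cp, units, data, wire


if __name__ == "__main__":
    # U+10400 DESERET CAPITAL LETTER LONG I, outside the BMP
    cp, units, data, wire = walk_levels("\U00010400")
    print(f"code point  U+{cp:04X}")
    print("code units ", [f"{u:04X}" for u in units])
    print("bytes      ", data.hex())
    print("base64     ", wire.decode("ascii"))
```

The abstract character repertoire has no representation in the code at all:
it is just the set of characters one agrees to encode, before any numbers
are assigned.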

People might also look at Chapter 3 of TUS4.0, the final draft of which is
online at http://www.unicode.org/book/preview/ch03.pdf. In particular,
"encoding form" is defined as D29, "encoding scheme" is defined as D38, and
the specific encoding forms and schemes *defined by Unicode* (take note,
Philippe) are defined in the surrounding pages.

- Peter

---------------------------------------------------------------------------
Peter Constable

Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485

This archive was generated by hypermail 2.1.5 : Fri May 16 2003 - 16:37:27 EDT