Re: Unicode conformant character encodings and us-ascii

From: Peter_Constable@sil.org
Date: Fri May 16 2003 - 13:43:24 EDT

    Philippe Verdy wrote on 05/15/2003 11:08:19 AM:

    > Don't forget other Unicode encoding forms: UTF-7, BOCU and SCSU...

    These might be considered encoding forms, and they might be able to encode
    the Unicode coded character set, but I don't think these should be called
    "Unicode encoding forms". There are exactly three Unicode encoding forms:
    UTF-8, UTF-16 and UTF-32.
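
A rough sketch of how the three encoding forms differ, assuming Python 3 (the helper name `code_units` is made up for illustration and is not from the Standard): each form maps the same code point to a sequence of code units of a fixed width (8, 16 or 32 bits).

```python
def code_units(text):
    """Return the code units of `text` in each of the three encoding forms.

    Hypothetical helper for illustration only. The UTF-16 units are recovered
    by encoding big-endian and regrouping the bytes in pairs, which is just a
    convenient way to get at the 16-bit values in Python.
    """
    utf8 = list(text.encode("utf-8"))  # 8-bit code units
    b16 = text.encode("utf-16-be")
    utf16 = [int.from_bytes(b16[i:i + 2], "big") for i in range(0, len(b16), 2)]
    utf32 = [ord(ch) for ch in text]  # one 32-bit unit per code point
    return {"UTF-8": utf8, "UTF-16": utf16, "UTF-32": utf32}


if __name__ == "__main__":
    # U+10400 lies outside the BMP, so UTF-16 needs a surrogate pair.
    for form, units in code_units("\U00010400").items():
        print(form, [hex(u) for u in units])
```

For U+10400 this gives four 8-bit units in UTF-8, two 16-bit units in UTF-16 and a single 32-bit unit in UTF-32.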

    > Unicode only defines codepoints, not their serialization into code
    > units and not technical aspect such as byte order

    Not true. Code units, byte order and serialization are not relevant to
    the Unicode coded character set itself, but the Unicode Standard does
    specify three encoding forms and seven encoding schemes, and those
    specifications most definitely define these things.
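
To make the byte-order point concrete, here is a small sketch, assuming Python 3. The seven encoding schemes are UTF-8, UTF-16, UTF-16BE, UTF-16LE, UTF-32, UTF-32BE and UTF-32LE; the lowercase names below are Python's codec labels for them, and `serialize` is a hypothetical helper. Note that Python's unmarked `utf-16` and `utf-32` encoders write a byte order mark in the machine's native order, which is a property of Python's codecs rather than something stated in this message.

```python
# Python codec labels for the seven Unicode encoding schemes.
SCHEMES = [
    "utf-8",
    "utf-16", "utf-16-be", "utf-16-le",
    "utf-32", "utf-32-be", "utf-32-le",
]


def serialize(text):
    """Map each scheme name to the hex string of the bytes it produces."""
    return {name: text.encode(name).hex() for name in SCHEMES}


if __name__ == "__main__":
    # Same code point (U+0041), different byte sequences per scheme.
    for name, hex_bytes in serialize("A").items():
        print(f"{name:10} {hex_bytes}")
```

The BE and LE variants differ only in the order of the bytes within each code unit, which is exactly the serialization detail that an encoding scheme pins down.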

    - Peter

    ---------------------------------------------------------------------------
    Peter Constable

    Non-Roman Script Initiative, SIL International
    7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
    Tel: +1 972 708 7485


