Re: QBCS

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Tue Sep 02 2003 - 19:04:38 EDT


    From: "Asmus Freytag" <asmusf@ix.netcom.com>
    > At 08:26 PM 9/1/03 -0700, Doug Ewell wrote:
    > >Tex Texin <tex at i18nguy dot com> wrote:
    > >
    > > > In most industry usages, MBCS refers to variable width encodings,
    > > > not fixed width.
    > >
    > >Well, if variable-width encodings are referred to as both DBCS (see,
    > >for example, http://czyborra.com/charsets/cjk.html#dbcs) and MBCS,
    > >then what term is used to describe a fixed-width encoding of more
    > >than 1 byte? Or was the concept not common enough to warrant a name
    > >until Unicode?
    >
    > The most common 'pure' DBCS was encountered in mainframe environments.
    > All the other platforms used 'mixed' single and double-byte or other
    > variable length encodings, so that 'DBCS' could stand in for a
    > variable length encoding with maximum length 2 without confusion
    > (except when talking to mainframe people).

    In the late 80's, the acronym DBCS was also used for user-defined
    characters: characters that could be assigned in a codepage, defined
    by a transferable bitmap, and accessed through an encoding sequence
    that remapped the upper half of the 8-bit character set.

    In a 7-bit environment, these 8-bit "characters" (in fact relative
    positions in a 7-bit codepage) could be accessed using control sequences
    (like SS2, which shifts temporarily into the upper subset for the next
    character only). Because the characters assigned in the selected
    codepage for the upper half of the 8-bit encoding were accessed with at
    least two 7-bit bytes, they were called "double-byte characters", and
    the general encoding scheme was called "DBCS".
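
    As a rough illustration of the single-shift mechanism described above,
    here is a minimal Python sketch. It assumes a single-byte SS2 control at
    0x19 (where some Videotex-style C0 sets place it; an ISO-2022 7-bit
    stream may use a longer escape sequence instead), and it maps the
    shifted byte into the upper half simply by setting the high bit. The
    function name and the "lower"/"upper" labels are hypothetical.

```python
SS2 = 0x19  # assumption: single-byte single-shift control code


def decode_units(data: bytes):
    """Split a 7-bit stream into (subset, code, byte_length) units.

    A byte preceded by SS2 is taken from the upper subset for that one
    character only, so it occupies two 7-bit bytes in the stream.
    """
    units = []
    i = 0
    while i < len(data):
        b = data[i]
        if b > 0x7F:
            raise ValueError("not a 7-bit stream")
        if b == SS2:
            if i + 1 >= len(data):
                raise ValueError("SS2 at end of stream")
            nxt = data[i + 1]
            if nxt > 0x7F:
                raise ValueError("not a 7-bit stream")
            # Relative 7-bit position mapped into the upper half (toy mapping).
            units.append(("upper", nxt | 0x80, 2))
            i += 2
        else:
            units.append(("lower", b, 1))
            i += 1
    return units
```

    For the stream 0x41 0x19 0x41 0x42, the middle character takes two
    bytes while its neighbours take one, which is the variable-length
    behaviour that the "DBCS" label covered here.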

    This inspired the ISO-2022 standards for East-Asian languages, but
    also the European Teletext and Videotext standards, which were at the
    time restricted to a 7-bit encoding scheme. These systems are still used
    today. But in any case the "DBCS" usage referred to a complex encoding
    scheme with variable-length characters (the length sometimes varying
    with the encoding context or exceeding the 2-byte limit). References to
    these character sets often also mention the special escape sequences
    used to define and transport the bitmaps needed to represent
    "user-defined" characters (defined notably to support Japanese or
    Chinese in the late 80's, or to create custom graphic characters, in
    fact bitmap glyphs, within interactive documents or applications).


