Re: DBCS and Unicode 3.1

From: Markus Scherer (markus.scherer@jtcsv.com)
Date: Mon Feb 17 2003 - 19:07:12 EST


    Michael (michka) Kaplan wrote:
    > Well, DBCS means "double byte character set" and thus it is always two
    > bytes. But it's a theoretical definition since there are no actual DBCS
    > code pages -- all of the ones that exist are MBCS (multibyte character
    > set) since they support both one-byte and two-byte characters.

    More or less. There are systems that use pure DBCS, for example in databases, to get a fixed-width
    encoding. Usually those DBCS codepages are the double-byte portions of some MBCS codepages though.
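
To illustrate why a fixed-width encoding is attractive there, here is a minimal sketch in Python. The function names are made up for illustration, and Shift-JIS is only my choice of a familiar mixed single/double-byte codepage; it is not taken from any particular database product. With pure DBCS the n-th character sits at byte offset 2*n, while a mixed codepage has to be scanned from the start.

```python
def dbcs_char_at(buf: bytes, index: int) -> bytes:
    """Return the index-th character of a pure (fixed-width) DBCS buffer."""
    if len(buf) % 2 != 0:
        raise ValueError("pure DBCS data must have an even byte length")
    start = 2 * index  # every character is exactly two bytes
    if not 0 <= start < len(buf):
        raise IndexError("character index out of range")
    return buf[start:start + 2]


def sjis_char_len(lead: int) -> int:
    """Length of a Shift-JIS character, judged from its lead byte."""
    # 0x81-0x9F and 0xE0-0xFC start a double-byte character;
    # everything else is treated as a single byte here.
    if 0x81 <= lead <= 0x9F or 0xE0 <= lead <= 0xFC:
        return 2
    return 1


def sjis_char_at(buf: bytes, index: int) -> bytes:
    """Return the index-th character of a mixed-width Shift-JIS buffer."""
    pos = 0
    for _ in range(index):  # no shortcut: walk over every earlier character
        if pos >= len(buf):
            raise IndexError("character index out of range")
        pos += sjis_char_len(buf[pos])
    if pos >= len(buf):
        raise IndexError("character index out of range")
    return buf[pos:pos + sjis_char_len(buf[pos])]
```

The first function is constant-time; the second is linear in the index, which is the cost that the pure-DBCS systems avoid.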

    > There are standards like the Chinese GB18030 which supports characters
    > of 1, 2, or 4 bytes -- definitely MBCS again.

    Other examples: EUC-JP (1, 2, or 3 bytes per character) and EUC-TW (1, 2, or 4 bytes per character)
    are quite "old" (much older than GB 18030).
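
As a rough illustration of the variable lengths, the sketch below splits an EUC-JP byte string into characters by looking at each lead byte, and then asks Python's built-in gb18030 codec how many bytes a few characters take. The function names are hypothetical, and the splitter does only minimal validation (it does not check trail bytes).

```python
from typing import List


def euc_jp_char_len(lead: int) -> int:
    """Number of bytes in an EUC-JP character, judged from its lead byte."""
    if lead < 0x80:
        return 1            # ASCII
    if lead == 0x8E:
        return 2            # SS2 + one byte (half-width katakana)
    if lead == 0x8F:
        return 3            # SS3 + two bytes (JIS X 0212)
    if 0xA1 <= lead <= 0xFE:
        return 2            # JIS X 0208
    raise ValueError("invalid EUC-JP lead byte: 0x%02X" % lead)


def split_euc_jp(data: bytes) -> List[bytes]:
    """Split EUC-JP data into per-character byte strings."""
    chars = []
    pos = 0
    while pos < len(data):
        n = euc_jp_char_len(data[pos])
        if pos + n > len(data):
            raise ValueError("truncated character at offset %d" % pos)
        chars.append(data[pos:pos + n])
        pos += n
    return chars


# One character of each length class: ASCII, JIS X 0208, SS2, SS3.
sample = b"A" + b"\xa4\xa2" + b"\x8e\xb1" + b"\x8f\xb0\xa1"
euc_jp_lengths = [len(c) for c in split_euc_jp(sample)]

# GB 18030 via the standard codec: ASCII, a common hanzi, a supplementary-plane character.
gb18030_lengths = [len(ch.encode("gb18030")) for ch in ("A", "\u4e2d", "\U00020000")]
```

Here `euc_jp_lengths` comes out as 1, 2, 2, 3 and `gb18030_lengths` as 1, 2, 4.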

    > But these code pages are generally owned by outside
    > governments/agencies, so there is no rule that they need to update
    > when Unicode does.

    Right. No one codepage _has_ to be upgraded just because another one is.

    markus

    -- 
    Opinions expressed here may not reflect my company's positions unless otherwise noted.
    

