Re: _Unicode_code_page_and_?.net from Steffen on 2013-07-30 (Unicode Mail List Archive)

From: Steffen <sdaoden_at_gmail.com>
Date: Tue, 30 Jul 2013 22:50:52 +0200

Asmus Freytag <asmusf_at_ix.netcom.com> wrote:
|On 7/30/2013 12:26 PM, Doug Ewell wrote:
|> Buck Golemon <buck at yelp dot com> replied to Richard Wordingham
|> <richard dot wordingham at ntlworld dot com>:
|>
|>>>> There are no Unicode code pages.
|>>> Just to be pedantic, there are several on Windows. They encode the
|>>> coding form (Unicode codes being best thought of as an assignment of
[…]
|> Most Windows .NET developers who are concerned about proper character
|> handling would know this information existed, though they might not have
|> the numbers memorized.
|>
|> Jukka was right, though: Unicode itself does not have code pages.
|> Rather, at least one vendor has defined some of the Unicode encoding
|> schemes as if they were code pages. A code page is not, in general, the
|> same as an encoding scheme.
|What is, then, the proper definition of a "code page"?
|
|When Unicode was first introduced, it was seen as the one thing that
|wasn't a "code page", because of the way the Win32 API associated one
|of the traditional code pages with Unicode (giving rise to the "A" and
|"W" versions of all the APIs).
|
|Later, it was realized that in order to specify what encoding data were
|in or, for example, to specify a conversion from UTF-7 and UTF-8 to
|UTF-16 (native encoding scheme) one needed some suitable ID number to
|identify the mapping. Well, extending the code page id was the most
|natural way to do that, because, on several platforms, the use of a
|numerical ID from the IBM code page registry was established practice.

IANA, however, records "MIBenum" values for that purpose:

The MIBenum value is a unique value for use in MIBs to identify
coded character sets.

See [1] (but also RFC 2278, section 3.7).

[1] <http://www.iana.org/assignments/character-sets>

MIBenum value ranges:

  Reserved
    0 - 2
  Set By Standards Organizations
    3 - 999
  Unicode / ISO 10646
    1000 - 1999
  Vendor
    2000 - 2999
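
To make the ranges concrete, here is a minimal Python sketch that maps
a MIBenum value to its range. The table is copied from the list above;
the names MIBENUM_RANGES and mibenum_category are made up for this
illustration and are not part of any IANA or library API.

```python
# Hypothetical helper: classify a MIBenum value by the ranges listed above.
# The (low, high, label) rows mirror the list; both bounds are inclusive.
MIBENUM_RANGES = [
    (0, 2, "Reserved"),
    (3, 999, "Set By Standards Organizations"),
    (1000, 1999, "Unicode / ISO 10646"),
    (2000, 2999, "Vendor"),
]


def mibenum_category(value):
    """Return the range label for a MIBenum value, or None if outside all ranges."""
    for low, high, label in MIBENUM_RANGES:
        if low <= value <= high:
            return label
    return None
```

If I read the registry correctly, UTF-8 is registered as 106, UTF-16 as
1015 and windows-1252 as 2252, so mibenum_category() would place them
in the standards, Unicode and vendor ranges respectively.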

|A./

--steffen
Received on Tue Jul 30 2013 - 15:52:40 CDT