Re: Is the binaryness/textness of a data format a property?

From: Markus Scherer via Unicode <unicode_at_unicode.org>
Date: Sun, 22 Mar 2020 11:56:52 -0700

On Sat, Mar 21, 2020 at 12:35 PM Doug Ewell via Unicode <unicode_at_unicode.org>
wrote:

> I thought the whole premise of GB18030 was that it was Unicode mapped into
> a GB2312 framework. What characters exist in GB18030 that don't exist in
> Unicode, and have they been proposed for Unicode yet, and why was none of
> the PUA space considered appropriate for that in the meantime?

My memory of GB18030 is that its code space has 1.6M code points, of which
1.1M are a permutation of Unicode. For the rest you would have to go beyond
the Unicode code space for 1:1 round-trip mappings.
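For anyone who wants to check those rough figures, here is a back-of-the-envelope sketch in Python. It assumes the commonly documented GB18030 byte ranges (single bytes 0x00..0x7F; two-byte sequences with lead 0x81..0xFE and trail 0x40..0x7E or 0x80..0xFE; four-byte sequences with bytes 1 and 3 in 0x81..0xFE and bytes 2 and 4 in 0x30..0x39). Those ranges and the variable names are my assumptions, not something stated in this thread, and this only counts byte sequences, not assigned characters.

```python
# Rough count of the GB18030 code space versus the Unicode code space.
# The byte ranges below are assumed from the usual description of GB18030.

LEAD = 0xFE - 0x81 + 1                              # 126 lead bytes, 0x81..0xFE
TRAIL_2 = (0x7E - 0x40 + 1) + (0xFE - 0x80 + 1)     # 63 + 127 = 190 two-byte trail bytes
DIGIT = 0x39 - 0x30 + 1                             # 10 values, 0x30..0x39

single_byte = 0x7F - 0x00 + 1                       # 128 single-byte codes
two_byte = LEAD * TRAIL_2                           # lead byte followed by trail byte
four_byte = LEAD * DIGIT * LEAD * DIGIT             # lead, digit, lead, digit

gb18030_total = single_byte + two_byte + four_byte
unicode_total = 0x10FFFF + 1                        # code points U+0000..U+10FFFF
beyond_unicode = gb18030_total - unicode_total      # sequences with no Unicode counterpart

print(f"GB18030 code space: {gb18030_total:,}")
print(f"Unicode code space: {unicode_total:,}")
print(f"Left over:          {beyond_unicode:,}")
```

Under those assumptions the totals come out near 1.6M and 1.1M, with roughly half a million GB18030 sequences that have nothing to map to inside the Unicode code space.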

Just please don't call it UTF-8.

markus
Received on Sun Mar 22 2020 - 13:57:36 CDT
