Re: Unicode String Models

From: Henri Sivonen via Unicode <unicode_at_unicode.org>
Date: Thu, 13 Sep 2018 08:08:19 +0300

On Wed, Sep 12, 2018 at 11:37 AM Hans Åberg via Unicode
<unicode_at_unicode.org> wrote:
> The idea is to extend Unicode itself, so that those bytes can be represented by legal codepoints.

Extending Unicode itself would likely create more problems than it
would solve. Extending the value space of Unicode scalar values would
be extremely disruptive for systems whose design is deeply committed
to the current definitions of UTF-16 and UTF-8 staying unchanged.
Assigning scalar values within the current Unicode scalar value space
to currently malformed bytes would lose information: a reader could
no longer tell whether such a scalar value came from malformed bytes
or from the well-formed encoding of that same scalar value.
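
To make the information-loss point concrete, here is a minimal Python
sketch. The mapping it uses (malformed byte b becomes the Private Use
Area scalar value U+F700 + b) and the handler name "pua_escape_demo"
are assumptions made up for illustration; nothing in this thread
proposes that particular mapping.

```python
import codecs

# Hypothetical decode error handler: map each malformed byte b to the
# scalar value U+F700 + b. The mapping is an assumption for this demo.
def _pua_escape(exc):
    if not isinstance(exc, UnicodeDecodeError):
        raise exc
    bad = exc.object[exc.start:exc.end]
    return "".join(chr(0xF700 + b) for b in bad), exc.end

codecs.register_error("pua_escape_demo", _pua_escape)

malformed = b"\xff"                     # never valid in UTF-8
wellformed = "\uf7ff".encode("utf-8")   # valid encoding of U+F7FF

from_malformed = malformed.decode("utf-8", "pua_escape_demo")
from_wellformed = wellformed.decode("utf-8", "pua_escape_demo")

# Two different byte inputs now decode to the same string, so the
# original bytes can no longer be recovered from the decoded text.
collision = from_malformed == from_wellformed
```

Here collision ends up true: once decoded, U+F7FF carries no trace of
which of the two inputs it came from.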

It seems better to let applications whose use cases involve
representing non-Unicode values use a special-purpose extension of
their own.
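
One existing example of such an application-level extension, not
mentioned in this thread, is Python's "surrogateescape" error handler
(PEP 383). It represents each undecodable byte as a lone surrogate in
the range U+DC80..U+DCFF. Lone surrogates are not Unicode scalar
values, so they cannot collide with well-formed input, and the strict
UTF-8 encoder refuses to emit them. A short sketch:

```python
raw = b"caf\xe9"  # Latin-1 bytes; malformed when read as UTF-8

# Decode, smuggling the bad byte through as the lone surrogate U+DCE9.
text = raw.decode("utf-8", "surrogateescape")

# The same handler restores the original bytes exactly on encoding.
round_trip = text.encode("utf-8", "surrogateescape")

# A strict UTF-8 encoder rejects the string, because a lone surrogate
# is not a Unicode scalar value.
try:
    text.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False
```

The extension stays inside the application: the round trip is
lossless, and the non-Unicode values do not leak out as well-formed
UTF-8.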

-- 
Henri Sivonen
hsivonen_at_hsivonen.fi
https://hsivonen.fi/
Received on Thu Sep 13 2018 - 00:08:57 CDT
