Re: Split a UTF-8 multi-octet sequence such that it cannot be unambiguously restored?

From: Doug Ewell via Unicode <unicode_at_unicode.org>
Date: Mon, 24 Jul 2017 15:35:43 -0700

J Decker wrote:

> I generally accepted any utf-8 encoding up to 31 bits though ( since
> I was going from the original spec, and not what was effective limit
> based on unicode codepoint space)

Hey, everybody: Don't do that.

UTF-8 has been constrained to the Unicode code space (maximum U+10FFFF,
four bytes) since RFC 3629 in November 2003, almost fourteen years now.
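
For anyone who wants to see what that constraint means in a decoder,
here is a minimal sketch in Python. It is only an illustration and not
code from this thread: the function name decode_scalar is made up, and
it assumes the caller passes exactly one encoded sequence. It accepts
only the one- to four-byte forms and rejects overlong encodings,
surrogates, and anything above U+10FFFF, including the old five- and
six-byte lead bytes.

```python
def decode_scalar(seq: bytes) -> int:
    """Decode exactly one UTF-8 sequence (hypothetical helper).

    Returns the code point, or raises ValueError if the bytes are not a
    well-formed sequence under the U+10FFFF / four-byte limit.
    """
    if not seq:
        raise ValueError("empty input")

    lead = seq[0]
    if lead < 0x80:
        length, cp, minimum = 1, lead, 0x0000
    elif 0xC2 <= lead <= 0xDF:
        length, cp, minimum = 2, lead & 0x1F, 0x0080
    elif 0xE0 <= lead <= 0xEF:
        length, cp, minimum = 3, lead & 0x0F, 0x0800
    elif 0xF0 <= lead <= 0xF4:
        length, cp, minimum = 4, lead & 0x07, 0x10000
    else:
        # 0x80..0xBF: stray continuation byte
        # 0xC0, 0xC1: can only start an overlong two-byte form
        # 0xF5..0xFF: would exceed U+10FFFF, or is an old 5/6-byte lead
        raise ValueError(f"invalid lead byte 0x{lead:02X}")

    if len(seq) != length:
        raise ValueError(f"expected {length} bytes, got {len(seq)}")

    # Fold in the six payload bits of each continuation byte (10xxxxxx).
    for b in seq[1:]:
        if b & 0xC0 != 0x80:
            raise ValueError(f"invalid continuation byte 0x{b:02X}")
        cp = (cp << 6) | (b & 0x3F)

    if cp < minimum:
        raise ValueError("overlong encoding")
    if 0xD800 <= cp <= 0xDFFF:
        raise ValueError("surrogate code point")
    if cp > 0x10FFFF:
        raise ValueError("beyond U+10FFFF")
    return cp
```

With this sketch, decode_scalar(b"\xf4\x8f\xbf\xbf") returns 0x10FFFF,
while b"\xf4\x90\x80\x80" (which would be U+110000) and the five-byte
form b"\xf8\x88\x80\x80\x80" both raise ValueError. Python's built-in
bytes.decode("utf-8") rejects those two inputs as well, with
UnicodeDecodeError.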

--
Doug Ewell | Thornton, CO, US | ewellic.org

Received on Mon Jul 24 2017 - 17:36:38 CDT
