Running out of code points, redux (was: Re: Feedback on the proposal...)
Richard Wordingham via Unicode
unicode at unicode.org
Thu Jun 1 16:39:12 CDT 2017
On Thu, 01 Jun 2017 12:54:45 -0700
Doug Ewell via Unicode <unicode at unicode.org> wrote:
> Richard Wordingham wrote:
> > even supporting 6-byte patterns just in case 20.1 bits eventually
> > turn out not to be enough,
> Oh, gosh, here we go with this.
You were implicitly invited to argue that there was no need to handle
invalid 5-byte and 6-byte sequences.
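For readers who want the numbers behind "20.1 bits" and the 5-byte and
6-byte patterns, here is a minimal sketch. It assumes the original
1-to-6-byte UTF-8 bit patterns (the ones later restricted to 4 bytes).
The function names are made up for illustration and are not taken from
any library. It only classifies lead bytes by bit pattern; it does not
check well-formedness (0xC0, 0xC1 and 0xF5 upwards are never valid lead
bytes in current UTF-8, whatever length their pattern implies).

```python
import math


def utf8_pattern_length(lead: int) -> int:
    """Length implied by a lead byte under the original 1-to-6-byte
    UTF-8 bit patterns, or 0 if the byte cannot start a sequence
    (a continuation byte, 0xFE or 0xFF). Illustrative helper only."""
    if lead < 0x80:
        return 1   # 0xxxxxxx
    if lead < 0xC0:
        return 0   # 10xxxxxx: continuation byte, not a lead byte
    if lead < 0xE0:
        return 2   # 110xxxxx
    if lead < 0xF0:
        return 3   # 1110xxxx
    if lead < 0xF8:
        return 4   # 11110xxx
    if lead < 0xFC:
        return 5   # 111110xx: pattern exists, never valid today
    if lead < 0xFE:
        return 6   # 1111110x: pattern exists, never valid today
    return 0       # 0xFE and 0xFF have no pattern in this scheme


def payload_bits(length: int) -> int:
    """Number of code point bits an n-byte pattern can carry."""
    if length == 1:
        return 7
    # The lead byte keeps (7 - length) bits; each trailing byte adds 6.
    return (7 - length) + 6 * (length - 1)


# Bits needed to number every code point from U+0000 to U+10FFFF.
bits_needed = math.log2(0x110000)

if __name__ == "__main__":
    print(f"bits needed for U+0000..U+10FFFF: {bits_needed:.2f}")
    for n in range(1, 7):
        print(f"{n}-byte pattern carries {payload_bits(n)} bits")
```

The 4-byte pattern carries 21 bits, which covers the 20.1 bits in use
now, and the 6-byte pattern carries 31 bits. A decoder that recognises
the 5-byte and 6-byte lead bytes can still reject them; it merely knows
how long the rejected sequence claims to be.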
> What will we do if 31 bits turn out not to be enough?
A compatible extension of UTF-16 to unbounded length has already been
designed. Prefix bytes 0xFF can be used to extend the length for UTF-8
by 8 bytes at a time. Extending UTF-32 is not beyond the wit of man,
and we know that UTF-16 could have been done better if the need had
been foreseen.
While it seems natural to hold a Unicode scalar value in a single
machine word of some length, this is not necessary, just highly
convenient.
In short, it won't be a big problem intrinsically. The UCD may get a
bit unwieldy, which may be a problem for small systems without Internet
access.