From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Sat Feb 26 2011 - 13:09:27 CST
I did not describe multiple bases there. BASE is a single integer variable.
No "BASE*" is defined there for the 1-byte encoding in the range 0x00..0x7F.
The use of other bases is possible as an extension (I described it
later when introducing BASE2 as a possible extension for the 2-byte
encoding).
> When a byte starting 11 is used in isolation, why is it represented as 11.yyxxxx please?
>
> Is it because there are four possible values of BASE, namely BASE[0], BASE[1], BASE[2] and BASE[3]?
>
> If BASE has a non-negative value less than 0x80, could that value of BASE be used to signal accessing a decoding tree so that the most common codepoints in the text from beyond the range U+0000 to U+007F could be represented using a single byte starting with 11? The contents of the decoding tree could be dynamically altered using switching codes.
>
> If the idea of four values for BASE, in BASE[0], BASE[1], BASE[2] and BASE[3], is used, then access to a decoding tree would be possible simultaneously with one-byte access to a contiguous block of other Unicode characters if so desired, though if BASE[0], BASE[1], BASE[2] and BASE[3] are used the range of possible values of BASE would need to be 17 bits.
>
> For example, at some particular time in some particular application of the format, BASE[0] might have a value of 0x00 and BASE[1] might have a value of 0x100.