Re: Origins of ẘ

From: Asmus Freytag
Date: Sun, 15 Apr 2012 22:04:26 -0700

On 4/15/2012 7:30 PM, Rick McGowan wrote:
> > At Wiktionary, we're looking at ẘ (U+1E98) and
> > we can't figure out where it came from.
> Good catch. It's obviously another stowaway...
> Just throw it in the brig until we can get around to deporting it.
The 1E00 and 1F00 blocks were populated in Unicode 1.1 by rejects from
Unicode 1.0 that were re-admitted as part of the merger with ISO/IEC
10646. If you know anyone with access to the early (paper-only) WG2
meeting documents, you might, just might, find a source for them.

Most of these characters were "rejected" because they were unnecessary:
they are easily encoded as combining sequences, and there were no legacy
character sets that needed them precomposed for 1:1 round-trip
compatibility. WG2 and Unicode (before the merger) had different
standards for which compatibility characters were required.
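
To make the "combining sequence" point concrete, here is a minimal
sketch using Python's standard unicodedata module. The helper name
canonical_sequence is only illustrative, and the output depends on the
Unicode data bundled with your Python build.

```python
import unicodedata

# U+1E98, the precomposed character under discussion.
PRECOMPOSED = "\u1e98"


def canonical_sequence(ch: str) -> list[str]:
    """Return the code points of the NFD form of ch as U+XXXX strings."""
    return [f"U+{ord(c):04X}" for c in unicodedata.normalize("NFD", ch)]


# Character name as recorded in the Unicode Character Database.
print(unicodedata.name(PRECOMPOSED))

# Base letter followed by the combining mark.
print(canonical_sequence(PRECOMPOSED))

# The precomposed form and "w" + U+030A are canonically equivalent.
print(unicodedata.normalize("NFD", PRECOMPOSED) == "w\u030a")
```

The decomposed form is a plain "w" followed by U+030A COMBINING RING
ABOVE, which is the sense in which the precomposed character adds
nothing new.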

(There were some gaps in these blocks after the initial population of
characters was added in Unicode 1.1. These were later filled with more
solid candidates, so the "age" of each character is an important clue here.)

Stowaway is an apt term: because the characters did not add anything
new (they could already be encoded as combining sequences) and because
normalization would remove them from the data stream, nobody tried very
hard to fine-tune the set and thereby risk the failure of the merger.
Ideal conditions for "stowaways" to slip in, hidden in the crowd.

Received on Mon Apr 16 2012 - 00:09:47 CDT