Re: Towards a classification system for uses of the Private Use Area

From: Mark Davis (mark@macchiato.com)
Date: Fri Apr 26 2002 - 12:51:34 EDT


I agree that elaborate PUA schemes will not see any public acceptance.
The most practical use we've seen is as a holding area for characters
transcoded from Asian character sets that are not in the Unicode
Standard (they *may* end up in it in the future, or not, or may be
represented as variation sequences; see Unicode 3.2).

That mechanism does allow for round-tripping (at least internally, and
*if* the text is not mixed with PUA characters from other sets). Not
an ideal solution, but one that meets certain needs. See
http://www.unicode.org/unicode/reports/tr22/#Completeness
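
As a rough illustration of the holding-area idea, here is a minimal
Python sketch. It is not taken from TR22 or from any real converter;
the class name PuaHoldingArea, the legacy code values, and the
first-come allocation policy are all assumptions made for the example.
Legacy codes with no Unicode mapping are parked on BMP private-use
code points, and a reverse table restores them on the way back out.

```python
# Hypothetical sketch: park unmapped legacy codes in the BMP Private Use Area.
PUA_START = 0xE000  # first BMP private-use code point
PUA_END = 0xF8FF    # last BMP private-use code point


class PuaHoldingArea:
    def __init__(self, known):
        # known: legacy code -> Unicode character (the "real" mappings)
        self.known = dict(known)
        self.reverse_known = {ch: code for code, ch in self.known.items()}
        self.to_pua = {}    # unmapped legacy code -> PUA character
        self.from_pua = {}  # PUA character -> unmapped legacy code
        self.next_free = PUA_START

    def _hold(self, code):
        # Allocate the next free PUA code point the first time a code is seen.
        if code not in self.to_pua:
            if self.next_free > PUA_END:
                raise OverflowError("BMP Private Use Area exhausted")
            ch = chr(self.next_free)
            self.next_free += 1
            self.to_pua[code] = ch
            self.from_pua[ch] = code
        return self.to_pua[code]

    def decode(self, legacy_codes):
        # Legacy codes -> Unicode string, using the PUA for unmapped codes.
        return "".join(
            self.known[c] if c in self.known else self._hold(c)
            for c in legacy_codes
        )

    def encode(self, text):
        # Unicode string -> legacy codes, restoring held characters.
        return [
            self.from_pua[ch] if ch in self.from_pua else self.reverse_known[ch]
            for ch in text
        ]


# Made-up legacy codes; 0xF040 stands for a character with no Unicode mapping.
known = {0x8140: "\u3000", 0x88EA: "\u4E00"}
area = PuaHoldingArea(known)
codes = [0x88EA, 0xF040, 0x8140, 0xF040]
text = area.decode(codes)
round_tripped = area.encode(text)
```

The round trip only works against the same table. A second
PuaHoldingArea fed a different character set would hand out U+E000 for
some unrelated character, which is the "mixed with PUA characters from
other sets" caveat above.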

Mark
—————

Γνῶθι σαυτόν — Θαλῆς
[For transliteration, see http://oss.software.ibm.com/cgi-bin/icu/tr]

http://www.macchiato.com

----- Original Message -----
From: "Doug Ewell" <dewell@adelphia.net>
To: <unicode@unicode.org>
Cc: "Michael Everson" <everson@evertype.com>; "William Overington"
<WOverington@ngo.globalnet.co.uk>
Sent: Friday, April 26, 2002 08:50
Subject: Re: Towards a classification system for uses of the Private
Use Area

> Michael Everson <everson@evertype.com> wrote:
>
> > The Private Use Area is not to be classified. Anyone anywhere can use
> > any of its code points for anything.
>
> Furthermore, even if William's scheme is intended to be a semi-private
> "convention" rather than an official part of Unicode -- much like
> Michael's (and John Cowan's) own ConScript Unicode Registry -- it seems
> unlikely that an elaborate indexing scheme such as the one William
> proposes would gain much of a following. Vendors have only recently
> started to implement surrogates properly, and still balk at decoding
> SCSU (which is easy; it's the encoding part that gets complex). And
> these official Unicode mechanisms are simple compared to William's
> "hex point" indexing scheme.
>
> Additionally, it is VERY important to repeat -- probably more important
> than anything else in this discussion -- that there is no automatic path
> to "promotion" of any private-use character to full Unicode status.
> Every character and script that is encoded in Unicode must undergo the
> same scrutiny, regardless of how (or whether) it may have been encoded
> in the past. That goes for Deseret and Shavian (accepted) as well as
> Klingon and Aiha (not accepted), all of which were encoded in ConScript
> but none of which were automatically "promoted" on that basis alone.
>
> There are two full Private Use planes, 131,068 code points in all (not
> counting the four noncharacters), certainly enough for any private-use
> implementation that would be envisioned as benefiting from William's
> proposal, and a lot easier to implement (and thus more likely to be
> used). My suggestion to William is that if he envisions a potentially
> widespread use for the PUA, he may consider creating a ConScript-like
> registry for the upper planes. That would be just as effective and much
> simpler.
>
> Apart from font vendors who use the PUA for presentation forms within
> the font, what current practices exist for using the PUA? I mentioned
> ConScript; how popular is its use? Are there any other commonly used
> practices or conventions? Apple has blocked out a code point for APPLE
> SIGN, and somebody (sorry, don't remember the name) mentioned a
> Microsoft convention of a subarea for symbols or dingbats. Maybe a
> discussion along these lines can reveal the true nature of PUA use and
> help William redirect his considerable energy toward a more practical
> system.
>
> -Doug Ewell
> Fullerton, California
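
The figure of 131,068 code points quoted above for the two Private Use
planes can be checked with a short Python sketch. The function names
are invented for the example. It assumes the standard layout: planes 15
and 16 are private use, and the last two code points of every plane are
noncharacters.

```python
# Sketch: count usable code points in the two supplementary private-use planes.
PRIVATE_USE_PLANES = (15, 16)  # U+F0000..U+FFFFF and U+100000..U+10FFFF


def is_noncharacter(cp):
    # U+FDD0..U+FDEF, plus the last two code points of every plane.
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def private_use_plane_code_points():
    count = 0
    for plane in PRIVATE_USE_PLANES:
        base = plane << 16
        for cp in range(base, base + 0x10000):
            if not is_noncharacter(cp):
                count += 1
    return count


total = private_use_plane_code_points()
# Same figure by arithmetic: 2 planes * 65,536 code points - 4 noncharacters.
by_arithmetic = 2 * 0x10000 - 4
```

Both values come to 131,068. The 6,400 private-use code points in the
BMP (U+E000..U+F8FF) are separate from this total.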


