Unicode encoding policy

Asmus Freytag asmusf at ix.netcom.com
Wed Dec 24 00:08:32 CST 2014

On 12/23/2014 1:51 PM, Doug Ewell wrote:
> William_J_G Overington <wjgo underscore 10009 at btinternet dot com>
> wrote:
>> 5. Are the proposed characters in current use by the user community?
>> No
>> ----
>> This appears to be a major change in encoding policy.
>> This, in my opinion, is a welcome, progressive change in policy that
>> allows new characters for use in a pure electronic technology to be
>> added into regular Unicode without a requirement to first establish
>> widespread use by using an encoding within a Unicode Private Use Area.
> It is exactly the change I was worried about, the precedent I was afraid
> would be set.

Requiring long-term use of characters at an alternate code location has 
always struck me as counterproductive, because it becomes disruptive at 
the point where a character has finally become established, in contrast 
to truly "experimental" use.

It is therefore useful to recognize that, for some code points, a 
critical mass of implementation support can exist from the moment of 
publication.

This is definitely not the same as saying that any idea for a new 
symbol, however half-baked, should be encoded 'on spec' to see whether 
it garners usage.

The "critical mass" of support is now assumed for currency symbols and 
for certain special symbols such as emoji, and should be extended to 
additional types of symbols, punctuation marks, and letters whenever 
there is an "authority" that controls a normative orthography or notation.

Whether this is for an orthography reform in some country or for an 
addition to the standard mathematical symbols supported by AMS journals, 
such external adoption can signify immediate "critical need" and a 
"critical mass of adoption" for the relevant characters.

In such cases, requiring years of PUA usage is, to repeat, 
counterproductive. It does not alter the fact that the code points will 
eventually be needed (unless one confidently expects the reform to 
fail), and it only leads to the creation of interim data that must be 
converted or cannot be accessed reliably.
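To illustrate the conversion problem, here is a minimal Python sketch. The specific mapping is hypothetical (a character trialled at U+E000 later assigned to U+20BD): text encoded with a provisional PUA code point must be remapped once the character receives a permanent assignment, and any interim data left unconverted stays tied to a private convention.

```python
# Hypothetical mapping from provisional PUA code points to their
# final assignments; the pairing below is for illustration only.
PUA_TO_ASSIGNED = {
    0xE000: 0x20BD,  # e.g. a symbol trialled in the BMP Private Use Area
}

def is_pua(cp: int) -> bool:
    """True if cp lies in one of the three Private Use Areas."""
    return (0xE000 <= cp <= 0xF8FF
            or 0xF0000 <= cp <= 0xFFFFD
            or 0x100000 <= cp <= 0x10FFFD)

def convert(text: str) -> str:
    """Replace provisional PUA code points with their final assignments;
    unmapped PUA characters pass through unchanged."""
    return "".join(
        chr(PUA_TO_ASSIGNED.get(ord(c), ord(c))) if is_pua(ord(c)) else c
        for c in text
    )

print(convert("Price: \ue000100"))  # the PUA character is remapped to U+20BD
```

Every document produced during a PUA trial period needs a pass like this once the real code point is published; documents that are never converted can only be read by parties who know the private convention.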

A clear-cut recognition by the UTC (and WG2) of this particular dynamic 
(beyond currency codes) would be helpful -- particularly as Unicode has 
matured to the point of being the only game in town. The current 
methodology of researching typeset data is well suited to the encoding 
of existing or historic practice, but ill-suited to dealing with ongoing 
development of scripts and symbol sets.

Taking this new stance also makes it easier to contrast such cases with 
the attempts of hobbyists, enthusiasts, and individual tinkerers to 
invent a better world through new symbols or letters. These latter cases 
lack both "critical need" and "critical mass" unless they are first 
adopted by much larger (and/or more authoritative) groups of users.

There is an inherent risk that large groups of users will follow "fads": 
certain symbols see huge usage for a while and are then abandoned. While 
this is hard to predict, it is not that different from historical 
changes in writing systems, even if the trends there played out over 
longer time frames.

>> I feel that it is now therefore possible to seek encoding of symbols,
>> perhaps in abstract emoji format and semi-abstract emoji format, so as
>> to implement a system for communication through the language barrier
>> by whole localizable sentences, with that system designed by
>> interested people without the need to produce any legacy data that is
>> encoded using an encoding within a Unicode Private Use Area.
> Sadly, I can no longer state with any confidence that such a proposal is
> out of scope for Unicode, as I tried to do for a decade or more.
> --
> Doug Ewell | Thornton, CO, USA | http://ewellic.org
> _______________________________________________
> Unicode mailing list
> Unicode at unicode.org
> http://unicode.org/mailman/listinfo/unicode
