From: Neil Harris (neil@tonal.clara.co.uk)
Date: Thu Jan 27 2011 - 14:57:28 CST
On 27/01/11 18:34, Doug Ewell wrote:
> Neil Harris <neil at tonal dot clara dot co dot uk> wrote:
>
>> I wonder if the best approach to take to this class of proposals would
>> be to point their proponents in the direction of the Private Use Area,
>> and documentation on how to make their own fonts in freely
>> distributable formats, and then invite them to come back when they
>> have a large community of real-world users using their new writing
>> system to exchange plain-text messages for non-artificial purposes, at
>> which point they could then apply to go through the normal process for
>> encoding?
> That is the right approach to take for writing systems. That is not the
> right approach to take for entities that have nothing to do with writing
> systems.
>
> There is no shortage of information items, and classes of information
> items, that lend themselves to being encoded in some way. Software
> developers concern themselves with this every single day. There are
> weather forecasts and dog breeds and security-badge access levels and
> Mozart string quartets, all of which can be assigned a code element of
> some sort. That does not make these items characters, and it does not
> make it appropriate to encode them in a character encoding standard, any
> more than it would be appropriate to encode Cyrillic letters in the
> Köchel-Verzeichnis.
I think the problem here is that _conceptually_ defining the exact
boundary between real-world writing systems and idealistic hypothetical
systems is very hard.
For example, Blissymbols could be regarded as a novel writing system,
and yet, like SignWriting, it almost certainly falls into the grey
area of meeting the requirements -- and there are certainly various
sorts of weirdness already encoded within Unicode for historical
reasons. Emoji is just the most boundary-blurring of all recent cases.
Compared to other proposals, systems such as William's have two major
added complications:
1) wanting every possible reasonable utterance to have its own
character, which certainly cannot work, if only because the number of
possible combinations for even simple sentences vastly outstrips the
number of available code points by many orders of magnitude;
2) desiring them to be encoded before actual usage, rather than the
other way around;
the combination of which seems to me to pose an insuperable obstacle to
meeting the current rules, or indeed any other foreseeable, practically
implementable rules that conserve the Unicode code point space for the
future.
Nevertheless, I think it's difficult to reject them as a class _in
principle_, no matter how impractical they might be in practice, because
of the fuzziness of the line between practical and impractical schemes:
the problem is one of degree and practicality, not one of kind.
I think that the general approach of taking these sorts of proposals at
face value, and being willing at least in principle to consider full
formal proposals on a case-by-case basis under the existing rules, with
the burden of proof on the proposer, is the right way to go, with your
invitation to William to submit a full proposal being a good example.
Even if they end up falling at the first hurdle when submitted, being
invited to generate a proposal, and thus needing to take a long hard
look at the details of how these systems might be implemented, will
hopefully concentrate the minds of the proposers of this sort of scheme.
I rather look forward to reading William's proposal, if and when he
presents one.
However, I think it would still be worth adding a FAQ entry explaining
why this class of scheme will face exceptional difficulties in meeting
the requirements for character encoding...
-- Neil
This archive was generated by hypermail 2.1.5 : Thu Jan 27 2011 - 14:59:24 CST