Re: Private Use areas

From: Richard Wordingham via Unicode <unicode_at_unicode.org>
Date: Tue, 21 Aug 2018 21:08:49 +0100

On Tue, 21 Aug 2018 11:03:41 -0700
Ken Whistler via Unicode <unicode_at_unicode.org> wrote:

> On 8/21/2018 7:56 AM, Adam Borowski via Unicode wrote:

> Really? Suppose someone wants to implement a bicameral script in PUA.
> They would need case mappings for that, and how would those be
> "better represented in the font itself"? Or how about digits? Would
> numeric values for digits be "better represented in the font itself"?
> How about implementation of punctuation? Would segmentation
> properties and behavior be "better represented in the font itself"?

The least intrusive way of defining the meaning of a graphic (sensu
lato) character is by a font, in a very wide sense that would interpret
a Unicode code chart as a font. Without a font in this sense, normal
characters in the PUA have no meaning. If one insists that a font is
needed for an interpretation, then:

(1) PUA characters in plain text are meaningless - I believe that's
pretty much the position now.

(2) Different schemes can co-exist, even within the same formatted
document, by having different formats. This is the case now. It then
makes sense to store the properties in the font, which needs to be
saved with or in the document for the document to continue to make
sense.
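
To illustrate the two points above, here is a minimal Python sketch. The
first half only reports what the standard unicodedata module says about a
PUA code point by default. The second half is purely hypothetical: the
FONT_PROPERTIES table and the font_upper helper are names invented for
this example and stand in for whatever property data a font or document
might carry; no real font format is implied.

```python
import unicodedata

pua = "\uE000"  # first code point of the BMP Private Use Area

# Default Unicode data: general category Co, no name, no numeric value,
# and no case mapping.
category = unicodedata.category(pua)
name = unicodedata.name(pua, None)
numeric_value = unicodedata.numeric(pua, None)
uppercased = pua.upper()

# Hypothetical property table shipped alongside a font (an assumption for
# illustration only): U+E000/U+E001 form a case pair, U+E010 is a digit.
FONT_PROPERTIES = {
    "\uE000": {"upper": "\uE000", "lower": "\uE001"},
    "\uE001": {"upper": "\uE000", "lower": "\uE001"},
    "\uE010": {"numeric": 0},
}


def font_upper(text, properties=FONT_PROPERTIES):
    """Uppercase text, preferring the font's private mappings when present."""
    return "".join(
        properties.get(ch, {}).get("upper", ch.upper()) for ch in text
    )


print(category, name, numeric_value, uppercased == pua)
print(font_upper("\uE001a") == "\uE000A")
```

A different font could supply a different table for the same code points,
which is how two schemes could co-exist in one formatted document.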

Casing and digits are luxuries. Are we not told that searching should
be done by collation? We then do not need case-folding! Interpreting
the preferred representation of Roman numerals does not use Unicode
properties beyond the approximate principle of one character, one
codepoint.
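
As a rough sketch of the Roman numeral point: the letters I, V, X and so
on carry no Unicode numeric value, so any interpreter has to bring its
own table. The table and function names below are my own, and PUA_VALUES
is a made-up assignment showing that the same logic works unchanged for
private-use digits.

```python
# Values supplied by the application, not by Unicode character properties.
ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}

# Hypothetical private-use numerals (an assumption for illustration only).
PUA_VALUES = {"\uE020": 1, "\uE021": 5, "\uE022": 10}


def roman_to_int(numeral, values=ROMAN_VALUES):
    """Interpret a subtractive-notation numeral, one character per symbol."""
    total = 0
    largest_seen = 0
    # Scan right to left: a symbol smaller than the largest one already
    # seen to its right is subtracted, otherwise it is added.
    for ch in reversed(numeral):
        value = values[ch]
        if value < largest_seen:
            total -= value
        else:
            total += value
            largest_seen = value
    return total


print(roman_to_int("MCMXCIV"))
print(roman_to_int("\uE020\uE021", PUA_VALUES))
```

The function does no validation of malformed numerals; it only shows that
a lookup table keyed by single characters is all that is required.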

As to segmentation, my understanding was that there were no characters
available to indicate word boundaries in scriptio continua; the closest
one has is line-breaking suggestions. If my memory serves me right,
SIL Graphite fonts can hold line-breaking information.

Richard.