Re: Towards a classification system for uses of the Private Use Area

From: Kenneth Whistler (kenw@sybase.com)
Date: Sun Apr 28 2002 - 18:06:47 EDT


William Overington continued:

> Then, in order to apply the classification system to any plain text file,
> the file needs to contain some classification characters near the start.
>
> For a file using the Egyptian hieroglyphics characters, the following
> sequence would be needed.
>
> U+F35B U+F333 U+F330 U+F330 U+F331 U+F35D

Why not simply insert the following text at the top of the file or page:

"This text includes PUA characters which require the use of the
 font XXX.ttf for proper display, accessible at http://xxx/yyy/"

This accomplishes everything you indicate below, without the need
for a semi-standard agreement on U+F3.. characters, without any
"findpuac.ttf" font required to read the labelling, and without
any need for maintenance of a "type tray" registry someplace that
people would have to apply to in order to get their type tray
identification code.
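
For concreteness, here is a minimal sketch of how a tool might apply
that alternative: check whether a text contains any private-use code
points and, if so, prepend the plain-language notice. The function
names and the font name and URL arguments are hypothetical
placeholders, not part of any standard or existing tool; the ranges
are the Private Use Area in the BMP plus the two supplementary
private-use planes.

```python
# Private-use code point ranges: the BMP Private Use Area and the
# two supplementary private-use planes (planes 15 and 16).
PUA_RANGES = (
    (0xE000, 0xF8FF),
    (0xF0000, 0xFFFFD),
    (0x100000, 0x10FFFD),
)


def pua_code_points(text):
    """Return the sorted, de-duplicated private-use code points in text."""
    return sorted({
        ord(ch)
        for ch in text
        if any(lo <= ord(ch) <= hi for lo, hi in PUA_RANGES)
    })


def with_pua_notice(text, font_name, font_url):
    """Prepend a human-readable notice if text uses PUA characters.

    font_name and font_url are supplied by the author of the text;
    this sketch has no way to discover them on its own.
    """
    if not pua_code_points(text):
        return text  # nothing private-use, so no notice is needed
    notice = (
        "This text includes PUA characters which require the use of the\n"
        f"font {font_name} for proper display, accessible at {font_url}\n\n"
    )
    return notice + text
```

A reader needs nothing beyond an ordinary text viewer to act on the
notice, which is the point of the approach.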

> Suppose then that one day someone comes across a plain text file and within
> that plain text file are character codes from the Private Use Area and that
> person has no idea as to which character set those character codes may be
> intended to represent.

Problem obviated by the alternative approach.

>
> So, the person looks at that file using a word processing program and
> chooses to use a specially made fount named findpuac.ttf (that is, the find
> private use area classification fount) which has all characters as zero
> width except for the eighteen characters in the U+F3.. block which I
> mentioned in my previous posting, those eighteen characters being
> implemented in the findpuac.ttf fount as having analysis glyphs as detailed
> in my previous posting.

Unneeded in the alternative approach.

> The screen display gives a code of C001 which the
> person can look up in a web based reference list

Unneeded in the alternative approach.

> and there finds out that it
> is in fact a particular character set for cuneiform characters. The web
> based reference list contains a link to a website from which the person
> downloads a copy of a special fount that contains the cuneiform characters.

Directly readable in the original document using the alternative approach.

> The plain text file is then displayed using that fount.

Same in both approaches.

> That fount has the
> eighteen characters in the U+F3.. block which I mentioned as being zero
> width, so that they do not affect the display at all when the file is
> displayed.

Unnecessary in the alternative approach.

> So, I suggest that the system is not too complex at all to implement and
> use.

The existence of a *much* simpler alternative approach indicates to me
that it is indeed too complex, and that the availability of the obvious
alternative will preclude heading in this direction.

--Ken


