Re: Characters for Cakchiquel

From: William Overington (WOverington@ngo.globalnet.co.uk)
Date: Sat Mar 29 2003 - 05:22:33 EST

Next message: Frank da Cruz: "Missing native-script country names"

Previous message: David Starner: "Re: Characters for Cakchiquel"
Maybe in reply to: David Starner: "Characters for Cakchiquel"

David Starner wrote as follows.

quote

What good would a private use character do here? The private use area is
good for Tengwar, Cirth and Shavian (all of which have multiple fonts
using the same private use area encoding.) But there's no huge demand to
interchange data with these characters, and the few users are probably
going to use something less complex than the private use area. Assuming
I scan this book in for Project Gutenberg, I'll probably use something
like [3], [4], [4,] and [4,h] for the characters, at least in the ASCII
version (and there'd be no reason to post a Unicode version if these
characters aren't in Unicode.) It's simple, readable and precise,
something your solution only has one of.

end quote

The size of the "demand" does not affect my suggestion. It is a matter of
research in encoding scripts.

Using the Private Use Area is not at all complex. If one has a font
containing some Private Use Area characters, the font can be used quite
straightforwardly from within a program such as Microsoft Word using the
Insert | Symbol facility which that program provides. Characters can also
be entered fairly straightforwardly in some versions of Microsoft WordPad
using an Alt sequence: hold down the Alt key, enter the decimal code point
value using the number keys at the right of a PC keyboard, then release the
Alt key.
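
As a small illustration of the decimal value needed for such an Alt
sequence, here is a minimal Python sketch. The function name alt_code is my
own invention for the example, and U+E000 is used only because it is the
first code point of the Private Use Area, not because anything has been
allocated there. Whether a given program accepts the sequence still depends
on the program.

```python
# The Private Use Area of the Basic Multilingual Plane runs from
# U+E000 to U+F8FF inclusive.
PUA_FIRST = 0xE000
PUA_LAST = 0xF8FF


def alt_code(code_point: int) -> str:
    """Return the decimal digits for a BMP Private Use Area code point.

    alt_code is a hypothetical helper name used only for this sketch.
    """
    if not PUA_FIRST <= code_point <= PUA_LAST:
        raise ValueError(
            f"U+{code_point:04X} is outside the BMP Private Use Area"
        )
    return str(code_point)


print(alt_code(0xE000))  # 57344
print(alt_code(0xF8FF))  # 63743
```

So a character placed at U+E000 would be entered as Alt 57344 in a program
which supports the facility.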

Project Gutenberg is a very valuable project. I have recently started
reading the notebooks of Leonardo da Vinci, which are available at Project
Gutenberg. For a file using ASCII text, using [3], [4], [4,] and [4,h] for
the characters is probably the only type of method available for the work,
and it is probably quite suitable, as it gets the job done within the limits
of the available technology.

However, consider that that file could be processed using a short Pascal
program using a eutocode typography file. The Pascal program would read in
ASCII text and output a Unicode text file, converting certain character
sequences or individual characters.

The particular eutocode typography file for the conversion of the format
which you suggest would only need five lines of a few characters each. If
the [ and ] characters are kept as part of each sequence, the figures 3 and 4
remain free for use as digits elsewhere in the file; if the brackets were
dropped, the figures 3 and 4 could only be used within the file for those
symbols and not as digits. However, a Private Use Area encoding of the
special symbols would be needed and a font to display them.
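
The conversion described above can be sketched as follows. This is a
Python sketch rather than the Pascal program, and the dictionary simply
stands in for the mapping; it is not the eutocode typography file format.
The four code points U+E000 to U+E003 are placeholders which I have assumed
for the example, as no Private Use Area allocation has yet been agreed, and
the names MAPPING, convert and convert_file are likewise hypothetical.

```python
# Placeholder allocation only: these PUA code points are assumptions
# made for this sketch, not an agreed encoding.
MAPPING = {
    "[3]": "\uE000",
    "[4]": "\uE001",
    "[4,]": "\uE002",
    "[4,h]": "\uE003",
}


def convert(text, mapping=MAPPING):
    """Replace each ASCII sequence in mapping, trying longer keys first."""
    keys = sorted(mapping, key=len, reverse=True)
    out = []
    i = 0
    while i < len(text):
        for key in keys:
            if text.startswith(key, i):
                out.append(mapping[key])
                i += len(key)
                break
        else:
            # No sequence starts here, so copy the character unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


def convert_file(src_path, dst_path):
    """Read an ASCII text file and write the converted text as UTF-8."""
    with open(src_path, encoding="ascii") as src:
        converted = convert(src.read())
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.write(converted)
```

Because each sequence is wrapped in [ and ], a line such as "page 34"
passes through unchanged. Trying longer keys first would matter if bare
sequences such as 4 and 4, were used instead of the bracketed forms.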

If a Private Use Area encoding is produced for the special symbols used by
missionaries in the fifteenth and sixteenth centuries, then a few fonts,
from various fontmakers, might include the special symbols. Thus your
suggested ASCII file for Project Gutenberg could be used to produce print
outs using the correct special symbols if desired.

The eutocode typography file format is described in the following document.

http://www.users.globalnet.co.uk/~ngo/ast03300.htm

You suggest that using a Private Use Area encoding has only one of the three
attributes of simplicity, readability and precision.

I feel that a Private Use Area encoding could be reasonably simple. Care
could be taken to make it as logically structured as possible, within the
limits of an on-going activity in which code point allocations are added a
few at a time as symbols are found in the literature. The experience gained
could be useful when promotion to regular Unicode is considered formally; at
that stage the order used in the Private Use Area character set could be
changed around as desired so as to produce a formal encoding.

Certainly, without a suitable font, readability would be a great problem.
Yet once a Private Use Area encoding is published, font support may follow.

A Private Use Area encoding can be precise provided that the originator of a
document using the encoding and the user of that document both know what the
encoding is, and that both have suitable facilities for applying the file
which contains the document. In such circumstances the Private Use Area can
be of great precision and very useful. For example, readers might
like to have a look at the font COURTCOL.TTF which is described in, and
downloadable from, the following web page.

http://www.users.globalnet.co.uk/~ngo/font7001.htm

Readers might like to have a look at the way in which I have expressed the
colours in monochrome, and then perhaps search at http://www.yahoo.com and
other search engines for the two words Petra Sancta together.

I have tried some offline experiments with a Java applet and the results are
good. I have also produced a font with 51 glyphs which includes those 19
glyphs and others for four sizes of type, various object replacement
characters, various codes which wait for a push button to be pushed, and
various markers for producing a programmed learning package encoded within a
text file using WordPad. Precision is essential for such an activity and the
Private Use Area is used to provide that precision.

Actually, I was rather hoping that the start of a Private Use Area encoding
might be produced by a few interested people fairly quickly, perhaps in this
thread or in some email correspondence. Once that is done, then font
support could gradually be produced.

William Overington

29 March 2003


This archive was generated by hypermail 2.1.5 : Sat Mar 29 2003 - 06:18:32 EST