Re: Courtyard Codes and the Private Use Area

From: William Overington
Date: Sat May 25 2002 - 10:44:55 EDT

Response to the comments of Mr Philipp Reichmuth.

Thank you for your response to my posting of the Courtyard Codes.

>While I don't think this discussion of various PUA allocations should
>continue very further, it's probably a lot better to introduce the
>already-discussed ZERO WIDTH LIGATOR in such a form that X ZWL Y
>produces the XY ligature, X ZWL Y ZWL Z the XYZ ligature and so on. It
>saves you a lot of hassle with longer ligatures.

The ZERO WIDTH LIGATOR has, in my opinion, considerable merit.

However, it is a matter upon which the Unicode Technical Committee has ruled
and there is now an official method along the same lines using an extended
definition of the ZERO WIDTH JOINER.
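
The ZERO WIDTH JOINER method can be illustrated with a short sketch in Python. The helper name request_ligature is an assumption made purely for illustration, and the code only builds the character sequence: whether a ligature glyph is actually displayed depends upon the fount and the rendering system.

```python
ZWJ = "\u200D"  # U+200D ZERO WIDTH JOINER


def request_ligature(*letters):
    """Return the letters joined by ZWJ, i.e. X ZWJ Y or X ZWJ Y ZWJ Z.

    This only constructs the sequence; a renderer is free to ignore the
    request if the fount has no suitable ligature glyph.
    """
    return ZWJ.join(letters)


ct = request_ligature("c", "t")          # c ZWJ t
ffi = request_ligature("f", "f", "i")    # f ZWJ f ZWJ i

print([f"U+{ord(ch):04X}" for ch in ffi])
# ['U+0066', 'U+200D', 'U+0066', 'U+200D', 'U+0069']
```

The same pattern extends to longer ligatures simply by passing more letters.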

>Does this belong in a character-based encoding system at all? This is
>better solved by markup. If you go on defining your own file formats
>already, do include some sensible markup system there, and you don't
>have to clutter the PUA and restrict their use. What if you've got
>more than 2 swash forms, BTW?

There is, in my opinion, far too much emphasis on treating character based
systems and markup as two distinct, rigidly separated categories. I feel
that there is scope for a limited amount of markup within Unicode itself.
I do not mean anything as comprehensive as HTML, yet a small set of general
guidance codes, such as these Courtyard Codes, is, I feel, both reasonable
and desirable.

I am not cluttering the Private Use Area. The Private Use Area is provided
in order that it may be used. I am using the Private Use Area in accordance
with the Unicode specification.

I am not restricting the use of the Private Use Area. I am simply asking end
users whether they will please consider agreeing to something, entirely in
accordance with the Unicode specification, chapter 13, section 13.5. No one
is obliged to agree.

If there are more than two swash forms, then, at the present state of the
Courtyard Code collection, third and further swash forms would not be
accessible. If readers do have any knowledge of cases where more than two
swash forms for a particular character exist, then such information would be
welcome. If there is a need, I will happily add facilities for accessing
such swash characters.

>WO> U+F3C0 PLAIN - ITALIC:=false; BOLD:=false;
>WO> ...
>Again, markup is the better solution. And, to be honest, it's a bit of
>a waste of space on the mailing list, don't you think?

Markup may well be a better solution in some circumstances. However,
please consider a situation where someone is seeking to have plain text
in which just one or two words are in italics. I feel that the need to put
those one or two words in italics should not, by itself, force that person
to abandon an essentially plain text format and go instead for a less
universally portable proprietary format that is accessible only to people
who are using one particular computer platform with expensive add on
software from a commercial software company. I feel that it is reasonable
that people using widely accessible plain text formats should be able to use
a few general formatting features.
Publishing Courtyard Codes on the mailing list is not a waste of space. The
mailing list provides a direct email link to many people who are directly
interested in Unicode and its applications and indirect access to anyone who
chooses to look up the web based echo of the list. I feel that it is highly
likely that many readers of the list will save a copy of that posting and
file it somewhere, perhaps thinking out the implications for their own use
of Unicode, whether they be a representative of a software package
developer, a developer of founts for minority languages, an individual
interested in transcribing historical documents, an author seeking to send a
manuscript to a magazine editor or an author considering "publish on demand".

The concept of the Courtyard Codes is potentially very far reaching for the
future application of Unicode. It is possible that the classification codes
and the formatting codes could be promoted to regular Unicode. There will
need to be some debate about whether the power exists to promote the
classification codes, yet I feel that that power does exist.

There is, I feel, a clear difference between endorsing a particular
allocation in the Private Use Area and providing a set of classification
codes in regular Unicode. The former could not be endorsed by the
Committees, and neither could any particular use of the classification
codes. Yet I suggest that the non-endorsement rule would not be broken by
regular Unicode making classification codes available which end users could
then use as they please, neither needing nor being able to obtain any
official endorsement of any particular use of those classification codes.
Regular Unicode stating where the Private Use Areas are located, or stating
which surrogates can be used to access the Private Use Areas that are up in
the mountains on planes 15 and 16, is not a breach of the non-endorsement
rule, and I feel that providing those classification codes would not be a
breach either. I feel that the rules are such that the Committees could
promote all of the characters in the Courtyard Codes collection if they so
choose.
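
As a worked illustration of the surrogates mentioned above, the following Python sketch applies the standard UTF-16 arithmetic to the first and last private use code points of planes 15 and 16. The function name utf16_surrogates is an assumption made for illustration. The results show that the high surrogates from DB80 to DBFF are the ones that reach those two planes.

```python
def utf16_surrogates(code_point):
    """Return the (high, low) UTF-16 surrogate pair for a supplementary code point."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top ten bits of the offset
    low = 0xDC00 + (offset & 0x3FF)   # bottom ten bits of the offset
    return high, low


# First and last private use code points on plane 15 and on plane 16.
for cp in (0xF0000, 0xFFFFD, 0x100000, 0x10FFFD):
    high, low = utf16_surrogates(cp)
    print(f"U+{cp:06X} -> {high:04X} {low:04X}")
# U+0F0000 -> DB80 DC00
# U+0FFFFD -> DBBF DFFD
# U+100000 -> DBC0 DC00
# U+10FFFD -> DBFF DFFD
```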

>WO> I hope that these Courtyard Codes will be of interest to end users.
>I don't really think so. They don't offer very much that well-known
>typesetting systems don't implement already in their own fashion.

The Courtyard Codes do not purport to offer facilities that cannot be
achieved with well-known typesetting systems or with well-known proprietary
file formats. The Courtyard Codes do offer the prospect of being able to
have access to some of those facilities from within simpler, more generally
available software packages such as might be available on the internet and
on cover discs of magazines. Also, the Courtyard Codes may well have good
application in education where, being part of a limited set, they might
perhaps be used as a specification for a text displaying program that
students are to write using Java as a coursework exercise.
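
To suggest what the core of such a text displaying program might look like, here is a minimal sketch, written in Python rather than Java for brevity. Only U+F3C0 PLAIN and its effect (ITALIC:=false; BOLD:=false) are taken from the posting quoted above. The code points used here for switching italic and bold on are hypothetical placeholders and are not the published Courtyard Code assignments. The function name styled_runs and the choice to ignore unrecognised private use characters are likewise assumptions.

```python
import unicodedata

PLAIN = "\uF3C0"                   # from the posting: ITALIC:=false; BOLD:=false
HYPOTHETICAL_ITALIC_ON = "\uF3C1"  # placeholder only, NOT a published assignment
HYPOTHETICAL_BOLD_ON = "\uF3C2"    # placeholder only, NOT a published assignment


def styled_runs(text):
    """Split text into (run, italic, bold) tuples, consuming the formatting codes."""
    italic = bold = False
    runs, current = [], []

    def flush():
        # Close the run collected so far under the formatting state in force.
        if current:
            runs.append(("".join(current), italic, bold))
            current.clear()

    for ch in text:
        if ch == PLAIN:
            flush()
            italic = bold = False
        elif ch == HYPOTHETICAL_ITALIC_ON:
            flush()
            italic = True
        elif ch == HYPOTHETICAL_BOLD_ON:
            flush()
            bold = True
        elif unicodedata.category(ch) == "Co":
            continue  # any other private use character is ignored in this sketch
        else:
            current.append(ch)
    flush()
    return runs


sample = "A " + HYPOTHETICAL_ITALIC_ON + "very" + PLAIN + " good day"
print(styled_runs(sample))
# [('A ', False, False), ('very', True, False), (' good day', False, False)]
```

A display program would then draw each run using the plain, italic or bold fount as indicated, while a program that knows nothing of the codes still has the words themselves as ordinary plain text.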

William Overington

25 May 2002
