Re: Encoding of symbols and a "lock"/"unlock" pre-proposal

From: William Overington (WOverington@ngo.globalnet.co.uk)
Date: Tue May 21 2002 - 09:30:01 EDT

Previous message: Peter_Constable@sil.org: "text analysis software tools"
Next in thread: i18nGuy Tex Texin: "Re: Encoding of symbols and a "lock"/"unlock" pre-proposal"
Reply: i18nGuy Tex Texin: "Re: Encoding of symbols and a "lock"/"unlock" pre-proposal"
Reply: Thomas Chan: "Re: Encoding of symbols and a "lock"/"unlock" pre-proposal"
Reply: Peter_Constable@sil.org: "Re: Encoding of symbols and a "lock"/"unlock" pre-proposal"

Yes, I feel that it is worth putting forward a proposal for the open and
closed padlock symbols, yet I wonder whether the words should be "unlocked"
and "locked" as adjectives rather than "unlock" and "lock" as imperative
verbs.

Surely, a padlock is either unlocked or locked, so that the symbols indicate
the state in which a system now exists. This then raises the question as to
whether there should also be symbols for "unlock" and "lock" as imperative
verbs, such that those symbols would indicate where to click so as to change
from an insecure state to a secure state or from a secure state to an
insecure state. There is a complication here: with a padlock one needs a key
to unlock it but does not need a key to lock it, yet using a key symbol to
mean unlock would seem to go against the way that computer systems are
organized, in that a key might seem more naturally to mean lock,
notwithstanding that one does not need a key to lock a padlock.

So, a suggestion: how about the following four symbols? I have added some
Private Use Area allocation suggestions so that anyone who might like to
have a go at producing a fount can have something to try which will be
self-consistent within this discussion. Readers who read my suggestions
regarding the Private Use Area may perhaps observe that these present
suggestions are within the range U+F300 through to U+F3FF and that I have
already suggested some of the codes in the lower half of this range in the
thread "Towards a classification system for uses of the Private Use Area".
I have added these symbols into the upper half of the range as I consider
them to be very fundamental uses which should, as far as possible, be
uniquely used within the Private Use Area. I know that absolute unique
usage is not possible as all code point allocations within the Private Use
Area are non-exclusive, yet, as I am using the U+F300 through to U+F3FF
range as the place to suggest widely useful facilities that will hopefully
be widely used both to classify other uses of the Private Use Area and as
general facilities, I am hopeful that people who are using the Private Use
Area for various purposes might feel that avoiding overlapping definitions
within the U+F300 through to U+F3FF range is beneficial to everybody.
Hopefully, the making of an allocation within the U+F3.. region will be
regarded as an indication of the importance that I attach to the provision
of these padlock characters.

My suggestions are as follows. These are open for discussion. I am open to
the possibility of changing them in the light of discussions.

U+F380 PADLOCK SHOWN UNLOCKED, INDICATING "UNLOCKED"
U+F381 PADLOCK SHOWN LOCKED, INDICATING "LOCKED"
U+F382 PADLOCK SHOWN UNLOCKED WITH DOWNWARD ARROW ON TIP OF ROD, INDICATING
       "UNLOCKED, CLICK HERE TO LOCK"
U+F383 PADLOCK SHOWN LOCKED WITH UPWARD ARROW ON ROD, INDICATING "LOCKED,
       CLICK HERE TO UNLOCK"

I have used the word rod to indicate the piece of the padlock which is like
a curved piece of rod. If there is a special name for that part of a
padlock, then the name of the symbol can be changed.

I have put unlocked before locked so that the final bit is a Boolean state
that indicates whether the padlock is locked, as that seems to me to be the
natural way to look at the symbols.
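As a minimal sketch of that Boolean reading, here is how software might test the final bit of the four suggested code points. The function names are made up for illustration, and treating the second bit as "has a click-here arrow" is only my observation about the numbering, not something the allocation formally promises.

```python
# Provisional Private Use Area suggestions from this message (non-exclusive).
PADLOCKS = {
    0xF380: "PADLOCK SHOWN UNLOCKED",
    0xF381: "PADLOCK SHOWN LOCKED",
    0xF382: "PADLOCK SHOWN UNLOCKED, CLICK HERE TO LOCK",
    0xF383: "PADLOCK SHOWN LOCKED, CLICK HERE TO UNLOCK",
}


def _check(code_point: int) -> None:
    """Reject anything that is not one of the four padlock code points."""
    if code_point not in PADLOCKS:
        raise ValueError(f"U+{code_point:04X} is not one of the padlock symbols")


def is_locked(code_point: int) -> bool:
    """The final bit is the Boolean state: 1 means locked, 0 means unlocked."""
    _check(code_point)
    return bool(code_point & 0b01)


def has_click_arrow(code_point: int) -> bool:
    """Assumption: the second bit happens to mark the arrowed variants."""
    _check(code_point)
    return bool(code_point & 0b10)


if __name__ == "__main__":
    for cp, name in sorted(PADLOCKS.items()):
        print(f"U+{cp:04X} locked={is_locked(cp)} arrow={has_click_arrow(cp)} {name}")
```

Had locked been placed before unlocked, the final bit would have meant "is unlocked", which seems the less natural question to ask.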

On the more general question of encoding symbols, I feel that there is great
scope, using the classification scheme that I suggested previously, for
using the Private Use Area to encode symbols that are not presently encoded
in regular Unicode, or might never be. Certainly, the Unicode
Consortium will not, by the rules, ever endorse any particular allocations,
yet I feel that there is nonetheless scope for end users to try to make
progress by having lists of symbols so that all sorts of things that might
be useful get a, well, I won't say "standardized" code point, but a code
point that at least may have some usefulness that is more than zero
usefulness. I note with interest the way that, in the structure of internet
newsgroups, there is a largely informal process of starting a new newsgroup
in the alt.* newsgroup hierarchy, which, whilst not producing a newsgroup in
the regular Usenet hierarchy, nevertheless does produce newsgroups where
meaningful discussions can and do take place. By analogy, I feel that there
is scope to have a type tray, as I would call it, from U+E000 through to
U+EFFF where up to 4096 symbols could be allocated code points as time goes
on, with the list, as it develops, being available on the web. It would work
almost on the basis that if anyone wants a code point within that type tray
allocated to a particular symbol then they can have it unless there is some
good reason why not: for example, that the symbol is already coded
elsewhere, either in that type tray or in regular Unicode, or that, as
people start to make allocations, some sort of ordering evolves amongst
users, so that people suggesting something might be given a reason why some
alternative location would be better. I am not suggesting that this process in any way leads to
exclusive allocations, for indeed it would not, yet I feel that it could be
very useful to end users of the Unicode system.

My view is that the Unicode system provides the Private Use Areas as a
valuable resource which is there for end users to use as they wish. If
people act in a normal everyday civilised manner towards each other, and
that behaviour leads to people being able to use code points for symbols
that are not in regular Unicode on a more or less universal, uniquely
defined basis, then that is good. It is true that everybody could, if they
chose, argue for ever over which code point means which symbol, and that
everybody could disregard any suggestion that is not regular Unicode unless
they are party to a one to one agreement with the sender of a file as to
what such a code point is meant to mean. That does not mean that such
behaviour is in fact going to happen amongst civilised people who are
interested in making good use of computers. For example,
consider the letter E in Unicode. Now, the fact of which code point is used
to represent a letter E is not an issue for me. What interests me is that
if I enter a letter E here on a keyboard in England that someone reading
this has a letter E on his or her screen so that he or she understands the
meaning which I am trying to convey. So, if the symbol is a road sign or an
emoticon or some other symbol, then I suggest that, within the limits of
making the system as coherent as possible as code points are added here and
there over time, maybe it does not matter too much which code point means
which symbol. The important thing is that the system exists and is
consistent, so that if a document has a U+E023 code in it, then someone has
at least a reasonable chance of finding out what it might mean.

So, I suggest that maybe a 4096 code point block is used as a type tray,
from U+100000 through to U+100FFF, with it being duplicated on a temporary
basis as U+E000 through to U+EFFF. While 16-bit Unicode is the more widely
available and codes in the range U+100000 through to U+100FFF might cause
display problems, codes in the range U+E000 through to U+EFFF can be used so
as to get round those problems, yet with a view to the long term place being
in the U+100000 through to U+100FFF range. This could be quite an
experiment; it will be interesting to observe what happens. If nothing
else, people might work out which surrogate codes are needed to represent
code points in the range U+100000 through to U+100FFF and gain some useful
experience from doing so. As time goes on, maybe some of the symbols will
get promoted to regular Unicode status.
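For anyone wishing to try the surrogate exercise, here is a short sketch using the standard UTF-16 arithmetic: subtract 0x10000, then split the remaining 20 bits into two halves of 10 bits each. The function names are my own, and the mapping onto U+E000 through to U+EFFF simply assumes that the temporary duplicate keeps the same offset within the tray.

```python
def to_surrogates(code_point: int) -> tuple[int, int]:
    """Return the (high, low) UTF-16 surrogate pair for a supplementary code point."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError(f"U+{code_point:04X} does not need surrogates")
    offset = code_point - 0x10000          # a 20-bit value
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low


def temporary_duplicate(code_point: int) -> int:
    """Map U+100000..U+100FFF to the assumed duplicate in U+E000..U+EFFF."""
    if not 0x100000 <= code_point <= 0x100FFF:
        raise ValueError(f"U+{code_point:04X} is outside the long term type tray")
    return 0xE000 + (code_point - 0x100000)


if __name__ == "__main__":
    for cp in (0x100000, 0x100023, 0x100FFF):
        high, low = to_surrogates(cp)
        print(f"U+{cp:06X} -> {high:04X} {low:04X}, "
              f"duplicate U+{temporary_duplicate(cp):04X}")
```

The whole tray therefore uses only four high surrogates, DBC0 through to DBC3, each paired with low surrogates from DC00 through to DFFF.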

An interesting point is that symbols in Unicode are single colour, without
the colour being specified. Is that essential for a Private Use Area type
tray? Maybe each symbol needs to be monochrome, with matched pairs
available, each member of which could be any colour, and with default
colours defined. For example, in the days of metal type there used to be
holly ornaments, 24 points square, where three types were available, one
with leaves and berries, one with just leaves and one with just berries.
The sort with leaves and berries might end up printed in whatever colour,
say on a one colour handbill about a dance, yet the best printing results
were with two colour printing on, say, menus for a restaurant, where firstly
all of the menus would be printed with green ink so as to print the leaves
and then, when the ink had dried, all of the menus would be printed with red
ink so as to print the berries. With care, and care was indeed needed, one
could get a good two colour effect of green holly leaves with red berries
amongst them, without having the red and green areas overlap. Suppose that
it was desired to encode such a matched pair of symbols within Unicode. Is
it possible? Does it need that a special operator is defined to step back
by the character width so that the next symbol is superimposed, or can that
be done with some facility already defined in regular Unicode for some other
purpose? Is there any feature within a TrueType fount that could be used to
let an appropriately programmed piece of software have knowledge of an
intended default colour for a symbol? I feel that there are various matters
of this nature which could quite reasonably be explored, without adversely
affecting the fact that regular Unicode is for plain text without a default
colour or style being indicated.

On another aspect of this, I have in mind the possibility that some of the
codes in the range U+F3C0 to U+F3FF could be allocated so as to have effects
that are traditionally regarded as mark up. For example, ITALICS and NOT
ITALICS and so on. This would enable some mark up style effects to be used
in essentially plain text files. I feel that this could be advantageous as
one could then have files that are highly portable without being locked into
one or another software company's proprietary file format while also having
some markup capabilities. If people who are interested in this possibility
would like to post any suggestions then perhaps a useful set of code points
can be produced.
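As a starting point for discussion, here is a sketch of how a program might act upon such codes. I have not yet suggested particular code points, so U+F3C0 for ITALICS and U+F3C1 for NOT ITALICS are purely hypothetical placeholders, as are the function names. The sketch converts the codes to HTML tags for display and can also strip them for systems that do not know them.

```python
import html

# Hypothetical placeholders within the U+F3C0..U+F3FF range; not agreed by anyone.
ITALICS = "\uF3C0"
NOT_ITALICS = "\uF3C1"


def to_html(text: str) -> str:
    """Render ITALICS / NOT ITALICS codes as <i> and </i> tags."""
    out = []
    italic = False
    for ch in text:
        if ch == ITALICS:
            if not italic:            # ignore a repeated ITALICS code
                out.append("<i>")
                italic = True
        elif ch == NOT_ITALICS:
            if italic:                # ignore NOT ITALICS when not in italics
                out.append("</i>")
                italic = False
        else:
            out.append(html.escape(ch))
    if italic:                        # close italics left open at end of text
        out.append("</i>")
    return "".join(out)


def strip_codes(text: str) -> str:
    """Remove the two codes, leaving ordinary plain text."""
    return text.replace(ITALICS, "").replace(NOT_ITALICS, "")


if __name__ == "__main__":
    sample = f"This is {ITALICS}very{NOT_ITALICS} portable."
    print(to_html(sample))
    print(strip_codes(sample))
```

Because the codes act as state switches rather than as nested tags, the file remains a flat sequence of characters, which is what keeps it essentially plain text.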

William Overington

21 May 2002


This archive was generated by hypermail 2.1.2 : Tue May 21 2002 - 10:21:36 EDT