From: Dennis Heuer (dh@triple-media.com)
Date: Mon Apr 13 2009 - 23:57:41 CDT
On Mon, 13 Apr 2009 19:51:07 -0600
"Doug Ewell" <doug@ewellic.org> wrote:
> Try not to think in terms of "keys" when talking about the things
> encoded in Unicode or any other coded character set. The Escape key is
> not always associated with the character U+001B. Its use is dependent
> on operating system and application. In most Windows apps, it does not
> generate a character at all.
try to understand the term key ;) we're not talking about symbols or
glyphs here. though these are the visible descriptors on a keyboard, the
board is called key-board and not symbol-board or button-board. if you
think more deeply about this fact, you'll see that i used the correct
term. keys are things that are held for, or provide access to, something
else, which should be specified more or less precisely. time might show
that users don't care and misuse a key for something else (not always in
the worst sense). however, a key stays a key. in other words: unicode
character codes are keys to an intended meaning, visually represented (if
possible) by the image that (hopefully) best expresses this meaning.
this meaning may be reasonable or not, too global, too strict, and
sometimes conflicting too. however, this is why talking about the
intent behind certain defined keys is valuable.
> Providing a mechanism to switch from one character set to another is the
> job of ISO 2022, not of any individual character set. Try reading it
> again. Give yourself a bit of time.
read on to understand my critique...
On Mon, 13 Apr 2009 22:12:27 -0400
David Starner <prosfilaes@gmail.com> wrote:
> Gag me with a spoon.
i really think that this best describes your answers. they are
traditional, loose, and they are also ignorant. i don't know how
best to answer. for example:
> Yes, because that's the tool that does the job.
haehh? the escape key wasn't invented for this but appropriated for it.
it is used in extremely different ways, and there are many coding
conventions, which already explains why the standards you named never
became widespread. they only behave well in environments that agree on
them. escape sequences are messy, and one can't tell from a text file
which convention is behind an escape sequence. hence, was it ISO-2022 or
a different one? none is dominant from the point of view of a simple
text editor. a commandline editor would rather interpret all codes as
ANSI codes. yes, in some cases one can say that it can't be this or that
convention. but does that tell one which it was? this is, for example,
why unicode actually has included separate characters for ambiguous
cases, like 'newline', to name only one.
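to make the ambiguity concrete, here is a minimal sketch (python, using
only the standard codecs; the sample text is just an assumption chosen for
illustration). it shows the same bytes being read under two conventions,
and both readings "succeed":

```python
# a minimal sketch: the same bytes read under two different conventions.
text = "日本語"

# ISO-2022-JP switches character sets with escape sequences inside the data.
data = text.encode("iso2022_jp")

# a reader that knows the convention recovers the original text.
as_iso2022 = data.decode("iso2022_jp")

# every byte is below 0x80, so a reader assuming plain ASCII also "succeeds",
# but it is left with raw ESC characters and meaningless printable filler.
as_ascii = data.decode("ascii")

print(repr(data))
print(repr(as_iso2022))
print(repr(as_ascii))
```

nothing in the byte stream itself says which of the two readings was
intended; that knowledge has to come from an agreement outside the file.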
one thing i must state very clearly: ISO-2022 is not made for simple
text; it is made for environments where an agreement can be reached.
gnomepad has no idea of that. what is the use of ISO-2022 to gnomepad
if gnomepad can't detect it reliably (always safely) from a file it
reads? i think that you stick to conveniences (quick answers) without
seeing that the combination of unicode and ISO-2022 may work, but not in
all cases. rethink the 'if' in the paragraph from the ISO-10646
standard posted earlier. there is a reason for this 'if'. this reason is
crucial to the failure of ISO-2022 in realms other than the one
initially intended.
and escape sequences do not necessarily show up as junk. the shell
has configuration options for this, for example. on some systems, escape
sequences can even make the computer unresponsive (terminals, for
example). they can be quite dangerous because different systems
interpret them differently. there is no single approach to it. calling
this a "very lightweight form of error message for lightweight program"
conjures up horror visions of how your system works.
just read this text from the sysctl manpage:
-N Use this option to only print the names. It may be useful with
shells that have programmable completion.
this is not about escape sequences, but it shows how things can blow up
just by printing content otherwise seen as harmless. so what about
accidentally sending escape sequences to an esc-sensitive shell with
cat or sed? the point is that the esc semantics *are* problematic and,
thus, many tools provide options to filter them out or to convert them
into \x.., for example.
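as a rough sketch of what such a conversion looks like (python; the
function name escape_controls and the choice to keep newline and tab are
my own assumptions, not taken from any particular tool):

```python
def escape_controls(text: str, keep: str = "\n\t") -> str:
    """Replace control characters with a visible \\xNN form.

    Hypothetical helper for illustration only: it covers the C0 range,
    DEL and the C1 range, and leaves the characters in `keep` untouched.
    """
    out = []
    for ch in text:
        code = ord(ch)
        is_control = code < 0x20 or code == 0x7F or 0x80 <= code <= 0x9F
        if is_control and ch not in keep:
            out.append(f"\\x{code:02x}")  # e.g. ESC becomes the 4 chars \x1b
        else:
            out.append(ch)
    return "".join(out)


# the ESC that would otherwise reach the terminal is made harmless and visible
print(escape_controls("ok\x1b[31mred\n"), end="")
```

the output is safe to send to a terminal because no ESC byte survives; the
price is that the text no longer round-trips without a matching decoder.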
if i were writing to the POSIX standards body, OK! however, ISO-10646
was invented to solve problems instead of carrying them forward with
"lightweight" propositions.
to explain my aggressive style: i'm thoroughly shocked by your answers! do
you think that you have treated this subject sensibly?
> What you haven't shown is that it's better to handle
> the changing character sets using Unicode-level characters, instead of
> a higher level protocol.
yes, i did. what the heck does html have to do with plain text files?
there is no generally agreed-on text format except plain text. today, we
could talk about OpenDocument. but would you store your system config
in odt? would you store your programming scripts in odt? would you
store your emails in odt? ...
> Of course the solution to "many" is to add another one.
why? did you mean what i said? solving at the font level things that are
otherwise done at a meta-level is not 'adding another one'--it is a
different approach! consider that high-level standards are rather
'explaining' content. just setting a line to bold is not 'explaining'.
it is simple typesetting. that this is done via high-level formats is
rather a matter of having no alternative. however, high-level formats
are completely oversized for the task. claiming that the most common
uses of a font logically require complex and not agreed-upon
higher-level meta-formats is weird. it is systemic nonsense. it also
doesn't explain all the spacing and punctuation characters. it doesn't
explain the dingbats, etc. they are not commonly used in plain, running
text but rather in more typographically enhanced text like on cards or
in flyers.
> At the least, extended character sets are notoriously hard
> to input; people that traditionally used Latin-1 still often use --
> emdashes and ' and " quotes in Unicode texts, because the correct
> characters are hard to type. A bunch of new formatting characters are
> going to be a pain to input and horrible to edit in systems that don't
> have a "show formatting characters option".
> Most new formatting
> systems are designed to be easy to enter from standard keyboards,
hence, unicode is a flop? was unicode invented to fulfill the criteria
you describe?
> while your system needs a whole new UI for entering characters.
why? text-processing systems already have buttons for entering those
codes. only, at the moment they enter meta-data into their proprietary
text-file formats...
because everything needed can be configured via one toolbar and one
generalized dialog, of which there are many implementation examples (and
even most HTML input systems for web-sites have those toolbars), there
is not really a reason to bring the sky down on my arguments.
> If you really need italics and bold in quote-unquote plain text,
> there's an ISO standard of escape sequences for it. Again, it's been
> there for decades; if everyone cared to support it, it would work just
> fine. If there really was enough demand for this, people would have
> supported it or something like it.
read the above again and try to understand that ISO-2022 is not an
example of what people want.
> > this means that there are keys for most used sizes and steps
> > like +1pt, +2pt, +5pt, and +10pt.
>
> Gag me with a spoon. Yay, a formatting system that supports a limited
> number of font sizes, so for any application that currently supports
> half-decent typography has to continue carrying around formatting
> codes for font sizes, that will interact in weird and annoying ways
> with your Unicode-based formatting codes.
again, why? if you want +3pt, you can combine +1pt and +2pt. however,
steps are rather necessary for having more visible headers and such.
i'm still talking about plain text files. these don't support <H1> and
are not the base of typographic setting systems! i'm talking about daily
stuff being more readable and presentable. so, at least, some
'hints' like 'header_start' and 'header_stop' are quite usable. one can
put them beside the brackets ;))
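to illustrate the combining of steps, a small sketch (python). note that
the code points below are purely hypothetical stand-ins from the Private
Use Area; no such size-step characters are assigned in unicode, and the
name size_offset is only an assumption for this example:

```python
# hypothetical size-step characters, mapped to their step in points.
# these are Private Use Area code points, NOT real Unicode assignments.
SIZE_STEPS = {
    "\ue001": 1,   # stand-in for "+1pt"
    "\ue002": 2,   # stand-in for "+2pt"
    "\ue005": 5,   # stand-in for "+5pt"
    "\ue00a": 10,  # stand-in for "+10pt"
}


def size_offset(text: str) -> int:
    """Sum all size steps found in text; other characters count as 0."""
    return sum(SIZE_STEPS.get(ch, 0) for ch in text)


# "+1pt" followed by "+2pt" yields the +3pt that has no key of its own
print(size_offset("\ue001\ue002a header line"))
```

a renderer that doesn't know these characters would simply ignore or
display them; one that does would add the sum to the base size.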
> The common denominator for typography is pretty low.
not for graphical text editors, for email and other messaging systems,
for wikis, blogs and ... ah ... so many ... also, before they switch to
html, they would rather support new unicode codes.
i don't feel taken seriously on this list. it seems that you just fly
through and answer back some quick 'against', only having one image of a
possible target in mind. interestingly, the images you created in your
answers never fit the images i very verbosely tried to create. i'm
talking about plain text and you answer about typesetting systems or
general web standards. haehh?? was unicode invented to support mass-user,
heavy-weight, proprietary multimedia-thingies or was it invented to
help in daily text writing? i'm confused. is it a web-standard?
think carefully about whether you really want to discuss my proposal.
possibly we had better cut this thread. i have lost hope.
regards,
dennis