Re: "Missing character" glyph- example

From: Martin Kochanski (unicode@cardbox.net)
Date: Fri Aug 02 2002 - 04:19:25 EDT


Periphrasis is always possible, of course; but that doesn't mean that it is desirable.

1. Periphrasis is by definition longer. In a page where you want to present a lot of information and not have it squeezed out by meta-information, the first paragraph in my example could read "Seeing things like []? Click here". (I do agree with you that "click here" is more sensible than "download a font or...", but I just wanted to squeeze my example onto a single page instead of having to provide a target for the link.)

>If you have trouble displaying any of the characters in the text on this page,
2. "Having trouble displaying" implies that the reader knows what the stuff he is displaying ought to look like. If you show me a page of Arabic, then as long as it looks all sort of squiggly I have no real way of knowing whether it is right or wrong. [If you can read Arabic, then substitute a script that you don't know; unless you are Michael Everson, in which case there is no such thing]. If you show me a page of text that I *can* read, and tell me to look for "trouble", I'm also lost unless you tell me what sort of trouble I am meant to be looking for ("characters" don't mean much to a naive user).
If you want a rubric that asks the user to "click here" for other kinds of display problems, you could phrase it a little differently (e.g. "Seeing weird things like []? Click here for help"). I don't want to get into the detailed poetics of user interfaces: all I am seeking to establish is that being able to display "[]" can make for shorter and more direct messages.

>A. Avoids font-specific circularity in your attempt to explain...
3. "Font-specific circularity" is the **entire purpose** of this proposal. If you make an ostensive reference to something, then it helps if the reference looks the same as the thing that you are referring to.

>C. Doesn't depend on dubious assignments of a code point in
>Unicode for a confusing (non-)use.
I'm sorry, I don't understand what this could mean. But possibly it is not relevant to the rest of your argument?

4. Other people's posts have, I think, eliminated "U+0000" as a possibility, not least because it is defined (in Unicode) as not being an ordinary printable character at all. I am no expert, but it seems to this innocent observer that glyph numbers and Unicode code points inhabit different universes, with no necessary connection between them; if, in a particular font, glyph LXV happens to correspond to code point U+0041, that is a cheerful fact about the font but not to be relied upon in general.
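
The separation described in point 4 can be sketched as a toy model in Python. The two fonts below (FONT_A, FONT_B), their glyph numbers, and the helper glyph_for are invented for illustration; the only convention assumed is the TrueType/OpenType one, in which glyph index 0 is the font's missing glyph.

```python
# Toy model only: FONT_A and FONT_B are made-up fonts, not real font data.
NOTDEF = 0  # glyph index 0 is the "missing glyph" in TrueType/OpenType fonts

# Each font carries its own table from Unicode code point to glyph index.
FONT_A = {0x0041: 36, 0x0042: 37}
FONT_B = {0x0041: 3, 0x0627: 4}


def glyph_for(cmap, ch):
    """Return the glyph index for a character, falling back to NOTDEF."""
    return cmap.get(ord(ch), NOTDEF)


print(glyph_for(FONT_A, "A"), glyph_for(FONT_B, "A"))
print(glyph_for(FONT_A, "\u0627"))
```

In this sketch "A" is glyph 36 in one font and glyph 3 in the other, and any character the font does not map falls back to glyph 0, whatever its code point.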

5. I *should* reiterate (because some people seem not to have noticed... this is the trouble with reading Courier email) that all existing fonts *do* already display the proposed new character correctly, so that no changes will be required for them to implement it.
Why, in that case, make a proposal at all?
(i) To make sure that whatever code point is decided upon does not suddenly receive a glyph in a new version of Unicode.
(ii) To allow sophisticated systems that distinguish "unassigned Unicode character" from "Unicode character that I happen not to be able to display" to display the latter glyph.
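
The distinction in (ii) can be sketched in the same spirit. This Python fragment assumes that "unassigned" means general category Cn as reported by the standard unicodedata module (which reflects whatever Unicode version the interpreter ships with); TOY_CMAP and classify are hypothetical names, not part of any real system.

```python
import unicodedata

# Hypothetical font coverage: code point -> glyph index. Glyph 0 is reserved
# for the missing glyph, so it never appears as a value here.
TOY_CMAP = {0x0041: 36, 0x0042: 37}


def classify(ch, cmap):
    """Say why a character can or cannot be shown with this font."""
    if unicodedata.category(ch) == "Cn":
        return "unassigned"   # no character is assigned to this code point
    if ord(ch) not in cmap:
        return "no glyph"     # an assigned character that this font lacks
    return "displayable"
```

A system making this distinction could then reserve the proposed character's glyph for the "no glyph" case and use something else for "unassigned".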

At 12:34 01/08/02 -0700, Kenneth Whistler wrote:
>> As a clarification, here is a sample web page:
>>
>> http://www.cardbox.com/missing.htm
>>
>> The requirement is to be able to display the first paragraph of the
>> page in such a way that it makes sense in its reference to the text
>> on the rest of the page.
>>
>> The character after the word "this:" in the first paragraph cannot
>> be reliably represented by any existing Unicode character.
>>
>> Nevertheless, I believe it is legitimate to want to say what the
>> first paragraph says.
>
>Well, I would put it differently, if it were my web page.
>Rather than:
>
><quote>
>If any of the following text contains characters such as this: {blort}
>then please change to a different font, or download a more recent
>version of your current font.
></quote>
>
>I would suggest something more along the line of:
>
><quote>
>If you have trouble displaying any of the characters in
>the text on this page, please consult <a href=xxx.html>
>Troubleshooting Display Problems</a>.
></quote>
>
>Then the troubleshooting page could provide a nice explanation
>of the problem, show several neatly formatted *graphics* of
>the kind of nondisplayable glyph issues (with alternate forms
>picked from various fonts) that a user might run into, and
>then give helpful links to actual font resources that would
>help, or in the case of specialized data, actually provide a
>usable font directly.
>
>Such an approach:
>
>A. Avoids font-specific circularity in your attempt to explain
>to a user what is going on when the display is broken.
>
>B. Provides much more useful information that will actually
>have a better chance of helping the user get by the problem.
>Also, since the problem(s) may not only be some nondisplayable
>glyphs, the approach is extensible for whatever display help
>is needed.
>
>C. Doesn't depend on dubious assignments of a code point in
>Unicode for a confusing (non-)use.
>
>But if you insist on having a code point to stick directly in
>a sentence like that above, I'd take the cue from James Kass:
>
>> The missing glyph is the first glyph in any font. This is mapped to
>> U+0000 and the system correctly substitutes the glyph mapped to
>> U+0000 any time a font being used lacks an outline for a called
>> character.
>
>Thus, you have a reasonably good chance that if you try to
>purposefully display the character U+0000, you will get the
>missing glyph for the font in use. (Unless the application is
>filtering out NULL characters.)
>
>--Ken
>



This archive was generated by hypermail 2.1.2 : Fri Aug 02 2002 - 02:43:35 EDT