From: Asmus Freytag (asmusf@ix.netcom.com)
Date: Tue Jan 23 2007 - 05:05:16 CST
On 1/23/2007 1:05 AM, Jukka K. Korpela wrote:
> On Mon, 22 Jan 2007, Doug Ewell wrote:
>
>> I always thought the convention of using a double hyphen to indicate
>> line-splitting hyphenation at a point where lexeme-joining
>> hyphenation would have occurred anyway was a simply brilliant idea,
>> one I wish were in more widespread use.
>
> It would indeed be useful to make such a distinction, at the character
> level, at the glyph level, or both. In text processing, it would be
> relevant to know whether a word (or other expression) actually
> contains a hyphen or there's just a hyphen at the end of a line to
> indicate continuation of the word on the next line.
>
> I'm biting my tongue to avoid saying that the soft hyphen character
> was, at least in some people's interpretations, meant to act as
> line-splitting hyphenation character but then turned into a
> discretionary hyphen.
If you add non-obvious features to a standard, you have to make sure
that information on how to use them is widely disseminated. Early 8-bit
standards had to be ordered as paper copies, which meant that what most
people referenced were re-creations of just the character layout part.
Early standards also tended to not contain a lot of information on how
characters were to be used, or explicitly allowed competing usage
conventions (for control codes).
Given how few characters are in each 8-bit standard, the number of
characters for which it is uncertain what they were meant to encode
(and this includes some printable characters as well) has proven
astonishingly large....
>
> Anyway, Unicode is about characters that are used, rather than
> characters that should be used. On the other hand, this is a chicken
> and egg problem these days. When most texts are written using
> computers and appear in digital form, thereby inevitably using encoded
> characters, there is little room for introducing new characters.
As long as we are clear that a double hyphen is not intended to be used
when the font style requires a doubled shape for the standard hyphen
(and we have now sorted that out, finally), there's no longer a reason
to prevent people from making an explicit distinction. Whether that kind
of distinction will ever become mainstream, I don't know, but for my
part, I'm now convinced that there are enough special needs for one of
these things that it's time to add it.
I'm firmly opposed to the idea that the main purpose here is to encode a
specific semantic. That would have to be done by the rules of the
orthography in which this character is used. In other words,
CONTINUATION HYPHEN would be inappropriate as a character name.
The character code should simply serve two purposes:
1) allow a distinction between the new character and standard hyphen
2) request a double stroke glyph
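
As an illustration of purpose 1, here is a minimal sketch of what a text
process could do under one possible orthographic convention (the one
Doug Ewell describes above): a standard hyphen at the end of a line
marks only the line break, while a double hyphen marks a break that fell
on a real hyphen. The function name and the private-use code point
U+E000 are hypothetical stand-ins, since no code point has been assigned;
the convention itself is an assumption of the example, not something the
character would mandate.

```python
# Hypothetical stand-in for a not-yet-encoded double hyphen character.
# U+E000 is a private-use code point chosen only for this sketch.
DOUBLE_HYPHEN_STANDIN = "\uE000"


def rejoin_lines(lines, double_hyphen=DOUBLE_HYPHEN_STANDIN):
    """Rejoin hard-wrapped lines into a single string.

    Assumed convention (one orthography's rule, not a property of the
    character): a line-final standard hyphen is dropped, a line-final
    double hyphen becomes one standard hyphen, and any other line is
    followed by a space.
    """
    pieces = []
    for line in lines:
        line = line.rstrip()
        if line.endswith(double_hyphen):
            # The break fell where the word has a real hyphen: keep one.
            pieces.append(line[: -len(double_hyphen)] + "-")
        elif line.endswith("-"):
            # The hyphen only signals continuation on the next line.
            pieces.append(line[:-1])
        else:
            pieces.append(line + " ")
    return "".join(pieces).rstrip()
```

Without the distinct character, the second and third branches could not
be told apart, which is the ambiguity Jukka points out for text
processing.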
We should also establish (perhaps more clearly than we have in the
past) that font designs that use a slanted form for the hyphen do not
need a separately encoded character for a short slanted dash, but use
the character code for hyphen; that font designs that use a slanted
form for the hyphen should use a slanted form for the double hyphen;
and finally, that some fonts, such as Fraktur, will use a slanted
double stroke form for the standard hyphen.
That is the appropriate set of glyph variations for standard
(non-decorative) fonts. Obviously, a Fraktur font can support neither
the double hyphen nor the oblique double hyphen *characters*, as they
would not be rendered with any distinction from the standard hyphen.
That's fine; supporting them is not a requirement.
Similarly, font styles that use slanted forms for any hyphen (double and
single) cannot be used for those notations that need the oblique double
hyphen (that's also fine).
However, I would expect a font that uses a standard hyphen glyph to
support a double hyphen with a standard double hyphen glyph, so that it
can be used most widely. (Apart from the fact that it will take a while
before a newly proposed character can be encoded and then much later
supported....)
This scheme is not so different from support for specialized
distinctions needed for IPA or mathematical use. Fonts intended for such
uses must accept a restriction of the glyph variation for certain common
characters, in order to retain a visual distinction with another, more
specialized character. Fonts for ordinary users are not so constrained
and can be more fanciful or varied in their glyph choices.
A./