Re: Ellipsis

From: Jukka K. Korpela (jkorpela@cs.tut.fi)
Date: Tue Jan 17 2006 - 17:28:07 CST

    On Tue, 17 Jan 2006, Mark E. Shoulson wrote:

    > Is the HORIZONTAL ELLIPSIS character, U+2026, to be preferred over using
    > three periods (possibly spaced with non-breaking spaces)? Or is it only
    > there for backward compatibility?

    We recently had a discussion about compatibility characters in general,
    and it still seems to me that the position of the Unicode Standard is
    that compatibility characters should generally be avoided, except for
    legacy data and other special occasions. We may disagree on what
    constitutes a special occasion, but it can hardly mean "whenever you
    like". And surely the general principle can be explicitly overridden in
    the standard for some characters. As far as I can see, there is no such
    exception for the HORIZONTAL ELLIPSIS.

    On the other hand, the special role of the HORIZONTAL ELLIPSIS is
    difficult to replace by other means. It seems unnecessarily clumsy to add
    specific character spacing in word processing whenever you use "...",
    and it is also clumsy to do similar things using markup and a style sheet.
    (Using <span class="ellipsis">...</span> instead of &#8230; or the
    HORIZONTAL ELLIPSIS as such doesn't sound very natural.)
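
    As a small illustration of the point about markup, the sketch below
    checks that the numeric reference &#8230; and the named reference
    &hellip; both denote U+2026. It assumes Python and its standard html
    module; the variable names are made up for this example.

```python
import html

# Decode the numeric and the named character reference.
from_numeric = html.unescape("&#8230;")
from_named = html.unescape("&hellip;")

# 8230 decimal is 2026 hexadecimal, i.e. HORIZONTAL ELLIPSIS.
print(hex(ord(from_numeric)))
print(from_numeric == from_named == "\u2026")
```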

    Separating the periods with no-break spaces or normal spaces would not
    be a good idea, since it would mostly create far too much spacing. And
    attempts to reduce that spacing would be pointless, since you could just
    as well use "..." and increased spacing.

    > Obviously, "what is preferable" depends on what the situation is, but even so
    > there can be a sensible answer, e.g. "f + i" (or "f + ZWJ + i") is preferable
    > to U+FB01, LATIN SMALL LIGATURE FI, right?

    I think most of us agree with that, except for special situations, for
    varying meanings of "special situations". On the practical side, if you
    wish to use the fi ligature for typographic reasons, using U+FB01 is often
    the only feasible way.

    When considering the use of the fi ligature, for example, we should
    weigh the various pros and cons. If I were preparing a document to be
    printed and distributed and wanted f and i to be ligated, I could
    rather safely use U+FB01 after checking that the software used handles
    it. But if the document were distributed in digital format, problems
    would arise. For example, if the document contained the string
    http://www.suomi.fi written using U+FB01 and someone copied and pasted
    it into a web browser, the address would not work (unless the copy &
    paste implementation did something relatively elaborate, such as
    normalization, which would be somewhat unexpected).
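
    The copy-and-paste problem can be sketched as follows, assuming Python
    and its standard unicodedata module. The variable names are
    hypothetical, and NFKC is only one normalization form a program might
    apply; nothing here claims that any particular browser does so.

```python
import unicodedata

# The address written with plain letters, and with U+FB01 in place of "fi".
plain = "http://www.suomi.fi"
ligated = "http://www.suomi.\uFB01"

# The two strings differ, so the pasted address would not match.
print(plain == ligated)

# Compatibility normalization replaces the ligature with "f" + "i".
restored = unicodedata.normalize("NFKC", ligated)
print(restored == plain)
```

    Canonical normalization (NFC or NFD) would leave U+FB01 untouched,
    since its decomposition is a compatibility one.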

    > Is HORIZONTAL ELLIPSIS of this class?

    It's in the class of compatibility decomposable characters, but that class
    contains many different types of characters. If you ask me, HORIZONTAL
    ELLIPSIS should have been defined as an independent non-compatibility
    character. It's too late to change that, but it might be given a different
    status without changing the formal definitions.
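
    The formal status mentioned above can be inspected with a short
    sketch, again assuming Python's standard unicodedata module (the
    variable names are illustrative). It shows the compatibility
    decomposition recorded for U+2026 and what the normalization forms do
    with it.

```python
import unicodedata

ELLIPSIS = "\u2026"

# Character name and decomposition mapping from the Unicode database.
name = unicodedata.name(ELLIPSIS)
mapping = unicodedata.decomposition(ELLIPSIS)
print(name)
print(mapping)

# Canonical normalization keeps the character;
# compatibility normalization replaces it with three FULL STOPs.
nfc = unicodedata.normalize("NFC", ELLIPSIS)
nfkc = unicodedata.normalize("NFKC", ELLIPSIS)
print(nfc == ELLIPSIS, nfkc == "...")
```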

    Interestingly, the Chicago Manual of Style describes the use of ellipsis
    points (without identifying characters by Unicode numbers) so that English
    and many other languages use "spaced" points whereas some languages use
    "unspaced" points. In Unicode terms, this seems to be a difference between
    HORIZONTAL ELLIPSIS and a sequence of three copies of FULL STOP, though of
    course it could also be described as typographical only, to be handled
    above the character level.

    -- 
    Jukka "Yucca" Korpela, http://www.cs.tut.fi/~jkorpela/
    

