Re: Fixed Width Spaces

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Sun Apr 04 2004 - 10:55:42 EDT


    ----- Original Message -----
    From: "Ernest Cline" <ernestcline@mindspring.com>
    To: "Philippe Verdy" <verdy_p@wanadoo.fr>
    Cc: "Unicode Mailing List" <unicode@unicode.org>
    Sent: Sunday, April 04, 2004 4:30 AM
    Subject: Re: Fixed Width Spaces

    >
    >
    >
    > > [Original Message]
    > > From: Philippe Verdy <verdy_p@wanadoo.fr>
    > >
    > > > There is at least one instance where NBSP had best be treated
    > > > as a fixed width space, when it is used as thousands separator as in
    > > > 100 000. Unicode recognizes it for this use by assigning NBSP the
    > > > Bidi Class of CS. I doubt if anyone is going to seriously argue that the
    > > > space between 100 and 000 should be expanded upon justification.
    > > > Of course, that could be taken care of by adding NBSP to Boundary
    > > > class MidNum in the Text Boundaries document (UAX#29) without
    > > > affecting its nature when used elsewhere.
    > >
    > > Isn't that the role of the FIGURE SPACE, or better, of the THIN SPACE ?
    >
    > FIGURE SPACE's main function is as a placeholder, so that lining up
    > numeric data can be done easily in proportional plain text.
    >
    > PUNCTUATION SPACE can serve the same function for commas
    > and decimals that are present in some of the figures but not in all,
    > but it might also be appropriate for the job of thousands separator in
    > general.
    >
    > THIN SPACE also might be appropriate for the job of thousands
    > separator.
    >
    > However, as far as the Bidirectional Algorithm is concerned,
    > NBSP is the one and only space that it recognizes as linking
    > adjacent groups of digits into a single number.
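
    The Bidi Class point can be checked against whatever Unicode data is at
    hand. A minimal sketch, assuming Python's standard unicodedata module; the
    helper name bidi_classes is made up for illustration, and the result depends
    on the Unicode version bundled with the interpreter:

```python
import unicodedata

# Space characters discussed in this thread.
SPACES = {
    "SPACE": "\u0020",
    "NO-BREAK SPACE": "\u00a0",
    "FIGURE SPACE": "\u2007",
    "PUNCTUATION SPACE": "\u2008",
    "THIN SPACE": "\u2009",
}


def bidi_classes(chars=SPACES):
    """Map each character name to its Bidi Class as seen by this Python."""
    return {name: unicodedata.bidirectional(ch) for name, ch in chars.items()}


if __name__ == "__main__":
    for name, cls in bidi_classes().items():
        print(f"{name}: {cls}")
```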

    Yes, but NBSP cannot be used in most books or in some legal accounting
    documents, because its minimum width is large enough for a digit to be
    inserted. In France, for some legal documents, digits may be grouped with a
    space, but that space must be too thin for an additional digit to be inserted
    in the middle. This matters for bank checks and money orders, for example, and
    this thin space must also be non-breaking.

    The FIGURE SPACE was defined to be incompressible and normally unbreakable from
    the surrounding digits. It is a good candidate, except that it is exactly wide
    enough for a digit to be inserted after the final document is printed, so its
    width makes it inappropriate in some legal publications.

    So THIN SPACE is generally used (possibly in association with ZWJ), even though
    Unicode has more recently introduced the NARROW NO-BREAK SPACE (NNBSP), which
    works better than THIN SPACE and has the necessary properties: it should not be
    expanded independently of the digits during line justification, but it can be
    contracted if needed to make a number fit within a line.

    Many legacy i18n libraries use NBSP for digit grouping, and some even use SPACE
    (with lots of problems due to unexpected line breaks). NNBSP is certainly
    the best candidate for now to replace the NBSP, if it is supported (the THIN
    SPACE is supported in almost all good layout and composition engines used
    by publishers and the printed press).
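
    As an illustration of digit grouping with NNBSP instead of NBSP, here is a
    minimal sketch in Python. The function name group_digits and its default
    separator are my own assumptions, not the API of any existing i18n library:

```python
NNBSP = "\u202f"  # NARROW NO-BREAK SPACE


def group_digits(n: int, sep: str = NNBSP) -> str:
    """Format an integer with `sep` between groups of three digits."""
    # format(n, ",") groups with commas; swap them for the chosen separator.
    return format(n, ",").replace(",", sep)


if __name__ == "__main__":
    # repr() makes the otherwise invisible separator visible.
    print(repr(group_digits(100000)))
```

    Passing sep="\u00a0" reproduces the legacy NBSP behaviour, so the separator
    can be chosen according to what the rendering engine supports.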

    For plain-text documents, NBSP is almost always used; converting NBSP to
    narrower spaces is part of the typesetting job, generally performed with
    automated tools and converters used by publishers. Most of the time we do not
    even need THIN SPACE, NNBSP, etc. in plain text, simply because SPACE and NBSP
    are enough to carry the semantics of the text and the possible line breaks.
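
    The kind of conversion step described above could look roughly like the
    following Python sketch. It is not taken from any real publishing tool; the
    name narrow_digit_spaces and the rule "only an NBSP between two ASCII digits"
    are assumptions made for the example:

```python
import re

NBSP = "\u00a0"   # NO-BREAK SPACE, as found in the plain-text source
NNBSP = "\u202f"  # NARROW NO-BREAK SPACE, wanted in the typeset output

# Match an NBSP only when it sits directly between two ASCII digits.
_DIGIT_NBSP = re.compile("(?<=[0-9])" + NBSP + "(?=[0-9])")


def narrow_digit_spaces(text: str) -> str:
    """Replace NBSP used as a digit-group separator with NNBSP."""
    return _DIGIT_NBSP.sub(NNBSP, text)


if __name__ == "__main__":
    print(repr(narrow_digit_spaces("1\u00a0000\u00a0000 F")))
```

    An NBSP used elsewhere (for example between a word and a number) is left
    untouched, so the plain-text semantics are preserved.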

    So all these extra spaces are, in my opinion, rich-text options for
    typesetting, encoded within the plain text just because that is simpler than
    inserting verbose meta-tags. But I do know that ALMOST ALL newspapers
    REQUIRE their typists to enter thin spaces appropriately, i.e. as digit
    grouping separators (thousands, phone numbers, prices, various legal or bank
    identifiers with fixed formats) or as extensions of basic punctuation.


