From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Sun Apr 04 2004 - 10:55:42 EDT
----- Original Message -----
From: "Ernest Cline" <ernestcline@mindspring.com>
To: "Philippe Verdy" <verdy_p@wanadoo.fr>
Cc: "Unicode Mailing List" <unicode@unicode.org>
Sent: Sunday, April 04, 2004 4:30 AM
Subject: Re: Fixed Width Spaces
>
> > [Original Message]
> > From: Philippe Verdy <verdy_p@wanadoo.fr>
> >
> > > There is at least one instance where NBSP had best be treated
> > > as a fixed width space, when it is used as thousands separator as in
> > > 100 000. Unicode recognizes it for this use by assigning NBSP the
> > > Bidi Class of CS. I doubt if anyone is going to seriously argue that the
> > > space between 100 and 000 should be expanded upon justification.
> > > Of course, that could be taken care of by adding NBSP to Boundary
> > > class MidNum in the Text Boundaries document (UAX#29) without
> > > affecting its nature when used elsewhere.
> >
> > Isn't that the role of the FIGURE SPACE, or better, of the THIN SPACE ?
>
> FIGURE SPACE's main function is as a placeholder, so that numeric data
> can be lined up easily in proportional plain text.
>
> PUNCTUATION SPACE can serve the same function for commas
> and decimal points that are present in some of the figures but not in all,
> but it might also be appropriate for the job of thousands separator in
> general.
>
> THIN SPACE also might be appropriate for the job of thousands
> separator.
>
> However, as far as the Bidirectional Algorithm is concerned,
> NBSP is the one and only space that it recognizes as linking
> adjacent groups of digits into a single number.
Yes, but NBSP cannot be used in most books or in some legal accounting
documents, because its minimum width is large enough for a digit to be inserted.
In France, for some legal documents, digits may be grouped with a space, but
that space must be thin enough that no additional digit can be inserted in the
middle. This matters for bank checks and money orders, for example, and this
thin space must also be non-breaking.
The FIGURE SPACE was defined to be incompressible and normally unbreakable from
the surrounding digits. It would be a good candidate, except that it is exactly
as wide as a digit, so a digit can be inserted after the final document has been
printed; its width therefore makes it inappropriate in some legal publications.
So THIN SPACE is generally used (possibly in association with ZWJ), even though
Unicode has more recently introduced the NARROW NO-BREAK SPACE (NNBSP), which
works better than THIN SPACE and has the necessary properties: it should not be
expanded independently of the digits during line justification, but it can be
contracted if needed to make a number fit within a line.
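
As a rough illustration of how these spaces differ in the character database,
here is a small Python sketch using the standard unicodedata module. The helper
name space_properties and the SPACES table are only illustrative, and the values
reported depend on the Unicode version shipped with the interpreter. Note that
the justification behaviour discussed above (expansion and contraction) is a
matter for the layout engine and is not recorded in these properties.

```python
import unicodedata

# Spaces discussed in this thread (names abbreviated for display).
SPACES = {
    "NBSP": "\u00a0",
    "FIGURE SPACE": "\u2007",
    "PUNCTUATION SPACE": "\u2008",
    "THIN SPACE": "\u2009",
    "NNBSP": "\u202f",
}

def space_properties(ch):
    """Return (bidi class, has <noBreak> decomposition) for one character."""
    bidi = unicodedata.bidirectional(ch)
    no_break = unicodedata.decomposition(ch).startswith("<noBreak>")
    return bidi, no_break

for name, ch in SPACES.items():
    print(name, space_properties(ch))
```

With a recent Unicode database, THIN SPACE is reported without a <noBreak>
decomposition, which is why it needs extra help to stay attached to the digits.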
Many legacy i18n libraries use NBSP for digit grouping, and some even use SPACE
(with many problems caused by unexpected line breaks). NNBSP is certainly the
best candidate for now to replace NBSP, where it is supported; THIN SPACE is
supported in almost all good layout and composition engines used by publishers
and the printed press.
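
A minimal sketch of digit grouping with NNBSP instead of NBSP, in Python. The
function name group_digits is hypothetical and not taken from any i18n library;
it simply reuses the built-in "," grouping of the format mini-language and
swaps the separator, so it only handles groups of three digits in integers.

```python
NNBSP = "\u202f"  # NARROW NO-BREAK SPACE

def group_digits(value, sep=NNBSP):
    """Format an integer with `sep` between groups of three digits."""
    # The "," format option inserts commas every three digits;
    # replace them with the requested separator.
    return f"{value:,}".replace(",", sep)

print(group_digits(100000))            # grouped with NNBSP
print(group_digits(100000, "\u00a0"))  # legacy behaviour, grouped with NBSP
```

Passing the separator as a parameter makes it easy to fall back to NBSP where
NNBSP is not supported by the fonts or the rendering engine.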
For plain-text documents, NBSP is almost always used; converting NBSP to
narrower spaces is part of the typesetting job and is generally performed by
automated tools and converters used by publishers. Most of the time we do not
even need THIN SPACE, NNBSP and so on in plain text, simply because SPACE and
NBSP are enough to carry the semantics of the text and the possible line breaks.
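
The kind of conversion such a tool performs could look like the following
Python sketch. It is only a heuristic under a stated assumption: any NBSP
standing directly between two digits is taken to be a grouping separator and is
narrowed to NNBSP, and every other NBSP is left alone. The function name
narrow_grouping_spaces is hypothetical; real publishing converters apply many
more rules.

```python
import re

NNBSP = "\u202f"  # NARROW NO-BREAK SPACE

# An NBSP (U+00A0) with a digit immediately before and after it.
# Lookarounds keep the digits out of the match, so consecutive
# groups such as 1 234 567 are all handled.
_NBSP_BETWEEN_DIGITS = re.compile(r"(?<=\d)\u00a0(?=\d)")

def narrow_grouping_spaces(text):
    """Replace NBSP used between digits with NNBSP."""
    return _NBSP_BETWEEN_DIGITS.sub(NNBSP, text)

print(narrow_grouping_spaces("Total: 1\u00a0234\u00a0567 F"))
```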
So all these extra spaces are, in my opinion, rich-text options for
typesetting, encoded as characters only because that is simpler than inserting
verbose meta-tags in the plain text. But I do know that ALMOST ALL newspapers
REQUIRE their typists to enter thin spaces appropriately, i.e. as digit
grouping separators (thousands, phone numbers, prices, various legal or bank
identifiers with fixed formats) or as extensions of basic punctuation.