RE: No Invisible Character - NBSP at the start of a word

From: Asmus Freytag (asmusf@ix.netcom.com)
Date: Tue Dec 07 2004 - 05:43:32 CST

Next message: Philippe Verdy: "Re: Nicest UTF.. UTF-9, UTF-36, UTF-80, UTF-64, ..."

Previous message: Asmus Freytag: "Re: [hebrew] Re: proposals I wrote (and also, didn't write)"
In reply to: Jony Rosenne: "RE: No Invisible Character - NBSP at the start of a word"

At 11:52 PM 12/6/2004, Jony Rosenne wrote:
>In chapter 8, regarding Hebrew, the standard says:
>
>Positioning. Marks may combine with vowels and other points, and there are
>complex typographic rules for positioning these combinations.
>
>I understand that this sentence should be regarded as being normative.

The aim of Unicode in making normative statements about layout is to allow
a user (an author) to confidently select the correct character for the
intended purpose, place it at the correct location, and (generally) rely on
conforming implementations to render it in a way that's compatible with
the intent.

As long as that is the case, the actual rendering may be more or less
'pretty', according to how sophisticated the layout engine is. After all,
between a typewriterish rendition of a script and full-featured print there
can be a wide gulf.

Unicode was not intended to replace all typographical rules and customs.
But where an author may be in doubt whether a certain character should be
placed before or after another character for a given representation,
that's the situation in which Unicode tends to clarify.
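A minimal sketch of one such ordering question, assuming Python's standard
unicodedata module. The choice of letter and points, and the helper name
canonical, are illustrative assumptions and are not taken from the thread.
It shows that two Hebrew points with different combining classes may follow
the base letter in either order and still be canonically equivalent:

```python
import unicodedata

BET = "\u05D1"     # HEBREW LETTER BET (base character)
PATAH = "\u05B7"   # HEBREW POINT PATAH (combining mark)
DAGESH = "\u05BC"  # HEBREW POINT DAGESH OR MAPIQ (combining mark)


def canonical(text):
    """Return the canonical decomposition (NFD) of text (hypothetical helper)."""
    return unicodedata.normalize("NFD", text)


# The same letter with its two marks typed in either order.
a = BET + DAGESH + PATAH
b = BET + PATAH + DAGESH

# Non-zero combining classes identify the marks; the base letter has class 0.
for ch in (BET, PATAH, DAGESH):
    print(unicodedata.name(ch), unicodedata.combining(ch))

# Marks with different non-zero combining classes are reordered by
# normalization, so both spellings compare equal after NFD.
print(canonical(a) == canonical(b))
```

Normalization only settles which sequences count as the same text. It says
nothing about how a layout engine positions the marks.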

This clarification is an ongoing process, since in many cases the need for
a clarification was not apparent from the start.

It is also true that some effects are best left to markup (or other
out-of-band information). This is certainly the case for something as complex
as fully built-up mathematical equations. However, for layout effects in
running text, the presumption should be that if they are ordinary and can
be described in predictable ways that harmonize with other, similar and
already existing situations, then out-of-band information should not be
required.

Sometimes, though this does not apply to the case at hand, the same effect can
appropriately be achieved both with markup and with special characters. I'm
thinking of the letterlike forms used in mathematics and phonetics, many of
which are identical in shape to what can be produced with markup. However,
for a variety of reasons it was felt that requiring markup in each instance
was too limiting.
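A small sketch of the letterlike case, assuming Python's standard unicodedata
module. The two sample characters and the helper name plain_equivalent are
illustrative choices. Compatibility normalization (NFKC) maps such a
character back to the plain letter that markup would otherwise have styled:

```python
import unicodedata

PLANCK = "\u210E"             # PLANCK CONSTANT (letterlike italic h)
MATH_ITALIC_X = "\U0001D465"  # MATHEMATICAL ITALIC SMALL X


def plain_equivalent(ch):
    """Return the NFKC form of ch (hypothetical helper for this sketch)."""
    return unicodedata.normalize("NFKC", ch)


for ch in (PLANCK, MATH_ITALIC_X):
    # Show the character's name next to the plain letter it folds to.
    print(unicodedata.name(ch), "->", plain_equivalent(ch))
```

The styled characters carry the distinction in plain text, where no markup
is available. Folding them with NFKC discards that distinction.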

Therefore, all of what you quote is eminently true, but the conclusion
doesn't follow. This is a case where UTC will have to come to (or sustain)
an explicit judgement, to settle the controversy.

A./

PS: What I have written about Unicode not primarily intending to be
prescriptive about layout, but rather being interested in establishing
an agreement between authors and programmers on which sequence of
characters represents which construct, relates to the discussion of
character identity and properties, most fully developed so far in TR#23.
There might be a reason to add some language clarifying these concepts
further for 5.0. Suggestions are welcome.

This archive was generated by hypermail 2.1.5 : Tue Dec 07 2004 - 16:21:39 CST