RE: Backslash n [OT] was Line Separator and Paragraph Separator

From: Kent Karlsson (kentk@cs.chalmers.se)
Date: Wed Oct 22 2003 - 06:19:14 CST


John Cowan wrote:
> XML 1.1 will treat CR, LF, NEL, <CR, LF>, <CR, NEL>, and LS as line
> terminators and report them all as LF. PS is left alone, because of
> the bare possibility that it is being used as quasi-markup.

I'm not sure why <CR, NEL> should be seen as a single line end.

And I think PS should be seen as a line end for XML too.
It, like LS, can be used to format the XML source, but should not
be interpreted as other than line end when parsing the XML source.
E.g., PS is not begin-end markup, which all other XML markup is;
nor do I know of a way of attaching "style" to a PS, as can be done
for <p></p> etc.
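
To make the two positions concrete, here is a minimal sketch (in Python, not taken from any XML parser) of the normalization John describes, with an optional switch for the PS treatment I am suggesting. The function name and the ps_is_line_end parameter are made up for illustration.

```python
import re

# Line terminators as listed in the quoted text: <CR, LF>, <CR, NEL>,
# lone CR, NEL, and LS. LF is already the target, so it is left as is.
# Two-character sequences come first so they collapse into one LF.
_EOL = re.compile("\r\n|\r\x85|\r|\x85|\u2028")

# Hypothetical variant that also treats PS (U+2029) as a line end.
_EOL_WITH_PS = re.compile("\r\n|\r\x85|\r|\x85|\u2028|\u2029")


def normalize_line_ends(text, ps_is_line_end=False):
    """Report every recognized line terminator as a single LF."""
    pattern = _EOL_WITH_PS if ps_is_line_end else _EOL
    return pattern.sub("\n", text)
```

With the default, PS passes through untouched; with ps_is_line_end=True it is reported as LF like the others.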

Following (ex-) UAX 14 fully, FF and VT should be seen as line
separators too, though they are unlikely in XML source files.
FF shouldn't be interpreted as generating a page break in the
"styled output" of an XML file, should it?

> I can't imagine why EOF should be called a line terminator, except
> in the sense that a "read a line" operation should obviously
> not attempt to read past EOF.

There have been Unix programs that (mistakenly, I'd say) *discarded*
the last (possibly partial) line of input, just because it had no LF at
its end... And LS is a separator, not a terminator, so EOF has to be a
line terminator.
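
A rough sketch of what I mean, in Python: a line reader that lets EOF end the last line, so a final partial line is kept rather than dropped. The function name is invented for this example, and a single terminator character is assumed for simplicity.

```python
def lines_until_eof(text, terminator="\n"):
    """Split text into lines, treating EOF as ending the last line."""
    lines = []
    start = 0
    while True:
        end = text.find(terminator, start)
        if end == -1:
            break
        lines.append(text[start:end])
        start = end + len(terminator)
    # Anything left over is a final line with no terminator; keep it
    # instead of discarding it, since EOF ends the line.
    if start < len(text):
        lines.append(text[start:])
    return lines
```

The same function covers LS used as a separator: the text after the last LS is the last line, ended only by EOF.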

> Calling it a line terminator means that every
> document is forced into the mold of being an integral number of lines
> long, regardless of the facts.

?? If you mean that concatenating files should not generate a line break
between the files, I agree.

                /kent k




