Re: (SC2WG2.609) New contribution N2705

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Thu Feb 19 2004 - 04:47:08 EST

Next message: Jon Hanna: "Re: inconsistent behaviour in windows"

Previous message: Michael Everson: "RE: Fwd: Re: (SC2WG2.609) New contribution N2705"
In reply to: Rick McGowan: "Re: (SC2WG2.609) New contribution N2705"
Next in thread: Kenneth Whistler: "Re: (SC2WG2.609) New contribution N2705"

I have the same feeling, notably because the documents in question are meant to
be typeset with fonts anyway, so that their notation is readable and consistent.

And most probably because it creates new, irrelevant character distinctions
within rich-text formats (SGML, HTML, ...), which must then manage these
characters alongside other occurrences coded with markup in order to produce
consistent output (here, of subscripts).

So suppose we encode some of the subscripts used by Indo-Europeanists, and not
others. What will a rendered document look like if some occurrences are coded
with the new separate characters, and others with markup and standard characters?

Suppose now that such text is to be generated as, or converted into, plain text.
The occurrences coded with markup will lose their subscript marking, while the
others will keep the new characters. It will then be difficult to insert a
consistent additional notation in the plain-text format that covers both
categories of subscripts. If this notation is not explained in the text itself,
the document becomes unusable. And even if a conversion system is adopted, it
will be hard to make it produce consistent results throughout the text for all
occurrences, both of separate subscript letters and of standard characters with
subscript markup.
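
To make the conversion problem concrete, here is a minimal sketch in Python. It
assumes the markup is HTML-style <sub>...</sub> and that the plain-text notation
is _{...}; both choices, and the function names, are mine for illustration only
and come from no proposal. It relies on the fact that the subscript characters
already encoded carry a <sub> compatibility decomposition in the Unicode
Character Database.

```python
import re
import unicodedata

# Hypothetical input convention: HTML-style subscript markup.
SUB_TAG = re.compile(r"<sub>(.*?)</sub>", re.IGNORECASE | re.DOTALL)


def encoded_subscript_base(ch):
    """Return the base character if ch is an encoded subscript, else None."""
    # Encoded subscripts decompose as "<sub> XXXX" in the character database.
    parts = unicodedata.decomposition(ch).split()
    if len(parts) == 2 and parts[0] == "<sub>":
        return chr(int(parts[1], 16))
    return None


def to_plain_text(fragment):
    """Rewrite both kinds of subscript into one assumed notation, _{...}."""
    # Subscripts expressed with markup.
    text = SUB_TAG.sub(lambda m: "_{" + m.group(1) + "}", fragment)
    # Subscripts expressed with separately encoded characters.
    out = []
    for ch in text:
        base = encoded_subscript_base(ch)
        out.append("_{" + base + "}" if base is not None else ch)
    return "".join(out)


print(to_plain_text("h<sub>2</sub>"))  # markup form
print(to_plain_text("h\u2082"))        # encoded form (U+2082)
```

Both calls give the same result only because the converter knows about both
conventions. A run of two encoded subscripts still comes out as two separate
_{...} groups, where a single <sub> element gives one, so even this small case
needs extra merging logic to stay consistent.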

I much prefer to keep the encoding conservative, and to accept such characters
only where they are needed for bijective (round-trip) mappings with important
legacy (non-Unicode) charsets, in which they were introduced in the early days
when rich-text formats were not easily interoperable and plain text was the only
solution.

Today we have lots of ways to create easily interoperable rich-text documents
(HTML, SGML, XML, DocBook, PDF, RTF, Word docs, ...) without needing such
pollution of Unicode.

Also, I doubt they were ever used in an interoperable legacy charset encoding.
Authors will tend to use one of the rich-text formats, where subscripts are easy
to produce from almost all existing characters.

----- Original Message -----
From: "Rick McGowan" <rick@unicode.org>
To: <unicode@unicode.org>
Sent: Thursday, February 19, 2004 2:45 AM
Subject: Re: (SC2WG2.609) New contribution N2705

> As long as we're on the topic, I have to weigh in on the conservative side
> in this argument, with Ken Whistler. Use of the existing subscript
> characters is generally bad practice. Adding more subscripts would be
> adding to the bad practice, and yield even more different ways to express
> the same thing (markup versus direct encoding).

This archive was generated by hypermail 2.1.5 : Thu Feb 19 2004 - 05:40:53 EST