Thanks for the response Markus.
Although a configurable option is a possible solution, we know that the
typical user (representing around 95-98% of users) never changes
defaults in a program, especially something as obscure as encoding
options. As you may know it is very popular to attack Microsoft for "UI
bloat", and this would no doubt add to that IMHO. But assuming we have
options, "which one do you default to?" is the $64000 question.
If you did have options, you could label the options you list as:
a) compatible with 1997 browsers and later
b) compatible with 1997 browsers and later
c) modify contents of document to be readable in all browsers.
Warning: some contents may appear different from your original document
Now, if your competitor offered this option:
d) Compatible with all browsers used _in your company_
you would have a hard time competing. (Note the emphasis on "in your
company" in the fourth option, meaning the customer's company. You could
even go on to say "most browsers on the Internet", but that got me in
trouble last time :-))
Erik raised an option of writing the actual byte value of the characters
in the file. It was my understanding that this can cause trouble on some
Unix servers that are not expecting byte values in the 0x80-0x9F range.
Can someone comment here?
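To make the range in question concrete, here is a minimal sketch in Python. It only illustrates the underlying encoding fact: ISO 8859-1 reserves 0x80-0x9F for C1 control codes, while CP1252 assigns printable characters such as the curly quotes to most of those positions. It does not show how any particular server behaves, and the function name c1_range_bytes is made up for this illustration.

```python
import unicodedata

def c1_range_bytes(data: bytes) -> list:
    """Return (offset, byte value) pairs for every byte in 0x80-0x9F."""
    return [(i, b) for i, b in enumerate(data) if 0x80 <= b <= 0x9F]

# "Hi" wrapped in CP1252 curly double quotes (bytes 0x93 and 0x94).
sample = b"\x93Hi\x94"

for offset, value in c1_range_bytes(sample):
    raw = bytes([value])
    as_cp1252 = raw.decode("cp1252")    # printable punctuation in CP1252
    as_latin1 = raw.decode("latin-1")   # C1 control character in ISO 8859-1
    print(offset, hex(value),
          unicodedata.name(as_cp1252),
          unicodedata.category(as_latin1))
```

The same byte is a left or right double quotation mark when read as CP1252, and a character of category Cc (control) when read as ISO 8859-1.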
Chris
-----Original Message-----
From: Unicode Discussion [SMTP:unicode@unicode.org]
Sent: Monday, July 07, 1997 6:47 PM
To: Multiple Recipients of
Subject: Re: Usage of CP1252 characters on www.msnbc.com
Chris Pratley wrote on 1997-07-08 00:29 UTC:
> Do you (or anyone else), have some suggestions on this issue? I think it
> is a hard problem to solve, and I was trying to get a sense of what
> solutions people were adopting.
In the Unix world, in such situations we make things configurable. I am
not familiar with the various Microsoft products that produce HTML files,
but I would expect quality software to allow me to switch between the
following alternatives when I convert a CP1252 based file into HTML
in some export filter:
- convert to Unicode NCR
- convert to Unicode UTF-8
- transcribe down to ISO 8859-1 (i.e. replace `smartquotes' by quotes)
and if it is really necessary for the existing installation base, then
I might also offer the following together with a warning in big red
blinking letters that it will break non-Windows systems:
- output directly in CP1252 bytes (not NCR!) and make sure that the
  IANA registry contains a reasonable MIME entry for CP1252 and that
  the HTTP server will announce CP1252 as the encoding
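The four alternatives listed above can be sketched roughly as follows in Python. This is only an illustration under stated assumptions: the function name export_variants and the SMART_TO_ASCII table are hypothetical, the table covers just a few CP1252 punctuation characters, and no actual export filter is claimed to work this way.

```python
# Hypothetical transcription table: a few CP1252 "smart" characters and
# plain replacements. A real filter would need a fuller mapping.
SMART_TO_ASCII = {
    "\u2018": "'", "\u2019": "'",    # single curly quotes
    "\u201c": '"', "\u201d": '"',    # double curly quotes
    "\u2013": "-", "\u2014": "--",   # en dash, em dash
    "\u2026": "...",                 # ellipsis
}

def export_variants(cp1252_bytes: bytes) -> dict:
    """Return the same text encoded under each of the four options."""
    # Bytes that CP1252 leaves undefined (e.g. 0x81) raise UnicodeDecodeError.
    text = cp1252_bytes.decode("cp1252")
    # Option 1: numeric character references for everything outside ASCII.
    ncr = text.encode("ascii", errors="xmlcharrefreplace")
    # Option 2: UTF-8.
    utf8 = text.encode("utf-8")
    # Option 3: transcribe down to ISO 8859-1; unmapped characters become "?".
    transcribed = "".join(SMART_TO_ASCII.get(ch, ch) for ch in text)
    latin1 = transcribed.encode("iso-8859-1", errors="replace")
    # Option 4: the original CP1252 bytes, unchanged.
    return {"ncr": ncr, "utf8": utf8, "latin1": latin1, "cp1252": cp1252_bytes}

variants = export_variants(b"\x93Hi\x94 it\x92s")
for name, data in variants.items():
    print(name, data)
```

Only the first three outputs avoid raw bytes in the 0x80-0x9F range for this sample; the fourth depends on the declared encoding being honored.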
I fully understand that Microsoft is not alone guilty and that
Netscape created the same mess even before. [But making new errors
is always slightly more honorable than repeating old ones ... ;-]
However, as you do, I also hope that Unicode support with at least the
CP1252 characters (better even MES or more) will in the next 12 months
become so widely implemented that backwards compatibility of the last
option will not any more be that important and that then the first two
options above become the widely accepted default choices.
BTW: I just got a reply from MSNBC on my letter:
Thank you for writing. The lack or change of punctuation you describe
in viewing our site with Netscape is due to the way our web editor sees
HTML code. Without getting technical, we have had to substitute
standard HTML code that represents the apostrophes and other punctuation
marks with a slightly different version of the code. This is definitely
a bug in our web editor, and we are working hard on a permanent fix.
Your patience in this matter is appreciated.
MSNBC Customer Support
Markus
--
Markus G. Kuhn, Computer Science grad student, Purdue
University, Indiana, USA -- email: kuhn@cs.purdue.edu