Re: Unicode Trouble with Netscape for Unix

From: Markus G. Kuhn (kuhn@cs.purdue.edu)
Date: Sun Jul 20 1997 - 16:48:42 EDT


Keld Jørn Simonsen wrote on 1997-07-20 13:19 UTC:
> I tried it, but it only gets me fonts corresponding roughly to
> cp1252.

I also just tried Netscape's Communicator 4.01b6,
and I was not able to display, for instance, all characters in

  http://www.kostis.net/charsets/cp437.html

which lists the old MS-DOS charset as NCRs and declares the character set
to be UTF-8 (although the HTML file contains only ASCII bytes).
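
To illustrate why such a file is legitimate, here is a minimal Python
sketch. The table fragment is a hypothetical one in the style of the demo
page, not copied from it. It shows that a document consisting only of
ASCII bytes is also valid UTF-8, while its NCRs still refer to
characters well outside Latin-1:

```python
import html

# Hypothetical fragment in the style of the demo page: every non-ASCII
# character is written as a numeric character reference (NCR).
fragment = "<td>&#199;</td><td>&#9617;</td><td>&#x2592;</td>"

# The markup itself contains only ASCII bytes ...
raw = fragment.encode("ascii")
# ... and because ASCII is a subset of UTF-8, the UTF-8 label is valid.
assert raw.decode("utf-8") == fragment

# Resolving the NCRs yields code points that a renderer must find
# glyphs for, independently of the byte encoding of the file.
text = html.unescape(fragment)
code_points = [f"U+{ord(c):04X}" for c in text if ord(c) > 0x7F]
print(code_points)
```

The NCR values always denote ISO 10646 code points, so the declared
charset only affects how the bytes are decoded, not which characters the
references stand for.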

The behaviour of Communicator 4.01b6 is pretty strange: If you load the
above demo file the first time, then the normal Latin-1 default font is
used. If you then play around with the encoding selection menu, then
Netscape (independently of what you actually select) seems to discover
that there are Unicode characters in this file and switches to some
built-in Unicode subset font described in the preferences menu
as "times (NSPseudoFonts)".

I have installed the following fonts on my system:

  -etl-fixed-medium-r-normal--14-140-72-72-c-70-iso10646-1
  -etl-fixed-medium-r-normal--16-160-72-72-c-80-iso10646-1
  -etl-fixed-medium-r-normal--24-240-72-72-c-120-iso10646-1

but Netscape does not allow me to use any of those for the Unicode
encoding, as it seems to expect the registry/encoding unicode-2-0
at the end of the X11 font names. This nonstandard notation with
two hyphens, however, breaks the X11 font naming format, in which the
hyphen is the field delimiter, i.e. I can't get my ETL font recognized
by renaming it to

  -etl-fixed-medium-r-normal--14-140-72-72-c-70-unicode-2-0

as this name is a syntax error.
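
The problem can be sketched in a few lines of Python. This is a
simplified check that only counts the 14 hyphen-delimited fields of an
X11 font name; the function name parse_xlfd and the field labels are my
own for illustration, and a real X server applies further rules:

```python
# Labels for the 14 fields of an X11 logical font name, in order.
XLFD_FIELDS = (
    "foundry", "family", "weight", "slant", "setwidth", "add_style",
    "pixel_size", "point_size", "resolution_x", "resolution_y",
    "spacing", "average_width", "charset_registry", "charset_encoding",
)

def parse_xlfd(name):
    """Split a font name into its 14 fields, or raise ValueError."""
    if not name.startswith("-"):
        raise ValueError("font name must start with a hyphen")
    parts = name[1:].split("-")  # each field is introduced by one hyphen
    if len(parts) != len(XLFD_FIELDS):
        raise ValueError(
            f"expected {len(XLFD_FIELDS)} fields, got {len(parts)}")
    return dict(zip(XLFD_FIELDS, parts))

good = parse_xlfd(
    "-etl-fixed-medium-r-normal--14-140-72-72-c-70-iso10646-1")
print(good["charset_registry"], good["charset_encoding"])

try:
    parse_xlfd(
        "-etl-fixed-medium-r-normal--14-140-72-72-c-70-unicode-2-0")
except ValueError as err:
    print("rejected:", err)
```

The iso10646-1 name splits into exactly 14 fields, whereas the
unicode-2-0 variant yields 15, because the extra hyphens are read as
additional field separators.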

I can of course choose the ETL font in the preferences menu as my font for
the User-Defined encoding, where it is offered as "Fixed (Etl, iso10646-1)"
in the menu, but this doesn't help, because as soon as I try to switch
the encoding to "User-Defined" in the View|Encoding menu, Netscape
seems to discover that the encoding is Unicode and uses those
NSPseudoFonts instead.

Summary: I was unable to view a correct HTML file with ISO 10646-1 NCRs
using my own ISO 10646-1 font. The whole behaviour of the font selection
system felt very counterintuitive. In particular, tying the selection of
the font to the encoding used by the HTML file does not make much sense
in a world where Unicode is understood to be the base character set.

It would be nice if someone from Netscape could download the ETL fonts
from

  ftp://sizif.mf.uni-lj.si/pub/i18n/fonts/bdf/etl-unicode.tar.gz

install them on a Unix machine (see my message from yesterday), and try
to display the file

  http://www.kostis.net/charsets/cp437.html

in the ETL fonts such that all characters are visible. I do not think this
is possible with communicator-v401b6-export.sparc-sun-solaris2.4,
or if it is, then it is very difficult to figure out how to do it.

Markus

P.S.: I was finally able to view the file in my ETL font by downloading
it into my home directory and removing the line

  <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=UTF-8">

that announced that this file came in UTF-8 encoding. Now I was
able to select User-Defined Encoding, but to my great disappointment,
all non-ASCII characters in the displayed table were replaced by
question marks. With Netscape 3.0 in the same configuration, at least
the Latin-1 characters were still displayed. Very strange ...

-- 
Markus G. Kuhn, Computer Science grad student, Purdue
University, Indiana, USA -- email: kuhn@cs.purdue.edu


