Re: HTML anchors in UTF-8

From: Roman Czyborra (czyborra@cs.tu-berlin.de)
Date: Thu Sep 10 1998 - 11:28:13 EDT


> I recently converted my HTML-based Web dictionaries into UTF-8 in
> order to have greater multilingual functionality, but I now have a
> problem. Many of the terms in my dictionaries are linked by HTTP
> anchors (#). These work fine on my local machine in NT4.0 and
> Win95, but cease to be operative after I upload the material on to
> the Unix server.

Please provide the URL of a sample page on the Unix server and the
name of an anchor that is not "operative". Also specify exactly which
browsers, versions, and platforms show the problem, and whether you
used a binary-transparent upload method.
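
One way to test whether an upload was binary-transparent is to compare
the local file with the bytes the server actually returns. The Python
sketch below is only an illustration: the helper name first_difference
and the sample data are hypothetical, and the "remote" bytes simulate
a transfer that strips the high bit, which is just one of several
possible kinds of damage.

```python
def first_difference(local: bytes, remote: bytes):
    """Return the offset of the first differing byte, or None if identical."""
    for offset, (a, b) in enumerate(zip(local, remote)):
        if a != b:
            return offset
    # One input is a prefix of the other: they differ where the shorter ends.
    if len(local) != len(remote):
        return min(len(local), len(remote))
    return None


# Hypothetical sample: a UTF-8 encoded anchor as stored locally ...
local = '<a name="辞書">'.encode("utf-8")
# ... and a simulated copy in which every byte lost its high bit in transit.
remote = bytes(b & 0x7F for b in local)

offset = first_difference(local, remote)
if offset is None:
    print("files are byte-identical")
else:
    print("first difference at byte offset", offset)
```

For a real check, the two byte strings could come from
open(path, "rb").read() and urllib.request.urlopen(url).read(), with
path and url being whatever applies to the page in question.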

> This has forced me to reinstall the old JIS versions of my
> dictionaries for the time being. Has anyone heard of this bug, or
> know a way around it?

Yes, I have been there, with Latin-1 accented characters in filenames
and mail addresses. It is wise to stick to US-ASCII in addresses and
URLs, as specified in RFC 1738 <ftp://ftp.isi.edu/in-notes/rfc1738.txt>.
It sounds like you fell victim to some internal charset conversion
applied to the HTML source.
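
To illustrate the US-ASCII advice: anchor names can be derived from
the dictionary terms so that both the name attribute and the #fragment
contain only ASCII, whatever charset conversion happens on the way.
The following Python sketch is one possible scheme, not an established
convention; the function name ascii_anchor and the _XXXXXX escape
format are assumptions made up for this example.

```python
def ascii_anchor(term: str) -> str:
    """Map a term to an anchor name made only of US-ASCII letters,
    digits and underscores (hypothetical scheme for illustration)."""
    parts = []
    for ch in term:
        if ch.isascii() and ch.isalnum():
            # Plain ASCII letters and digits are kept as they are.
            parts.append(ch)
        else:
            # Everything else becomes "_" plus the six-digit hex code point.
            parts.append(f"_{ord(ch):06X}")
    return "".join(parts)


term = "辞書"
name = ascii_anchor(term)
# The visible text stays UTF-8; only the anchor name and link are ASCII.
print(f'<a name="{name}">{term}</a>')
print(f'<a href="#{name}">{term}</a>')
```

Because the underscore itself is escaped and the escape has a fixed
width, two different terms cannot map to the same anchor name.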

There is a text about the various UTF formats and their problems at
http://czyborra.com/utf/ but it still needs some polishing.

Cheers, Roman http://czyborra.com/


