From: Martin Duerst (duerst@w3.org)
Date: Sun Dec 07 2003 - 11:27:15 EST
At 23:34 03/12/07 +0900, Jungshik Shin wrote:
>On Sun, 7 Dec 2003, Peter Jacobi wrote:
> > There is some mixup of lang and encoding tagging, which I didn't fully
> > understand.
>
> When lang is not explicitly specified, Mozilla resorts to 'inferring'
>'langGroup' ('script (group)' would have been a better term) from
>the page encoding. Because UTF-8 is script-neutral, it's important to
>specify 'lang' explicitly. Your page is in ISO-8859-1, so without
>lang specified, it's assumed to be in the 'x-western' langGroup (well,
>Latin script). Anyway, this behavior changed slightly recently in the
>Windows version (I forget when I committed that patch, before or after
>1.4) and each Unicode block is assigned a default 'script'. The way fonts
>are picked up by the Xft version of Mozilla makes it harder to do the
>equivalent on Linux.
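To make the quoted behaviour concrete, here is a rough sketch in Python
of inferring a langGroup from the page encoding and falling back to a
per-block default when the encoding is script-neutral. It is not
Mozilla's code: the function names and both tables are hypothetical,
the table entries are only an illustrative subset, and how the two
lookups interact in the real implementation is an assumption.

```python
# Hypothetical lookup tables for illustration only; not Mozilla's actual data.
ENCODING_TO_LANG_GROUP = {
    "iso-8859-1": "x-western",
    "shift_jis": "ja",
    "euc-kr": "ko",
    # UTF-8 is deliberately absent: it is script-neutral.
}

# (first code point, last code point, default group); small illustrative subset.
BLOCK_DEFAULTS = [
    (0x0000, 0x024F, "x-western"),
    (0x0400, 0x04FF, "x-cyrillic"),
    (0x3040, 0x30FF, "ja"),
    (0xAC00, 0xD7A3, "ko"),
]


def lang_group_for_page(lang, encoding):
    """Use an explicit lang if given, otherwise guess from the encoding."""
    if lang:
        # Assumption: an explicit lang maps directly to a group name.
        return lang
    return ENCODING_TO_LANG_GROUP.get(encoding.lower())


def lang_group_for_char(ch, page_group):
    """Fall back to the character's Unicode block when the page gives no hint."""
    if page_group is not None:
        return page_group
    cp = ord(ch)
    for first, last, group in BLOCK_DEFAULTS:
        if first <= cp <= last:
            return group
    return None
```

With these tables, an ISO-8859-1 page without lang resolves to
'x-western', while a UTF-8 page without lang yields no page-level group
and each character is classified by its block instead.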
I know that font selection/composition is a terribly difficult
business, and hard work, so improving things takes time.
Starting out with certain assumptions about fonts for certain
encodings is clearly very helpful for speed. But I think that
not (correctly) rendering a character that is obviously in
one script and not in another is a bad idea.
Years ago, I developed a very flexible system that was able to
start out with the user-selected font but would use another
font if the first font wasn't able to do the job. The basic
architecture was in many ways very simple, but it took quite
some time to get it right. Once I had this basic architecture,
all kinds of neat things became very easy. For details, see
the paper from the 7th Unicode Conference at:
http://www.ifi.unizh.ch/groups/mml/people/mduerst/papers/PS/FontComposition.ps.gz
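A minimal sketch of the basic idea, in Python: start with the
user-selected font and switch to another font only for characters the
first one cannot render. This is not the system described in the paper;
the Font class, its has_glyph method, and compose_runs are hypothetical
names, and glyph coverage is modelled as a plain set of code points.

```python
class Font:
    """Hypothetical font model: a name plus the code points it can render."""

    def __init__(self, name, covered):
        self.name = name
        self.covered = set(covered)

    def has_glyph(self, ch):
        return ord(ch) in self.covered


def compose_runs(text, preferred, fallbacks):
    """Split text into (font name, substring) runs.

    Each character goes to the preferred font if it has a glyph,
    otherwise to the first fallback that does. Characters that no
    font covers are reported under the name None.
    """
    runs = []
    for ch in text:
        font = next(
            (f for f in [preferred, *fallbacks] if f.has_glyph(ch)), None
        )
        name = font.name if font else None
        if runs and runs[-1][0] == name:
            # Same font as the previous character: extend the current run.
            runs[-1] = (name, runs[-1][1] + ch)
        else:
            runs.append((name, ch))
    return runs


latin = Font("Latin", range(0x20, 0x7F))
hangul = Font("Hangul", range(0xAC00, 0xD7A4))
runs = compose_runs("ab\ud55c\uae00c", latin, [hangul])
```

Choosing the font per character, rather than per page or per encoding,
is what lets a character that is obviously in one script still be
rendered when the starting font belongs to another.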
Regards, Martin.