Re: AltaVista search in various languages

From: Martin J. Duerst (mduerst@ifi.unizh.ch)
Date: Thu Jul 17 1997 - 07:52:39 EDT


On Thu, 17 Jul 1997, Leong Kok Yong wrote:

> Recently, I found out that AltaVista allows users to search in various
> languages. I was delighted and went straight to http://altavista.digital.com
>
> But when I type in double-byte Chinese GB or Big5 characters, it returns
>
> ------------------------------------
> No documents match the query.
> The term does not appear in the index.
> You might want to check the spelling.
>
> It is not currently possible to search for multi-byte characters within a
> Chinese document.
> -----------------------------------
>
> Then why did they bother to introduce the double-byte languages in the
> first place when it's not working?!?!

There may be several problems:

- The browser may support GB/Big-5/... for output (i.e. on the page), but
        not in input fields (because that is a lot more work). This
        was true of a lot of browsers; I don't know whether it has
        improved.

- There are some problems identifying the character encoding ("charset")
        of a request (URL query part).

- There may be some problems with the search engine (it has to be
        changed to correctly work with GB/Big-5/...).
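
To illustrate the second point, here is a minimal sketch in Python (not
anything AltaVista actually runs; the search term and variable names are
made up for the example). It shows that the same Chinese term produces
different bytes in a URL query depending on whether the browser used GB or
Big-5, and that the query itself carries no label saying which one it was.
A server that guesses wrong ends up looking for a different string.

```python
from urllib.parse import quote, unquote_to_bytes

# Hypothetical search term typed by the user.
term = "中文"

# The browser percent-encodes the raw bytes of whatever charset it uses.
gb_query = quote(term.encode("gb2312"))
big5_query = quote(term.encode("big5"))

# Same term, two different query strings, neither one labeled.
print("GB query:   ", gb_query)
print("Big-5 query:", big5_query)

# The server only sees bytes and has to guess the charset.
raw = unquote_to_bytes(gb_query)
right = raw.decode("gb2312")                  # correct guess
wrong = raw.decode("big5", errors="replace")  # wrong guess

print("decoded as GB:   ", right)
print("decoded as Big-5:", wrong)
```

With the correct guess the server recovers the original term; with the
wrong guess it gets other characters (or replacement marks), so the index
lookup fails even if the search engine itself handles double-byte text.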

Even just having the ability to search for ASCII in Chinese documents
can be useful, although it is of course of limited value.

Regards, Martin.


