From: Markus Scherer (markus.scherer@jtcsv.com)
Date: Tue Jan 13 2004 - 12:26:55 EST
Bert Kemner wrote:
> I have a problem with a JavaScript form on a German website.
> (http://informationservices.swets.de/web/show/id=47553)
My IE browser says that this page is in UTF-8. Therefore, you can expect to get the form data back
to the server in UTF-8 as well.
> The input of the form contains German characters.
> But the output (which is generated by submitting the form) does not
> display those characters (see example beneath). My first reaction to
> this problem is that Unicode somehow does not translate these German
> characters to Windows (Outlook).
As Doug said, "Unicode" does not translate text. What translates text here is most likely your web
server. Your server appears to think that the form data should be encoded according to something
like ISO-8859-1, but the data is actually encoded in UTF-8.
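To make the mismatch concrete, here is a minimal sketch of what happens when UTF-8 bytes are decoded as ISO-8859-1. The class name, method names, and sample word are hypothetical and chosen only for illustration; I have not seen the code on the actual server.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; names and sample text are illustrative only.
final class MojibakeDemo {

    // Encodes the text as UTF-8 (what a browser sends for a UTF-8 page),
    // then decodes those bytes as ISO-8859-1 (what a mismatched server does).
    static String misdecode(String text) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    // Decodes the same bytes with the charset that was actually used.
    static String decodeCorrectly(String text) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String input = "Gl\u00FCck"; // "Glueck" written with u-umlaut (U+00FC)
        // The umlaut is two bytes in UTF-8, so it turns into two characters.
        System.out.println(misdecode(input));
        // With the matching charset the text round-trips unchanged.
        System.out.println(decodeCorrectly(input));
    }
}
```

Each non-ASCII character turning into two or more odd characters in the output is the typical symptom of this kind of mismatch.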
The trick is to find out how to get your web server to assume the same encoding/charset for form
data returned from the browser as it uses to encode and send the original page to the browser. If
you use UTF-8 for the page encoding, then you need to use UTF-8 to go from the form data byte stream
to Java strings.
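As a rough sketch of that last step, assuming the form data arrives URL-encoded (percent-escaped), which is the usual default for HTML forms: the standard java.net.URLDecoder takes the charset name as its second argument. The class name, helper method, and sample value below are hypothetical; your server framework may do this decoding for you and expose a different setting for it.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class; not taken from any particular server framework.
final class FormDecodeDemo {

    // Decodes one percent-escaped form value using the given charset name.
    static String decodeField(String rawValue, String charsetName) {
        try {
            return URLDecoder.decode(rawValue, charsetName);
        } catch (UnsupportedEncodingException e) {
            // Rethrow unchecked so callers need no throws clause.
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        // A browser submitting u-umlaut from a UTF-8 page sends it as %C3%BC.
        String raw = "Gl%C3%BCck";
        System.out.println(decodeField(raw, "UTF-8"));      // matches the page encoding
        System.out.println(decodeField(raw, "ISO-8859-1")); // mismatched: garbled text
    }
}
```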
Hint: There are Java String constructors and other methods that turn a byte array into a String. If
those methods do not provide any way to specify the encoding/charset, then they fall back to a
default charset (typically the platform default), which on your server is probably something like
ISO-8859-1. Use instead a method that takes an encoding parameter. You may need to use a variation
of an InputStreamReader constructed for the "UTF8" encoding.
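A minimal sketch of the InputStreamReader approach, assuming the form data is available as a raw byte stream. The class name, the readAll helper, and the sample bytes are hypothetical. For a byte array, the String constructor that takes a charset name, new String(bytes, "UTF-8"), is the direct equivalent.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;

// Hypothetical demo class; names are illustrative only.
final class ExplicitCharsetDemo {

    // Reads the whole stream into a String, decoding with the named charset
    // instead of relying on whatever default the platform provides.
    static String readAll(InputStream in, String charsetName) {
        try (Reader reader = new InputStreamReader(in, charsetName)) {
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[1024];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
            return sb.toString();
        } catch (IOException e) {
            // Also covers UnsupportedEncodingException for a bad charset name.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // UTF-8 bytes for "Gl", u-umlaut (0xC3 0xBC), "ck".
        byte[] body = {0x47, 0x6C, (byte) 0xC3, (byte) 0xBC, 0x63, 0x6B};
        System.out.println(readAll(new ByteArrayInputStream(body), "UTF8"));
    }
}
```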
See also http://www.unicode.org/faq/unicode_web.html
Good luck,
markus