Sorry ... there is a mistake in my email.
I am setting content type on the response object not the request object.
response.setContentType("text/html; charset=UTF-8");
I keep on discovering more and more classes which have this insidious
assumption of 8859-1. If anyone knows how to override this default
encoding in a global fashion that would be great to know. Apparently
there is a command line parameter for WebSphere:
-Ddefault.client.encoding=UTF8
but this does not appear to be universal for all J2EE application
servers.
Thanks in advance,
Paul
Plumtree Software
paul.deuter@plumtree.com
-----Original Message-----
From: Paul Deuter
Sent: Monday, July 16, 2001 10:00 PM
To: Unicode List (E-mail)
Subject: How to create an all UTF-8 Web site using Java (JSP)
How do I create a pure UTF-8 web site? Specifically is there a way to
change the standard servlet class to use UTF-8 as the default char
encoding instead of ISO 8859-1?
I have looked at the source code for Jakarta Tomcat 3.2.2 and noticed
the following statement in constants.java:
public static final String DEFAULT_CHAR_ENCODING = "8859-1";
The various classes such as HttpServletRequest and HttpServletResponse
use this constant when creating the default readers and writers and as a
consequence, the web site ends up being Latin-1.
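To illustrate what a Latin-1 default writer does to non-Latin-1 text, here is a minimal sketch using only the standard library. The class and method names (WriterEncodingDemo, encode) are hypothetical and are not taken from Tomcat; the sketch only assumes that the container's writer behaves like an OutputStreamWriter constructed with the given charset.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of any servlet container.
class WriterEncodingDemo {

    // Push text through a Writer bound to the given charset and
    // return the bytes that would go out on the wire.
    static byte[] encode(String text, Charset charset) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            Writer writer = new OutputStreamWriter(out, charset);
            writer.write(text);
            writer.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        String euro = "\u20AC"; // EURO SIGN, not representable in ISO-8859-1
        byte[] latin1 = encode(euro, StandardCharsets.ISO_8859_1);
        byte[] utf8 = encode(euro, StandardCharsets.UTF_8);
        System.out.println("ISO-8859-1 byte count: " + latin1.length);
        System.out.println("UTF-8 byte count: " + utf8.length);
    }
}
```

With a Latin-1 writer, any character outside ISO-8859-1 is replaced by a substitution byte, so the original text cannot be recovered by the browser, whereas the UTF-8 writer emits a multi-byte sequence that preserves it.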
I have experimented with adding the line:
request.setContentType("text/html; charset=UTF-8");
This does correctly change the encoding of the response to UTF-8, and
subsequent output gets sent to the browser in UTF-8. However, the
request object incorrectly interprets incoming request data because it
is decoding %XX octets as Latin-1 instead of UTF-8.
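I know there is special code that I can write, such as:
String param = request.getParameter("parameter1");
// recover the raw bytes that were wrongly decoded as Latin-1
byte[] rawVal = param.getBytes("ISO-8859-1");
// create new string again, this time decoding the bytes as UTF-8
param = new String(rawVal, "UTF-8");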
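The usual form of this after-the-fact repair relies on ISO-8859-1 mapping every byte to exactly one character, so the original bytes can be recovered and decoded again as UTF-8. The following is a self-contained sketch under that assumption; the class and method names (ParameterRepair, misdecodeAsLatin1, reinterpretAsUtf8) are hypothetical, and the first method only simulates what a container with a Latin-1 default is assumed to do with UTF-8 input.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper; simulates and then undoes a Latin-1 misdecoding.
class ParameterRepair {

    // Simulate a container decoding UTF-8 bytes as ISO-8859-1:
    // each byte becomes one char in the range U+0000..U+00FF.
    static String misdecodeAsLatin1(String original) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Recover the raw bytes from the garbled string and decode
    // them with the charset the client actually used.
    static String reinterpretAsUtf8(String misdecoded) {
        byte[] raw = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u65E5\u672C\u8A9E"; // three CJK characters
        String garbled = misdecodeAsLatin1(original);
        String repaired = reinterpretAsUtf8(garbled);
        System.out.println("garbled length: " + garbled.length());
        System.out.println("round trip ok: " + original.equals(repaired));
    }
}
```

The round trip only works when the wrong decoding was ISO-8859-1 (or another charset that preserves all 256 byte values); it cannot repair text that has already passed through a lossy conversion.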
However I would prefer not to have to write special code to re-interpret
data after the fact. Also there are other standard classes which also
seem to assume ISO-8859-1 as the default character set (such as
URLDecoder and URLEncoder). Since internal data will always be Unicode,
I would prefer to set the default encoding to UTF-8 and be able to write
standard Java code.
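For URLEncoder and URLDecoder specifically, later JDK releases (1.4 onward) provide two-argument overloads that take the charset name explicitly, so the default is not involved. The sketch below shows them; the wrapper class and its method names (UrlCodecDemo, encodeUtf8, decode) are hypothetical, and it assumes a JDK where those overloads are available.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical wrapper around the charset-aware URL codec overloads.
class UrlCodecDemo {

    // Percent-encode a string using UTF-8 for non-ASCII characters.
    static String encodeUtf8(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be supported by every JVM.
            throw new IllegalStateException(e);
        }
    }

    // Decode %XX octets using the named charset.
    static String decode(String s, String charsetName) {
        try {
            return URLDecoder.decode(s, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String encoded = encodeUtf8("caf\u00E9");
        System.out.println(encoded);
        // Same octets, two interpretations:
        System.out.println(decode(encoded, "UTF-8"));
        System.out.println(decode(encoded, "ISO-8859-1"));
    }
}
```

Decoding the same %XX octets with "ISO-8859-1" instead of "UTF-8" yields two Latin-1 characters in place of the single accented letter, which is the same misinterpretation described above for request parameters.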
Is there an easy way to override the default encoding at a low level so
that all the classes that use the default encoding will just work?
Thanks,
Paul Deuter
Plumtree Software
paul.deuter@plumtree.com