Re: Unicode in web pages

From: Stephen Toner (toners5@hotmail.com)
Date: Mon Sep 04 2000 - 08:28:25 EDT


The character is posted in a form, and the receiving page opens a connection
to a SQL Server 7.0 database using the Weblogic JDBC:ODBC driver, which
supports Unicode. The Java string is then passed to the database.

I have now found that the symbols in the database were indeed the UTF-8
versions of the characters, e.g. ç=Ã§. This was for some European characters
only.
However, many characters in languages such as Japanese (and the Euro symbol)
reach the database not in their correct form but with question marks in
them. I don't know where the problem is occurring. How does the character
get converted into these UTF-8 sequences, and could there be a problem with
this? Possibly it doesn't recognise the character that it should be
converting (just a mad stab in the dark).
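
A minimal sketch of the two symptoms described above, written in Java since
the pages are JSP. It assumes (this is not confirmed anywhere in the thread)
that some step decodes the submitted UTF-8 bytes as ISO-8859-1, and that a
later step encodes text into a single-byte character set. The class and
method names are hypothetical, and StandardCharsets needs Java 7 or later;
on older JVMs getBytes("UTF-8") does the same job.

```java
import java.nio.charset.StandardCharsets;

final class Utf8Mojibake {
    // Decode the UTF-8 bytes of `text` as if they were ISO-8859-1.
    // Each byte of a multi-byte sequence becomes its own character.
    static String misreadAsLatin1(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Encode `text` into ISO-8859-1 and decode it again.
    // Characters that ISO-8859-1 cannot represent are replaced with '?'.
    static String throughLatin1(String text) {
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(latin1, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // c-cedilla (U+00E7) comes back as two characters, U+00C3 U+00A7.
        System.out.println(misreadAsLatin1("\u00E7").equals("\u00C3\u00A7")); // true
        // The Euro sign and a kanji both collapse to '?'.
        System.out.println(throughLatin1("\u20AC\u65E5")); // ??
    }
}
```

If this guess is right, the accented characters are being misread on the way
in, while the question marks come from a separate conversion into a character
set that has no slot for the character at all.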

Because UTF-8 is a sequence of bytes, does that mean that it could be
treated and stored as ASCII, and that the sequence would be recombined to
Unicode on output if the encoding was set to UTF-8?
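
As a rough illustration of the question above: the bytes of a multi-byte
UTF-8 sequence all have the high bit set, so they are not 7-bit ASCII, but
they do survive any path that keeps all 8 bits of every byte. The sketch
below uses ISO-8859-1 as a stand-in for such a byte-preserving store; whether
the actual driver and column preserve bytes this way is an assumption, and
the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

final class Utf8RoundTrip {
    // "Store": map each UTF-8 byte to one character in the range 0-255.
    static String toByteChars(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // "Retrieve": map the characters back to bytes and decode them as UTF-8.
    static String fromByteChars(String stored) {
        byte[] bytes = stored.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // True only if every UTF-8 byte of the text fits in 7-bit ASCII.
    static boolean isPureAscii(String text) {
        for (byte b : text.getBytes(StandardCharsets.UTF_8)) {
            if (b < 0) {
                return false; // high bit set, so not an ASCII byte
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String original = "\u00E7\u20AC\u65E5"; // c-cedilla, Euro sign, kanji
        System.out.println(isPureAscii(original));                                // false
        System.out.println(fromByteChars(toByteChars(original)).equals(original)); // true
    }
}
```

The round trip only works if nothing in between strips the high bit or
converts the stored characters to another character set.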

>From: "Michael \(michka\) Kaplan" <michka@trigeminal.com>
>To: "Unicode List" <unicode@unicode.org>
>Subject: Re: Unicode in web pages
>Date: Mon, 4 Sep 2000 05:04:08 -0800 (GMT-0800)
>
>Well, the client side is right if you are using UTF-8 and the browser does
>indeed show UTF-8 as the encoding being used (how to check this depends on
>your browser -- View|Encoding or Edit|Preferences), so there must be some
>issue on the server side.
>
>You may need to post more detail on the database, how you are getting to it,
>etc. so someone who knows more about the server config can comment.
>
>michka
>
>
>----- Original Message -----
>From: "Stephen Toner" <toners5@hotmail.com>
>To: <michka@trigeminal.com>; <unicode@unicode.org>
>Sent: Monday, September 04, 2000 7:12 AM
>Subject: Re: Unicode in web pages
>
>
> > I am using JSP on the server side, and am using the Tomcat server.
> >
> >
> > >From: "Michael \(michka\) Kaplan" <michka@trigeminal.com>
> > >Reply-To: "Michael \(michka\) Kaplan" <michka@trigeminal.com>
> > >To: "Stephen Toner" <toners5@hotmail.com>, "Unicode List"
> > ><unicode@unicode.org>
> > >Subject: Re: Unicode in web pages
> > >Date: Mon, 4 Sep 2000 04:57:18 -0700
> > >
> > >UTF-8 is indeed the character set you want to use for the page encoding;
> > >although some browsers will support UTF-16, etc., not all will.
> > >
> > >But the real issue has to do with what technology you are using to connect
> > >to the db. Is it ASP on the server side? Or something else? And what is the
> > >server?
> > >
> > >michka
> > >
> > >
> > >----- Original Message -----
> > >From: "Stephen Toner" <toners5@hotmail.com>
> > >To: "Unicode List" <unicode@unicode.org>
> > >Sent: Monday, September 04, 2000 4:21 AM
> > >Subject: Unicode in web pages
> > >
> > >
> > > > Hi,
> > > > I'm fairly new to Unicode and have a few problems trying to input it from a
> > > > browser.
> > > > I need to take input from a web-page, and store it in a database. Web pages
> > > > are then driven from this database. We want to use Unicode to allow
> > > > multi-lingual support. I was wondering if anyone could tell me of any
> > > > issues likely to be faced in this process.
> > > > Our database is capable of storing Unicode, but I'm not sure if what is
> > > > reaching the database is actually Unicode. Using IE 5.5, a textarea in a
> > > > form is submitted containing any entered text. I have tried specifying the
> > > > page's character set as UTF-8. What then reaches the database is a series
> > > > of ASCII values with foreign characters such as Japanese, or accented
> > > > characters, converted to a few symbols. I don't know if this is Unicode,
> > > > where when I look at it in the database the multi-byte characters can be
> > > > seen as a combination of single-byte (gibberish) characters.
> > > > If this isn't Unicode, do I need to put in some sort of converter to change
> > > > to &#xxxx; format? Some web sites seem to say that for HTML, Unicode must
> > > > be changed to this numeric character reference format.
> > > > I would appreciate any advice.
> > > > Thanks in advance,
> > > > Stephen
> > > >
> >
> >_________________________________________________________________________
> > > > Get Your Private, Free E-mail from MSN Hotmail at
> > >http://www.hotmail.com.
> > > >
> > > > Share information about yourself, create your own public profile at
> > > > http://profiles.msn.com.
> > > >
> > > >
> > >
> >
> >
> >
> >
>



This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:21:13 EDT