Re: UTF8 vs AL32UTF8

From: Roozbeh Pournader (roozbeh@sharif.edu)
Date: Fri Jun 08 2001 - 17:28:27 EDT


On Fri, 8 Jun 2001, Carl W. Brown wrote:

> Looking at your documentation you call UTF-8s UTF8 and standard UTF-8
> AL32UTF8. To me this is very misleading.

To me as well. I just sent a note to my colleagues in the next room (who
use Oracle for web applications) that they should be aware that Oracle's
UTF8 is not real UTF-8, and that they should use AL32UTF8 instead, when
applicable.

There are many people who don't know about the internals. They want their
application to work with UTF-8, they will find the Oracle datatype (or
whatever it is called) named UTF8, and they will stick with it. Few of
them have read the Unicode Standard (or RFC 2279, or ISO 10646) or
participate here, so few know the difference between four-byte UTF-8 and
six-byte UTF-8. When they find the problem after all the headache, they
will blame UTF-8 (for being an ambiguous encoding), I guess, which is
something no one here wants.
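
For anyone unsure what the four-byte versus six-byte difference looks like
on the wire, here is a minimal Python sketch. It assumes that "six-byte
UTF-8" means the UTF-8s form discussed in this thread, where a
supplementary character is first split into a UTF-16 surrogate pair and
each surrogate is then written as a three-byte sequence. The function name
encode_utf8s is hypothetical, and nothing here is taken from Oracle's
implementation.

```python
def encode_utf8s(text: str) -> bytes:
    """Encode text the UTF-8s way: supplementary characters become
    two 3-byte sequences (one per UTF-16 surrogate) instead of one
    4-byte sequence. Illustrative helper, not a library API."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp >= 0x10000:
            # Split the code point into a UTF-16 surrogate pair.
            cp -= 0x10000
            surrogates = (0xD800 | (cp >> 10), 0xDC00 | (cp & 0x3FF))
            for unit in surrogates:
                # Write each surrogate as if it were a BMP character.
                out += bytes((
                    0xE0 | (unit >> 12),
                    0x80 | ((unit >> 6) & 0x3F),
                    0x80 | (unit & 0x3F),
                ))
        else:
            # BMP characters are identical in both forms.
            out += ch.encode("utf-8")
    return bytes(out)


ch = "\U00010400"                 # a character outside the BMP
standard = ch.encode("utf-8")     # standard UTF-8: 4 bytes
six_byte = encode_utf8s(ch)       # UTF-8s style: 6 bytes

print(standard.hex(" "))          # f0 90 90 80
print(six_byte.hex(" "))          # ed a0 81 ed b0 80
```

The two byte strings differ only for characters outside the BMP, which is
why the problem tends to surface late: data that stays within the BMP
looks the same in both forms.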

I would like to ask Oracle to fix the datatype names, or at least add very
explicit comments to the definitions (everywhere UTF8 is mentioned) saying
that its UTF8 is not real UTF-8, and stating what to use instead.

--roozbeh


