Re: UTF-8 Corrigendum, new Glossary

From: G. Adam Stanislav (adam@whizkidtech.net)
Date: Thu Nov 30 2000 - 20:05:30 EST


On Thu, Nov 30, 2000 at 10:18:07AM -0800, Markus Scherer wrote:
>you are free to write and use a non-conformant implementation. just be aware of what that means... :-)
>markus

I guess it means I'm a non-conformist. :)

I am currently working on software that translates mark-up written in
one mark-up language (Ister) into another (HTML). It
uses UTF-8, and works as CGI, i.e., generates HTML dynamically on a web
server (see http://www.whizkidtech.net/ister/ for unfinished docs).

If the source (in Ister) uses illegal but decipherable UTF-8, my
software accepts it. Naturally, before sending it out, it transforms
it to perfectly legal UTF-8. The idea that I should reject it is silly
(and, no, the "internal data" clause does not apply here: my software
accepts data from an external source). Rejecting it would mean
that if the web page designer used some design software that messed
up the UTF-8 encoding, the web page would suddenly miss a letter here,
a letter there. Not rejecting it poses no security risk, so for this
specific application it is better to accept it (and correct it) than
to reject it.
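
To illustrate the accept-and-correct approach, here is a minimal
sketch in Python. It is not the actual Ister code. It assumes that
"illegal but decipherable" means non-shortest (overlong) forms, and
the names decode_lenient and normalize_utf8 are hypothetical:

```python
def decode_lenient(data: bytes) -> str:
    """Decode UTF-8, tolerating non-shortest (overlong) forms.

    Assumption: only 1- to 4-byte sequences count as decipherable.
    Anything else (stray trail bytes, truncated sequences, surrogates,
    values above U+10FFFF) still raises ValueError.
    """
    out = []
    i = 0
    n = len(data)
    while i < n:
        lead = data[i]
        if lead < 0x80:
            cp, extra = lead, 0
        elif 0xC0 <= lead <= 0xDF:      # includes overlong leads C0, C1
            cp, extra = lead & 0x1F, 1
        elif 0xE0 <= lead <= 0xEF:
            cp, extra = lead & 0x0F, 2
        elif 0xF0 <= lead <= 0xF7:
            cp, extra = lead & 0x07, 3
        else:
            raise ValueError(f"undecipherable lead byte at offset {i}")
        if i + extra >= n:
            raise ValueError(f"truncated sequence at offset {i}")
        for k in range(1, extra + 1):
            trail = data[i + k]
            if trail & 0xC0 != 0x80:
                raise ValueError(f"bad trail byte at offset {i + k}")
            cp = (cp << 6) | (trail & 0x3F)
        if cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
            raise ValueError(f"not a valid scalar value at offset {i}")
        # No shortest-form check here: that is the deliberate leniency.
        out.append(chr(cp))
        i += extra + 1
    return "".join(out)


def normalize_utf8(data: bytes) -> bytes:
    """Accept lenient input, emit strictly legal (shortest-form) UTF-8."""
    return decode_lenient(data).encode("utf-8")
```

For example, normalize_utf8(b"\xC1\x81") returns b"A": the two-byte
overlong form of "A" goes in, and the legal one-byte form comes out.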

Cheers,
Adam

-- 
Don't send me spam, I'm a vegetarian


