Re: Is there Unicode mail out there?

From: Martin Duerst (duerst@w3.org)
Date: Tue Jul 17 2001 - 21:36:03 EDT


At 14:30 01/07/17 -0700, Mark Davis wrote:
> > In that case the content of the field is not text but an octet string,
> > and you need to do something different, like base64-ing it.
>
>The content in the database is not an octet string: it is a text field that
>happens to have a control code -- a legitimate character code -- in it.
>Practically every database allows control codes in text fields. (And why are
>C1 controls allowed? After all, they are even less frequent than C0
>controls.)

Mark - I understand your dissatisfaction. But the C1 controls are not
allowed in HTML4, and according to James Clark, the fact that they are
allowed in XML was an oversight.

Databases can (and should) take care of their data. There are very
few cases where having control characters in there makes sense.
In most cases, however, they are errors, and if XML gives an
incentive to fix them, all the better.
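
A minimal sketch of such a check on a text field, in Python. It assumes
the XML 1.0 rule that the only C0 controls allowed are tab, LF and CR,
and it reports C1 controls (U+0080 to U+009F) separately, since XML 1.0
accepts them and HTML4 does not. The name find_controls is hypothetical
and is not taken from any specification or library.

```python
import re

# C0 controls that XML 1.0 forbids: everything below U+0020
# except tab (U+0009), LF (U+000A) and CR (U+000D).
C0_FORBIDDEN = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")

# C1 controls (U+0080 to U+009F): accepted by XML 1.0, not by HTML4.
C1_CONTROLS = re.compile(r"[\x80-\x9f]")


def find_controls(text):
    """Return a sorted list of (offset, code point, class) tuples,
    one for each control character found in text."""
    hits = []
    for match in C0_FORBIDDEN.finditer(text):
        hits.append((match.start(), ord(match.group()), "C0"))
    for match in C1_CONTROLS.finditer(text):
        hits.append((match.start(), ord(match.group()), "C1"))
    return sorted(hits)


if __name__ == "__main__":
    # A field with a control-G (BEL, U+0007) hidden in it.
    print(find_controls("ring\x07 the bell"))
```

A field that yields an empty list can go into XML 1.0 or HTML4 as is
(as far as control characters are concerned); any hit is a candidate
for cleanup in the database rather than for escaping on output.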

I wouldn't want any control codes in a database. Having a control-G
may be funny (the joke as I know it goes back to Don Knuth), but
something like a control-S is too much of a risk.

Regards, Martin.


