Re: UTF-8 Corrigendum, new Glossary

From: Kevin Bracey (kevin.bracey@pace.co.uk)
Date: Thu Nov 30 2000 - 04:31:17 EST


In message <200011300644.WAA03560@unicode.org>
          "G. Adam Stanislav" <adam@whizkidtech.net> wrote:

> At 21:08 29-11-2000 -0800, Mark Davis wrote:
> >1. The Unicode Technical Committee has modified the definition of UTF-8 to
> >forbid conformant implementations from interpreting non-shortest forms for
> >BMP characters,
>
> I find this silly. That creation of such forms would be forbidden I can see
> and agree to. But interpretation? I understand the reasoning when security
> is an issue. But why make it flat illegal? There are many applications
> where such a sequence poses no security danger.
>

Consistency. If some implementations won't read the non-shortest forms and
some will, you end up in the mess that HTML has fallen into due to lack of
rigorous parsing. "This file is illegal." "But it works on my system!"
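
To make the divergence concrete, here is a minimal sketch in Python. It is an
illustration only, not part of any implementation discussed in this thread.
The function names decode_lenient_2byte and decode_strict_2byte are
hypothetical, and both handle just a single two-byte sequence. The bytes
C0 AF are a non-shortest form of "/" (U+002F), whose shortest form is the
single byte 2F.

```python
def decode_lenient_2byte(data: bytes) -> str:
    """Hypothetical lenient decoder: accepts any well-shaped 2-byte sequence,
    without checking that it is the shortest form."""
    lead, trail = data
    if lead & 0xE0 != 0xC0 or trail & 0xC0 != 0x80:
        raise ValueError("not a 2-byte UTF-8 sequence")
    return chr(((lead & 0x1F) << 6) | (trail & 0x3F))


def decode_strict_2byte(data: bytes) -> str:
    """Hypothetical strict decoder: same as above, but rejects non-shortest forms."""
    ch = decode_lenient_2byte(data)
    if ord(ch) < 0x80:
        # Anything below U+0080 fits in one byte, so two bytes is non-shortest.
        raise ValueError("non-shortest form")
    return ch


overlong_slash = b"\xc0\xaf"

# The lenient reader happily interprets the sequence as "/".
print(repr(decode_lenient_2byte(overlong_slash)))

# The strict reader refuses the very same file.
try:
    decode_strict_2byte(overlong_slash)
except ValueError as err:
    print("strict:", err)

# Python's built-in UTF-8 codec is also strict about this sequence.
try:
    overlong_slash.decode("utf-8")
except UnicodeDecodeError as err:
    print("built-in:", err.reason)
```

The same two bytes are valid input to one reader and an error to the other,
which is exactly the "But it works on my system!" situation.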

-- 
Kevin Bracey, Principal Software Engineer
Pace Micro Technology plc                     Tel: +44 (0) 1223 518566
645 Newmarket Road                            Fax: +44 (0) 1223 518526
Cambridge, CB5 8PB, United Kingdom            WWW: http://www.pace.co.uk/


