Re: UTF-8N?

From: John Cowan (jcowan@reutershealth.com)
Date: Tue Jun 20 2000 - 13:44:16 EDT


Juliusz Chroboczek wrote:

> Later on, you merge the two files, and compute the checksum of the
> concatenated file. If the program used for splitting inserted a BOM,
> but the program used for merging didn't remove it, the checksum
> comparison is going to fail.

Even worse:

If the split point happened to fall just before or after a SPACE U+0020
that some program rendered as a line-break, the line-break will move
somewhere else, because U+FEFF next to U+0020 is not a line-break
opportunity.
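
To make both failure modes concrete, here is a rough Python sketch. The
helper names (split_with_bom, naive_merge, bom_aware_merge) are made up
for illustration and do not correspond to any real tool; the sample text
is ASCII, so the byte offset used for the split is also a character
boundary. It assumes a splitter that prepends a UTF-8 BOM to every piece.

```python
import hashlib

BOM = b"\xef\xbb\xbf"  # UTF-8 encoding of U+FEFF


def split_with_bom(data, at):
    """Hypothetical splitter: cut at byte offset `at`, prepend a BOM to each piece."""
    return BOM + data[:at], BOM + data[at:]


def naive_merge(parts):
    """Concatenate the pieces without looking for BOMs."""
    return b"".join(parts)


def bom_aware_merge(parts):
    """Drop one leading BOM from each piece before concatenating."""
    return b"".join(p[len(BOM):] if p.startswith(BOM) else p for p in parts)


original = "hello world".encode("utf-8")
parts = split_with_bom(original, 5)  # split point is just before the SPACE

naive = naive_merge(parts)
clean = bom_aware_merge(parts)

# The checksum of the naive merge no longer matches the original.
print(hashlib.sha256(naive).digest() == hashlib.sha256(original).digest())
# The BOM-aware merge restores the original bytes for this input.
print(clean == original)
# In the naive merge, U+FEFF now sits directly before the U+0020.
print(ascii(naive.decode("utf-8")))
```

Note that bom_aware_merge is itself only safe under the stated
assumption: if the original file began with a BOM, or a piece genuinely
started with U+FEFF, stripping would remove a character that belonged to
the data.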

-- 

Schlingt dreifach einen Kreis um dies! || John Cowan <jcowan@reutershealth.com>
Schliesst euer Aug vor heiliger Schau, || http://www.reutershealth.com
Denn er genoss vom Honig-Tau,          || http://www.ccil.org/~cowan
Und trank die Milch vom Paradies.      -- Coleridge (tr. Politzer)


