Re: Filtering and displaying untrusted UTF-8

From: verdy_p (verdy_p@wanadoo.fr)
Date: Sun Dec 27 2009 - 20:48:31 CST

Next message: verdy_p: "Re: HTML5 encodings"

Previous message: - -: "Filtering and displaying untrusted UTF-8"
In reply to: - -: "Filtering and displaying untrusted UTF-8"
Next in thread: Asmus Freytag: "Re: Filtering and displaying untrusted UTF-8"

crossroads0000@googlemail.com wrote:
> My final question is this: which of the (in the previous steps)
> allowed code points ***higher than*** 127 do I have to "HTML encode"
> if I display them in an HTML page? None? Or is it possible that
> characters with code points outside the US-ASCII range may be
> interpreted by the browser in a similar way to < & and > in the
> US-ASCII range, thereby allowing for an XSS attack?

Maybe the NEXT LINE character (NEL, U+0085)? It is one of the C1 controls, part of all ISO 8859 charsets (for MIME) at
position 0x85. Could it be valid as a line separator or as a blank in HTML?
You may want to replace it with a CR+LF sequence, or you may want to normalize the various encodings of newlines (CR
not followed by LF, CR+LF, LF not preceded by CR, and NEL) into a single one on input (such as LF, for compatibility
with C language standard I/O), and generate CR+LF on output.
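
The normalization described above could be sketched in Python roughly as follows. This is only an illustration, not
a vetted filter: the function names normalize_newlines and to_crlf are made up for this example, and it assumes the
input has already been decoded from UTF-8 into a string, so that NEL appears as the single code point U+0085 rather
than as the bytes 0xC2 0x85.

```python
import re

# Hypothetical helper names, chosen for this sketch only.
# CR+LF is listed first so that it is consumed as one newline
# instead of being split into a lone CR followed by a lone LF.
_NEWLINE = re.compile("\r\n|\r|\n|\u0085")


def normalize_newlines(text: str) -> str:
    """On input: map CR+LF, lone CR, lone LF and NEL (U+0085) to LF."""
    return _NEWLINE.sub("\n", text)


def to_crlf(text: str) -> str:
    """On output: normalize first, then emit every newline as CR+LF."""
    return normalize_newlines(text).replace("\n", "\r\n")
```

Normalizing before any further filtering means the later steps only have to deal with a single newline convention.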

This archive was generated by hypermail 2.1.5 : Sun Dec 27 2009 - 20:51:15 CST