RE: Fun with proof by analogy, was Re: Mojibake on my Web pages

From: Jill Ramonsky (Jill.Ramonsky@Aculab.com)
Date: Tue Sep 30 2003 - 06:44:50 EDT

Next message: Jill Ramonsky: "RE: Internal Representation of Unicode"

Previous message: Michael Everson: "Re: Chinese "departing" tone marks"
Maybe in reply to: jon@spin.ie: "Fun with proof by analogy, was Re: Mojibake on my Web pages"
Next in thread: Rick McGowan: "Re: RE: Fun with proof by analogy, was Re: Mojibake on my Web pages"
Maybe reply: Rick McGowan: "Re: RE: Fun with proof by analogy, was Re: Mojibake on my Web pages"
Reply: Peter Kirk: "Re: Fun with proof by analogy, was Re: Mojibake on my Web pages"

Good point. But there has to be an actual attacker here, as in, a hacker
engaged in a purposefully malevolent attempt to (say) run arbitrary code
on a victim's machine (the victim being an end-user, a web-page
viewer). To achieve this, the attacker must exploit "features" of the
victim's browser. Yes, I was assuming that the attacker was a document
author -- but if the attacker was a server (or at least, a server
administrator), then it's difficult to see what a document author can do
to guard against this. If the server is an attacker, they could of
course modify all documents served anyway, in any manner they chose. In
such a circumstance, document authors would be well advised to move
their documents to another server ... assuming they ever found out.

The attack is only theoretical, so far as I know, but basically it works
like this: the attacker places a link to (say)
"C:\WINNT\SYSTEM32\CMD.EXE (plus some nasty parameters)" in a hyperlink
and encourages you to click on it. If all is well, the browser should
forbid this. But if the string is written in encoding A, and the
browser parses it assuming it to be encoding B, it is possible that the
browser may not recognise the path as being absolute, and so may allow
it. Of course, you'd have to try /really hard/ to find encodings A and
B such that this becomes feasible, but you never know, it might be
doable. Plus, you'd have to find a user dumb enough to be running a
sufficiently old browser that it was still prone to this exploit. (I'm
pretty sure modern browsers will have closed that hole by now, but
again, you never know). But even a buggy and stupid browser will never
fall victim to this exploit if the browser is able to infer the correct
encoding for the document.
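
To make the "encoding A versus encoding B" idea concrete, here is a
minimal sketch in Python. It is an illustration under stated assumptions,
not a description of any real browser: the function looks_absolute and
the drive-letter pattern are hypothetical, and UTF-7 and Latin-1 are
simply one convenient pair of encodings in which the backslash is spelled
differently. The check is made under one encoding, and the string is
later used under another.

```python
import re

# Hypothetical safety check: does the text start like "C:\" ?
ABSOLUTE_PATH = re.compile(r"^[A-Za-z]:\\")

def looks_absolute(raw: bytes, assumed_encoding: str) -> bool:
    """Decode the raw bytes under the assumed encoding, then test the result."""
    text = raw.decode(assumed_encoding)
    return ABSOLUTE_PATH.match(text) is not None

# In UTF-7 the sequence "+AFw-" stands for a backslash (U+005C).
link = b"C:+AFw-WINNT+AFw-SYSTEM32+AFw-CMD.EXE"

# A check that assumes Latin-1 sees no backslash at all ...
checked_as_latin1 = looks_absolute(link, "latin-1")

# ... while a check that assumes UTF-7 sees the real path.
checked_as_utf7 = looks_absolute(link, "utf-7")

# What a later stage would act on if it decoded the same bytes as UTF-7.
used_as = link.decode("utf-7")
```

If the check and the later use agree on the encoding, the mismatch
disappears, which is the point made above about inferring the correct
encoding.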

But look at it like this. Suppose an HTML document had a meta tag which
claimed: <META HTTP-EQUIV="Content-length" CONTENT=1>. In this
circumstance, which would you prefer to believe: The HTTP Content-length
header? Or the meta tag? (One can certainly imagine buffer-overrun
exploits if browsers were to make the wrong choice).

Of course, having said that, document authors /can/ affect HTTP headers
directly anyway. If the document were to be written in PHP instead of
HTML then a document author could generate any HTTP headers they wanted!
(I've actually done this to deliver documents in UTF-8 against the
server's default). All I can assume is maybe there's some sort of threat
model in place which assumes that anyone who can code in PHP can't
possibly be an attacker! If so, it's clearly nonsense.
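
As a rough illustration of a document generating its own headers, here
is a sketch using Python's WSGI calling convention rather than PHP (the
PHP original is not shown in this message, so this is an analogue, not a
reconstruction). The function name app and the body text are made up for
the example; the only point is that the code producing the document also
chooses the Content-Type charset.

```python
def app(environ, start_response):
    """A tiny WSGI application that declares its own charset."""
    # The body is encoded as UTF-8, whatever the server's default may be.
    body = "Mojibake no more: \u00e9\u00e8\u00ea\n".encode("utf-8")
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        # Content-Length counts bytes, not characters.
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

Any WSGI server (for instance wsgiref.simple_server in the standard
library) can serve this; whether a given hosting setup lets authors run
such code is a separate question.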

I still maintain, though (in agreement with Jon) that a server should
obey the document author by taking notice of meta tags and transforming
them into HTTP headers. (At the very /least/, it should take the meta tag
as a hint, and use it as an HTTP header if the hint turns out to be true).
To ignore them altogether is just dumb.
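
The "take it as a hint" idea could be sketched like this, again in
Python. Everything here is an assumption made for illustration: the name
charset_header, the 1024-byte search window, the regular expression, and
the iso-8859-1 default are not taken from any real server. The test of
whether the hint "turns out to be true" is only that the bytes decode
cleanly under the hinted charset, which is a weak test (Latin-1, for
example, accepts any byte sequence).

```python
import re

# Hypothetical pattern for a charset declared inside a <meta> tag.
META_CHARSET = re.compile(
    rb"""<meta[^>]+charset\s*=\s*["']?([A-Za-z0-9._:-]+)""", re.IGNORECASE
)

def charset_header(document: bytes, server_default: str = "iso-8859-1") -> str:
    """Return a Content-Type value, honouring the meta hint when it checks out."""
    charset = server_default
    match = META_CHARSET.search(document[:1024])  # only look near the top
    if match:
        hinted = match.group(1).decode("ascii")
        try:
            document.decode(hinted)  # does the hint fit the actual bytes?
            charset = hinted
        except (LookupError, UnicodeDecodeError):
            pass  # unknown or wrong hint: fall back to the server default
    return f"text/html; charset={charset}"
```

A document whose bytes contradict its own meta tag falls back to the
server default, so a false hint is never promoted to an HTTP header.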

Jill

PS. I haven't mentioned Unicode domain names. That's a different kettle
of fish altogether. Maybe we could have another thread for that.

> -----Original Message-----
> From: Peter Kirk [mailto:peterkirk@qaya.org]
> Sent: Monday, September 29, 2003 5:33 PM
> To: Jill Ramonsky
> Cc: unicode@unicode.org
> Subject: Re: Fun with proof by analogy, was Re: Mojibake on my Web pages
>
>
> I know I don't understand all the issues here, but I think I spot one
> flaw in the argument. This seems to imply that all security holes are
> the work of the content providers and none related to the servers. In
> other words, that all servers and their administrators are entirely
> trustworthy. This is certainly not necessarily true. And if a content
> provider can compromise security by confusing encodings, so can a
> server.
>
> This could become a significant security hole when we get Unicode
> domain names. A malicious server administrator could register the
> mojibake equivalent of a legitimate security sensitive domain name and
> then deliberately serve the mojibake version to users, etc etc.
>


This archive was generated by hypermail 2.1.5 : Tue Sep 30 2003 - 07:43:38 EDT