Bypassing Cross-Site Scripting Using A Proxy

by Jon Davis 13. December 2007 09:46

When I implemented Sprinkle, a client-side includes (CSI) system I came up with that doesn't use IFRAMEs, I kept running into the scenario where you want to fetch HTML from an external web site rather than your own. The browser's same-origin policy blocks that kind of cross-site request from script. This is sort of what Web 2.0 is all about: being able to mash up the world with not just your crap but everyone else's crap as well.

I threw together a trivial solution. It is ASP.NET-only for now, though I might come up with a PHP-based equivalent. The idea is to implement a really trivial proxy server and cache the fetched data for a period of time. In this particular implementation, I cache it directly in the web Application's in-memory collection.
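To make the caching idea concrete, here is a minimal sketch in JavaScript rather than ASP.NET. It is not the proxy's actual code: `createProxyCache`, `proxyGet`, and the `fetchRemote` callback are hypothetical names, and a plain `Map` stands in for the Application collection.

```javascript
// Hypothetical in-memory cache with a time-to-live, standing in for the
// ASP.NET Application collection (an assumption, not the original code).
function createProxyCache(ttlMs, now = Date.now) {
  const entries = new Map(); // url -> { body, expiresAt }

  return {
    // Return the cached body, or undefined if it is missing or expired.
    get(url) {
      const entry = entries.get(url);
      if (!entry) return undefined;
      if (now() >= entry.expiresAt) {
        entries.delete(url); // drop stale data so it gets fetched again
        return undefined;
      }
      return entry.body;
    },
    // Store a freshly fetched body for ttlMs milliseconds.
    set(url, body) {
      entries.set(url, { body, expiresAt: now() + ttlMs });
    },
  };
}

// Serve from the cache when possible; otherwise call fetchRemote (whatever
// actually downloads the URL) and remember the result.
function proxyGet(cache, url, fetchRemote) {
  const cached = cache.get(url);
  if (cached !== undefined) return cached;
  const body = fetchRemote(url);
  cache.set(url, body);
  return body;
}
```

The `now` parameter is only there so the expiry logic can be exercised with a fake clock.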

Here's what using it might look like:

        <%-- Client-side includes with server-side cross-site proxying --%>
        <script type="text/javascript" src=""></script>
        <div src="proxy.aspx?url=" />
        <%-- Server-side includes with cross-site proxying--%>
        <ssi:ProxyControl runat="server" ID="GoogleInsertion"
            BaseUrl="proxy.aspx?url=" />

In the server-side include implementation, the DetectImposeBase and BaseUrl properties are really just hacks where I force-inject the proxy URL to any src and href element attributes.
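The force-injection hack can be illustrated with a rough JavaScript sketch. This is not the ProxyControl code: `imposeProxyBase` and its `pageUrl` parameter are hypothetical, and the regex only handles quoted attribute values, so treat it as an approximation of the idea.

```javascript
// Hypothetical helper: route every src/href in a fetched page through the
// proxy. pageUrl is the address the HTML came from, used to resolve
// relative links; proxyBase is something like "proxy.aspx?url=".
function imposeProxyBase(html, pageUrl, proxyBase) {
  // Group 1 = attribute name, group 2 = quote character, group 3 = value.
  return html.replace(/\b(src|href)\s*=\s*(["'])(.*?)\2/gi, (match, attr, quote, value) => {
    // Leave empty values and in-page anchors alone.
    if (value === "" || value.startsWith("#")) return match;
    let absolute;
    try {
      absolute = new URL(value, pageUrl);
    } catch (e) {
      return match; // not something we can resolve
    }
    // Only proxy web URLs; skip javascript:, mailto:, and so on.
    if (absolute.protocol !== "http:" && absolute.protocol !== "https:") return match;
    return attr + "=" + quote + proxyBase + encodeURIComponent(absolute.href) + quote;
  });
}
```

A real implementation would want an HTML parser instead of a regex, since this version will also touch look-alike attributes such as data-src.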

If you try to use the above-referenced proxy.aspx file from an external web site, it should fail. The proxy only serves requests whose Referer header is on the same host.
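A same-host Referer check along these lines might look like the following sketch. The function name `isSameHostReferer` is made up for illustration, and the real proxy.aspx may compare the values differently.

```javascript
// Hypothetical check: accept a request only when its Referer header points
// at the same host that is serving the proxy.
function isSameHostReferer(refererHeader, requestHost) {
  if (!refererHeader) return false; // no Referer at all: treat as external
  try {
    // URL.host is already lowercased and includes any non-default port.
    return new URL(refererHeader).host === requestHost.toLowerCase();
  } catch (e) {
    return false; // unparseable Referer
  }
}
```

Keep in mind that a non-browser client can send any Referer it likes, so this keeps other sites' pages from leaning on your proxy but is not real access control.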

If you try to reference a very large binary file or something, it will fail. A maximum file size is enforced so as not to overload the Application in-memory collection that hosts the proxy cache.
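One way to enforce such a limit is to count bytes as the response arrives and bail out as soon as the cap is passed. The sketch below assumes the response shows up as a sequence of text chunks; `readWithLimit` and the cap you pass in are hypothetical, since the post doesn't say what maximum the proxy uses.

```javascript
// Hypothetical guard: concatenate response chunks, but give up as soon as
// the running total passes maxBytes, so an oversized file never reaches
// the in-memory cache.
function readWithLimit(chunks, maxBytes) {
  const encoder = new TextEncoder(); // used to measure size in UTF-8 bytes
  let total = 0;
  const parts = [];
  for (const chunk of chunks) {
    total += encoder.encode(chunk).length;
    if (total > maxBytes) {
      throw new RangeError("Response exceeds " + maxBytes + " bytes; refusing to cache it");
    }
    parts.push(chunk);
  }
  return parts.join("");
}
```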

This implementation doesn't work flawlessly; it's sort of a prototype thing. It only took about an hour to hack together (plus some time I spent struggling with Visual Studio puking on me), but anyway, here it is.



Extending XHTML Without A DTD

by Jon Davis 23. September 2007 23:33

Until Sprinkle I never did much with extending the HTML DOM with my own tags or attributes. When XML was introduced several years ago, people tried to "explain" it by just throwing custom tags into their HTML and saying, "This is what the new semantic web is gonna look like, see?"

<books><ol><book><li>My Book</li></book><book><li>My Other Book</li></book></ol></books>

Of course, that's not the greatest example, but at any rate, from this came XHTML, which basically told everyone to formalize this whole XMLization of HTML markup so that custom tags can be declared using a strict DTD extension methodology. Great idea, only instead of picking the ball up and running with it for the sake of extensibility, people ran the other way and enforced strictness alone. So XHTML turned out to be a strictness protocol rather than an extensibility format.

Literally, even the latest, shiniest new web browsers, except for Opera (congratulations, Opera), have trouble dealing with inline XHTML extensions. For example, the following at the top of the document causes a problem:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" ""[
    <!ATTLIST div src CDATA #IMPLIED >
    <!ATTLIST div anticache CDATA #IMPLIED >
    <!ATTLIST div wraptag CDATA #IMPLIED >
    <!ATTLIST div apply CDATA #IMPLIED >
    <!ATTLIST input anticache CDATA #IMPLIED >
    <!ATTLIST input apply CDATA #IMPLIED >
]>

The problem? Just go try and run that and you'll see what the problem is. The stupid web browser doesn't even speak XHTML. It sees those ATTLIST declarations and thinks, aww heck, this must be malformed HTML 4.01 markup, so it tries to "clean" it up in-memory by closing out the DOCTYPE before it reaches the "]>". So, when it does reach the "]>", it thinks, "Huh. Odd. What's that doing here? I haven't reached a <body> tag yet. That must be a markup error. I'll just go and 'clean' that up by moving it to the top of the body." So it gets rendered as text.

If you do a JavaScript alert(document.body.innerHTML); you'll see that it became content rather than being treated as an XHTML pre-parser definition. The W3C validator thinks it's just hunky dory, but IE7 / FF2 / Safari 3 simply don't have a clue. (Morons.)

But heck. It handles the custom tags without the declaration just fine. These browsers don't balk at the Sprinkle script when the XHTML extensions aren't declared. And the breaking point is just extra content, right?

So I "fixed" this by simply clearing that ugly bit out. Here we go:

function dtdExtensionsCleanup() {
    // tested on MSIE 6 & 7, Safari 3, Firefox 2
    if ((document.body.innerHTML.replace(/ /g, '').replace(/\n/g, "").substr(0, 5) == "]&gt;") ||
        (document.body.innerHTML.substr(0, 11) == "<!--ATTLIST" ||
         document.body.innerHTML.substr(0, 11) == "<!--ELEMENT")) {
        var subStrStartIndex = document.body.innerHTML.indexOf("&gt;",
            document.body.innerHTML.indexOf("]"));
        var subStrHtml = document.body.innerHTML.substring(subStrStartIndex + 4);
        document.body.innerHTML = subStrHtml;
    } else {
        // Opera 9.23 "just works"
    }
}
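The same logic can be pulled out into a function that works on a plain string, which makes it possible to try outside a browser. This is only a sketch of a variant: `stripLeakedDtd` is a hypothetical name, and the extra guard for markup without a "]" is an addition that the function above does not have.

```javascript
// Hypothetical pure-string variant of the cleanup: takes the serialized
// body markup and returns it without the leaked DTD subset.
function stripLeakedDtd(bodyHtml) {
  const compact = bodyHtml.replace(/[ \n]/g, "");
  const leaked =
    compact.startsWith("]&gt;") ||
    bodyHtml.startsWith("<!--ATTLIST") ||
    bodyHtml.startsWith("<!--ELEMENT");
  if (!leaked) return bodyHtml; // nothing leaked into the body

  // Everything up to and including the "&gt;" after the first "]" is debris.
  const closeBracket = bodyHtml.indexOf("]");
  const end = bodyHtml.indexOf("&gt;", closeBracket);
  if (closeBracket === -1 || end === -1) return bodyHtml; // unexpected shape: leave it alone
  return bodyHtml.substring(end + "&gt;".length);
}
```

In a page you would apply it as document.body.innerHTML = stripLeakedDtd(document.body.innerHTML);, which also reads innerHTML once instead of several times.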


Web Development

Sprinkle Javascript library

by Jon Davis 13. September 2007 16:06
<script src="sprinkle.js"></script>
<div src="info.html"></div>

This is CSI (Client-Side Includes), for when SSI (Server-Side Includes) is not available. You can also call it "sprinkle", as that's the name I gave the JavaScript library.
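For anyone wondering what a client-side include boils down to, here is a minimal sketch of the idea. It is not Sprinkle's actual source: `applyIncludes` and the `loadHtml` callback are hypothetical, and the loader is passed in so the logic can run without a browser.

```javascript
// Hypothetical sketch of a client-side include pass.
// elements: objects with getAttribute() and an innerHTML property
//           (real DOM elements qualify)
// loadHtml: function that takes a URL and returns the HTML text for it
function applyIncludes(elements, loadHtml) {
  let included = 0;
  for (const el of elements) {
    const src = el.getAttribute("src");
    if (!src) continue; // an ordinary element with nothing to include
    el.innerHTML = loadHtml(src); // drop the fetched markup into the element
    included++;
  }
  return included; // how many elements were filled in
}
```

In a page, the elements could come from document.querySelectorAll("div[src]") and the loader could wrap an XMLHttpRequest; both choices are assumptions about how you would wire it up.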


Computers and Internet | Software Development | Web Development



About the author

Jon Davis (aka "stimpy77") has been a programmer, developer, and consultant for web and Windows software solutions professionally since 1997, with experience ranging from OS and hardware support to DHTML programming to IIS/ASP web apps to Java network programming to Visual Basic applications to C# desktop apps.
Software in all forms is also his sole hobby, whether playing PC games or tinkering with programming them. "I was playing Defender on the Commodore 64," he reminisces, "when I decided at the age of 12 or so that I want to be a computer programmer when I grow up."

Jon was previously employed as a senior .NET developer at a very well-known Internet services company whom you're more likely than not to have directly done business with. However, this blog and all of have no affiliation with, and are not representative of, his former employer in any way.
