Wednesday, January 25, 2017

Fighting XSS with 🛡 Isolated Scripts

TL;DR: Here's a proposal for a new way to fight Cross-Site Scripting vulnerabilities called Isolated Scripts. There is an open-source prototype you can use to play with the idea. Please let me know what you think!

Summary

In the aftermath of all the Christmas CSP bypasses, a discussion came up with @mikewest and @fgrx on the merits of using the Isolated Worlds concept (explained below) as a security mitigation to fight XSS. It seemed like an interesting problem, so I spent some time looking into it.

The design described below would allow you (a web developer) to defend some of your endpoints against XSS with just two simple steps:
  1. Set a new cookie flag (isolatedScript=true).
  2. Set a single HTTP header (Isolated-Script: true).
And that's it. With those two things you could defend your applications against XSS. It is similar to Suborigins but Isolated Scripts defends against XSS even within the same page!
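
As a rough sketch of what those two steps might look like on the server, assuming a Node.js-style response object (with `setHeader` and `end`): the `Isolated-Script` header and the `isolatedScript` cookie flag exist only in this proposal, and the function names are made up for illustration.

```javascript
// Step 1: issue the session cookie with the proposed isolatedScript flag.
// The flag is hypothetical; a browser that doesn't know it would ignore it.
function startSession(res, sessionId) {
  res.setHeader(
    "Set-Cookie",
    `SID=${sessionId}; HttpOnly; Secure; SameSite=Strict; isolatedScript`
  );
}

// Step 2: serve the trusted script with the proposed Isolated-Script header,
// asking the browser to run it in an Isolated World.
function serveIsolatedScript(res, source) {
  res.setHeader("Content-Type", "application/javascript");
  res.setHeader("Isolated-Script", "true");
  res.end(source);
}
```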

And that's not all: another advantage is that it also mitigates the risk of third-party JavaScript code (such as ads, analytics, etcetera). Neat, isn't it?

Design

Isolated Scripts consists of three components that work together:
  1. Isolated Worlds
  2. Isolated Cookies
  3. Secret HTML
I describe each one of them below and then show you the demo.

🌍 Isolated Worlds


Isolated Worlds is a concept most prominently used today in Chrome Extensions user scripts, and Greasemonkey scripts in Firefox - essentially, they allow JavaScript code to get a "view" over the document's DOM but in an isolated JavaScript runtime. Let me explain:

Let's say that there is a website whose markup contains an element with the id "text", and whose own script defines a global variable named variable.

The Isolated World will have access to document.getElementById("text") but it will not have access to window.variable. That is, the only thing that both scripts share is the HTML's DOM, and each of them gets its own independent view of it. This isolation is very important, because user scripts have elevated privileges. For example, they can send XMLHttpRequest requests to any website and read the responses.
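
To make the idea concrete, here is a toy model using Node's `vm` module. It is only an analogy: `vm` is not a security boundary and this is not how Chrome implements Isolated Worlds. The `nodes` map standing in for the DOM and the `createWorld` helper are assumptions made up for this sketch.

```javascript
const vm = require("node:vm");

// Shared state standing in for the page's DOM (a toy model, not a real DOM).
const nodes = new Map([["text", { textContent: "hello" }]]);

// Each world gets its own globals and its own wrapper object over the shared nodes.
function createWorld() {
  return vm.createContext({
    document: { getElementById: (id) => nodes.get(id) },
  });
}

const pageWorld = createWorld();
const isolatedWorld = createWorld();

// The page changes the shared DOM, defines a global, and hijacks its own wrapper.
vm.runInContext('document.getElementById("text").textContent = "changed by the page";', pageWorld);
vm.runInContext('var variable = "page state";', pageWorld);
vm.runInContext('document.getElementById = function () { return "hijacked"; };', pageWorld);

// The isolated world sees the DOM change, but neither the global nor the hijack.
const variableType = vm.runInContext("typeof variable", isolatedWorld);
const text = vm.runInContext('document.getElementById("text").textContent', isolatedWorld);
console.log(variableType, "/", text);
```

The point of the sketch is only the shape of the guarantee: DOM state is shared, while globals and the wrappers around DOM methods are per world.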

If it wasn't for the Isolated World, then the page could do something like this to execute code in the user script, and attack the user:
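// Hostile page code: replace a DOM method that the privileged script is expected to call.
document.getElementById = function() {
    // arguments.callee.caller is the privileged function that made the call; its
    // constructor is that context's Function, so this compiles and runs attack() there.
    arguments.callee.caller.constructor("attack()")();
};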
In this way, the Isolated World allows us to defend our privileged execution context from the hostile user execution context. Now, the question is: Can we use the same design to protect trusted scripts from Cross-Site Scripting vulnerabilities?

That is, instead of focusing on preventing script execution as a mitigation (which we've found to be somewhat error-prone), why don't we instead focus on differentiating trusted scripts from untrusted scripts?

The idea of having privileged and unprivileged scripts running in the same DOM is not new. In fact, there are a few implementations out there (such as Mario Heiderich's Iceshield, and Antoine Delignat-Lavaud's Defensive JS), but they required rewriting code to withstand the hostile attacker. In Isolated Worlds, normal JavaScript code just works.

So, that is the idea behind Isolated Scripts - provide a mechanism for a web author to mark a specific script as trusted, which the browser will then run in an Isolated World.
An easy way to implement this in a demo is by actually reusing the Isolated Worlds implementation in Chrome Extensions, and simply installing a user script for every script with the right response header.

🍪 Isolated Cookies


Now that we have a script running in a trusted execution context, we need a way for the web server to identify requests coming from it. This is needed because the server might only want to expose some sensitive data to Isolated Scripts.

The easiest way to do so would be simply by adding a new HTTP request header similar to the Origin header (we could use Isolated-Script: foo.js for example). Another alternative is to create a new type of cookie that is only sent when the request comes from an Isolated Script. This alternative is superior to the HTTP header for two reasons:
  1. It is backwards compatible: browsers that don't implement it will just send the cookie as usual.
  2. It can work in conjunction with Same-site cookies (which mitigate CSRF as well).
To clarify, the web server would do this:
Set-Cookie: SID=XXXX; httpOnly; secure; SameSite=Strict; isolatedScript

The browser will then process the cookie as usual, except that it will only include it in requests made by the Isolated Script. Browsers that don't understand the new flag will always include it.
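
A minimal sketch of that browser-side decision, assuming the flag is spelled `isolatedScript` as in the header above. `parseSetCookie`, `shouldSendCookie` and the `fromIsolatedScript` field are hypothetical names, and the other cookie attributes (SameSite and so on) are deliberately ignored here.

```javascript
// Parse a Set-Cookie header value into a small record.
function parseSetCookie(header) {
  const [pair, ...attrs] = header.split(";").map((part) => part.trim());
  const eq = pair.indexOf("=");
  const flags = new Set(attrs.map((attr) => attr.split("=")[0].toLowerCase()));
  return {
    name: pair.slice(0, eq),
    value: pair.slice(eq + 1),
    isolatedOnly: flags.has("isolatedscript"), // proposed flag, not a standard one
  };
}

// Decide whether a stored cookie accompanies an outgoing request.
// A browser that doesn't understand the flag behaves as if it were absent.
function shouldSendCookie(cookie, request, browserSupportsFlag = true) {
  if (!browserSupportsFlag || !cookie.isolatedOnly) return true;
  return request.fromIsolatedScript === true;
}

const sid = parseSetCookie("SID=XXXX; httpOnly; secure; SameSite=Strict; isolatedScript");
console.log(shouldSendCookie(sid, { fromIsolatedScript: false })); // request from ordinary page script
console.log(shouldSendCookie(sid, { fromIsolatedScript: true }));  // request from the Isolated Script
```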

An easy way to implement this in a demo is to use a special name for the cookie instead of a flag, and to refuse to send the cookie unless the request comes from the isolated script.
One idea that Devdatta proposed was to make use of cookie prefixes, which could also protect the cookies from clobbering attacks.
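
The name-based variant could be sketched roughly as follows. The `__Isolated-` prefix is invented for this example (the text does not say which name the demo uses), and `filterCookieHeader` is a hypothetical helper that rewrites an outgoing Cookie header.

```javascript
// Hypothetical marker: cookies whose name starts with this are isolated-only.
const ISOLATED_PREFIX = "__Isolated-";

// Drop isolated-only cookies from a Cookie header unless the request
// was made by the isolated script.
function filterCookieHeader(cookieHeader, fromIsolatedScript) {
  return cookieHeader
    .split(";")
    .map((pair) => pair.trim())
    .filter((pair) => pair !== "")
    .filter((pair) => fromIsolatedScript || !pair.startsWith(ISOLATED_PREFIX))
    .join("; ");
}

console.log(filterCookieHeader("theme=dark; __Isolated-SID=XXXX", false));
console.log(filterCookieHeader("theme=dark; __Isolated-SID=XXXX", true));
```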

🙈 Secret HTML


What we have now is a mechanism for a script to have extra privileges and to be protected from hostile scripts by running in an isolated execution context. However, the script will of course want to display the data to the user, and if it writes the data to the DOM, a malicious script would immediately be able to read it. So we need a way for the Isolated Script to write HTML that hostile scripts can't read.

While this primitive might sound new, it actually already exists today! It's already possible for JavaScript code to write HTML that is visible to the user, but unavailable to the DOM. There are at least two ways to do this:
  1. Creating an iframe and then navigating the iframe to a data:text/html URL (it doesn't work in Firefox because they treat data: URLs as same-origin).
  2. Creating an iframe with a sandbox attribute without the allow-same-origin flag (works in Chrome and Firefox).
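
The second option might be sketched like this. `createSecretFrame` is a hypothetical helper that takes the document as a parameter; the secret is delivered with postMessage rather than through the `srcdoc` attribute, because attributes of the iframe element stay readable from the parent page. Posting to `"*"` is a simplification, and a real implementation would need a more careful channel.

```javascript
// Static bootstrap document: it contains no secret, only a listener that
// renders whatever HTML it is sent.
const BOOTSTRAP =
  '<script>addEventListener("message", (e) => { document.body.innerHTML = e.data.html; });<\/script>';

function createSecretFrame(doc, secretHtml) {
  const frame = doc.createElement("iframe");
  // sandbox without allow-same-origin: the frame gets an opaque origin,
  // so the parent page cannot reach into its document.
  frame.setAttribute("sandbox", "allow-scripts");
  frame.setAttribute("srcdoc", BOOTSTRAP);
  frame.addEventListener("load", () => {
    // An opaque origin cannot be named as a target, hence "*".
    frame.contentWindow.postMessage({ html: secretHtml }, "*");
  });
  return frame;
}
```

In a page this would be used as `container.appendChild(createSecretFrame(document, html))`.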
So, to recap, we can already create HTML nodes that are inaccessible to the parent page. The only issue left is perhaps how to make it easily backwards compatible. So we have two problems left:
  • CSS wouldn't be propagated down to the iframe, but to solve this problem we can propagate the calculated style down to the iframe's body, which will allow us to ensure that the text would look the same as if it was in the parent page (note, however that selectors wouldn't work inside it).
  • Events wouldn't propagate to the parent page, but to solve that problem we could just install a simple script that forwards all events from the iframe up to the parent document.
With this, the secret HTML would behave much like regular HTML, but without leaking significant information to the hostile script.
An easy way to implement this in a demo is to create a setter function on innerHTML, and whenever the isolated script tries to set innerHTML we instead create a sandboxed iframe with the content, to which we postMessage the CSS, the HTML, and a message channel that can be used to propagate events up from the iframe. To avoid confusing other code dependencies, we could create this iframe inside of a closed Shadow DOM.
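
A minimal sketch of the setter part, under the assumption that some `createFrame(html)` function returns the sandboxed iframe for a piece of HTML. Both `protectInnerHTML` and `createFrame` are hypothetical names, and this is not the demo extension's actual code.

```javascript
// Shadow the element's innerHTML accessor so that writes are diverted into
// a closed shadow root holding a sandboxed frame.
function protectInnerHTML(element, createFrame) {
  let root = null; // closed shadow root, reachable only through this closure

  Object.defineProperty(element, "innerHTML", {
    configurable: true,
    get() {
      return ""; // code reading innerHTML learns nothing about the secret
    },
    set(html) {
      // attachShadow may only be called once per element (and only some
      // element types support it), so the root is created lazily and reused.
      root = root || element.attachShadow({ mode: "closed" });
      root.replaceChildren(createFrame(html));
    },
  });
}
```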
One potential concern for the design of this feature that Malte brought up was that, depending on the implementation, this could hurt the developer experience, as some scripts most likely assume that content in the DOM is reachable (e.g., via querySelector, getElementsByTagName or otherwise). This is very important, and possibly the most valuable lesson to take: rather than having security folks like me design APIs with weird restrictions, we should also be asking authors what they need to do their work.

Demo

Alright! So now that I explained the concept and how it was implemented in the demo, it's time for you to see it in action.

First of all, to emulate the browser behavior you need to install a Chrome extension (don't worry, no scary permissions), and then just go to the "vulnerable website" and try to steal the contents of the XHR response! If you can steal them, you win a special Isolated Scripts 👕 T-Shirt my eternal gratitude 🙏 (and, of course, 🎤 fame & glory).

So, let's do it!
  1. Install Chrome Extension
  2. Go to Proof of Concept website
There are XSS vulnerabilities everywhere on the page (both DOM-based and reflected), and I disabled the XSS filter to make your life easier. I would be super interested to learn about different ways to break this!

Note that there are probably implementation bugs that wouldn't be an issue in a real browser implementation, but please let me know about them anyway (either on Twitter or in the comments below). While they don't negate the design, they are nevertheless something we should keep in mind and fix, to make this as close to reality as possible.

In case you need it, the source code of the extension is here and the changes required to disable CSP and enable Isolated Scripts instead are here.

🔍 Analysis

Now, I'm obviously biased on this, since I already invested some time in this idea, but I think it's at least promising and has some potential. That said, I want to do an analysis of its value and impact to avoid over-promising. I will use the framework to measure web security mitigations that I described in my previous blog post (but if you haven't read it, don't worry, I explain this below).

Moderation

Moderation stands for: How much are we limiting the impact of the problem?

In this case, the impact is extremely easy to measure (modulo implementation flaws).
Any secret data that is protected with Isolated Cookies is only exposed to Isolated Scripts. And any data touched by Isolated Scripts is hidden from XSS. So, the web author gets to decide the degree of moderation it requires.
One interesting caveat to this that Mario brought up, is that conducting phishing attacks using XSS would still be possible, and is very likely to result in compromise with a tiny bit of social engineering.

Minimization

Minimization stands for: How much are we minimizing the number of problems?

We can also measure this. Most XSS, including content sniffing and plugin-based SOP bypasses (even some types of universal XSS bugs!) can be mitigated with Isolated Scripts, but some types of DOM XSS can't.
The DOM XSS that are not mitigated by Isolated Scripts, are, for example, Angular Template Injections, and eval()-based XSS - that is because they still inherit the capabilities of the Isolated Script.
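
A small illustration of why eval()-based bugs fall outside the mitigation: whatever the trusted script evaluates runs with the trusted script's own access. The function name, the secret, and the `stolen` global are made up for this sketch.

```javascript
// Stand-in for a script running in the Isolated World.
function trustedScript(untrustedInput) {
  // Stand-in for data the script fetched with its isolated cookie.
  const secret = "response only the isolated script may see";
  // The bug: direct eval of attacker-controlled input. The evaluated code
  // runs in this scope, so it can read `secret` like any trusted code could.
  eval(untrustedInput);
}

// Attacker-supplied payload (for example, taken from the URL).
trustedScript("globalThis.stolen = secret;");
console.log(globalThis.stolen);
```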
I would love to hear of any other types of bugs that wouldn't be mitigated by Isolated Scripts.

Substitution

Substitution stands for: How much are we replacing risks with safer alternatives?

We can also quickly measure how much we are replacing the current risks with Isolated Scripts. In particular, adoption seems very easy although it has some problems:
  1. JavaScript code hosted in CDNs can't run in the Isolated World. This is working as intended, but also might limit the ease of deployment. One easy way to fix this is to use ES6 import statements.
  2. Code that expects to have access to "secret content" (like advertising, for example) won't be able to do so anymore and might fail in unexpected ways (note, however that in a browser implementation it might make sense to actually give access to the secret HTML to the Isolated script, if possible).
I would be super interested to hear any other problems you think developers might find in adoption.

Simplification

Simplification stands for: How much are we removing problems, rather than adding complexity?

Generally, we are adding complexity and removing complexity, and it's somewhat subjective to decide whether they cancel each other out or not.
  • On one hand, we are removing complexity by requiring all interactions with secret data to happen through a single funnel.
  • On the other hand, by adding yet another cookie flag and a new condition for script execution, and a new type of DOM isolation we are making the web platform more difficult to understand.
I honestly think that we are making this a bit more complicated. Not as much as other mitigations, but slightly so. I would be interested to hear any arguments on how "bad" this complexity would be for the web platform.

Conclusion

Thank you so much for reading so far, and I hope you found reading this as interesting as I found writing it.

I think a good next step from here is to hear more feedback on the proposal (from authors, browser vendors, and hackers!), and perhaps identify ways we can simplify it further, ways to break it, and ways to make it more developer friendly.

If you have any ideas, please leave comments below or on Twitter (or you can also email me).

Until next time 😸

Monday, January 23, 2017

Measuring web security mitigations

Summary: This past weekend I spent some time implementing a prototype for a web security mitigation, and I also spent some time thinking whether it was worth implementing as a web platform feature or not. In this blog post, I want to share how I approached the problem, which I hope you find useful, and to hopefully get your feedback about it too.

I've seen amazing progress on controlling the effects of XSS by adopting inherent safety in software engineering (a term which means focusing on completely eliminating the hazard) and I'm fairly confident that is today's most effective way to tackle it. However, there is always value in evaluating other types of controls beyond pure prevention - perhaps moving on to ways to minimize or contain its risk.

The way I see the problem is that the value of a mitigation can be measured by:
  • Impact - How many vulnerabilities was this mitigation designed to control?
  • Difficulty - What is the cost to adopt this mitigation on a system?
Measuring difficulty is somewhat easy: one should just try to apply the mitigation to real-life applications and see how difficult it is. Measuring impact, however, can be really difficult on a complex system.

The way this problem was approached in other fields was by measuring mitigations and controls across four metrics (wiki):
  • Moderation - How much are we limiting the impact of the problem.
  • Minimization - How much are we minimizing the number of problems.
  • Substitution - How much are we replacing risks with safer alternatives.
  • Simplification - How much are we removing problems, rather than adding complexity.
So, a naive way to look at this problem, is to evaluate the impact a mitigation has across these four metrics.

For example, I am a fan of Suborigins, an idea that aims to limit the impact a single XSS vulnerability can have by creating a more fine-grained concept of an origin. Suborigins is a good example of Moderation. It does not reduce the number of XSS vulnerabilities, it just makes it so that their impact is significantly reduced. On the other hand, we have the Angular sandbox - it aimed to limit the impact of the problem, but doing so effectively was extremely difficult, and eventually, the sandbox was removed completely.

Next is Minimization, and great examples of this are X-Content-Type-Options and X-Frame-Options. These are HTTP headers that allow a site owner to opt out of behavior that can cause clickjacking or some types of content sniffing vulnerabilities. If a website owner deploys these headers across their whole website, then the number of places that are likely to be affected can be drastically reduced. On the other hand, we have browser XSS filters and Web Application Firewalls. After many years, I think our industry reached consensus that they are not real security boundaries, and we largely stopped considering bypasses as security vulnerabilities.

Then we have mitigations that website owners can take that can Substitute a risky behavior with a less risky alternative. A good example of this is the use of JSON.parse instead of eval(). By providing a safe alternative to parse JSON content, the browsers have allowed website developers to write code that parses structured data without having to fully trust the data provider. On the other hand, we have DOM APIs (createElement / setAttribute / appendChild) as a replacement for innerHTML. While the use of DOM APIs is indeed safer, it's also so much more difficult and inconvenient that developers just don't use it - if the alternative is not easy to adopt, developers just won't adopt it.
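
The JSON.parse substitution is easy to see in a few lines. The payload strings and the `compromised` global are invented for illustration.

```javascript
const wellFormed = '{"user": "alice", "admin": false}';
// Not JSON at all: an expression with a side effect.
const hostile = "(globalThis.compromised = true, {})";

// JSON.parse only accepts data.
const parsed = JSON.parse(wellFormed);

let rejected = false;
try {
  JSON.parse(hostile);
} catch (err) {
  rejected = err instanceof SyntaxError; // hostile input is refused, nothing ran
}
const ranDuringParse = globalThis.compromised === true;

// The old eval-based idiom happily executes the same input.
eval("(" + hostile + ")");
const ranDuringEval = globalThis.compromised === true;

console.log(parsed.user, rejected, ranDuringParse, ranDuringEval);
```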

And finally, we reach Simplification. A great example of a good, simple API is httpOnly cookies: the flag does what it says, restricting cookies so that they are available over HTTP only and not to JavaScript, making credential stealing (and persistence) really hard in many cases. On the other hand, Ian Hixie eloquently explained the complexity problem with CSP back in 2009:
First, let me state up front some assumptions I'm making:
  • Authors will rely on technologies that they perceive are solving their problems,
  • Authors will invariably make mistakes, primarily mistakes of omission,
  • The more complicated something is, the more mistakes people will make. 
I think a valuable lesson (in retrospect) is that we should aim for baby steps (like httpOnly) that provide obvious simple benefits, and then build up on that, rather than big complex systems with dubious security benefits.


And that's it =) - The purpose of this blog post is not to make a scorecard of different mitigations and their merits, but rather to propose a common language and framework for us to discuss whether a mitigation is valuable or not, so that we can hopefully better focus our efforts on those that make the most sense for the internet.

Thanks for reading, and please let me know what you think below or on Twitter!