Showing posts with label javascript. Show all posts
Showing posts with label javascript. Show all posts

Tuesday, March 19, 2019

[🌐💧💥] HTTP Cache Cross-Site Leaks

In this blog post I want to talk about a cool type of attacks (XSLeaks) that are cooler than what most developers and security researchers might realize.

Almost 10 years ago, Chris Evans described an attack against Yahoo! Mail in which a malicious website could search the email inbox of a visitor to his website, and know if the search had returned results or not. This essentially could have allowed him to search the emails of the user word for word, and get to know a lot about the emails received by the user, from who, and when.

Chris did that by simply checking how long the server would take to respond to a search query (done through the victim's browser so it includes cookies), and it concluded that if it took longer to return, it must have been because the search query had results, and if it returned faster, it probably had no results.

the server has at least a 40ms difference in minimum latency between a query for a word not in the index, and a query for a word in the index
The attack was cool, but since it was based on network timing, it was a bit hard to pull off. Six years later, Nethanel Gelernter and Amir Herzberg went a bit deeper into this attack and named it XSSearch (and used statistics to make it more reliable). In the years that followed, the attacks steadily improved, going beyond timing, to browser "misfeatures" and bugs that made the attacks a lot more stable up to near-perfection. In other words, detecting if a search query has results or not is now almost trivial to accomplish through a variety of tricks.

Today I want to bring attention to one of the tricks, a very reliable mechanism to query the HTTP Cache in all browsers (with caveats). As far as I know, previous attacks in this area relied on timing (cached resources load faster than non-cached resources), and were mostly used for figuring out the victim's browser historygeographic location or fingerprinting.

However, while those attacks were interesting, an interesting variation is:
  1. Delete the cache for a specific resource (or list of resources).
  2. Force the browser to render a website (navigation, or prerender).
  3. Check if the browser cached the resource deleted at (1).
Note that what this variation gives you, is that it allows you to figure out if a website loads a specific resource (image, script, stylesheet, etc) or not. In other words, you can just ask the browser a question like:
When the user opens this website: https://www.facebook.com/me/friends will the profile picture of Chris Evans be requested? Or, in other words, am I friends with Chris Evans on Facebook?
Fortunately, Facebook seems to sign their URLs, so you probably can't do this attack so simply, but how about this instead?
When the user opens this website: https://www.facebook.com/groups/bugbountygroup/about will this script https://static.xx.fbcdn.net/rsrc.php/v3/yb/r/xxx.js be requested? Or in other words, do I have access to the Facebook bugbounty group?
This other attack would be more interesting, as long as there is a resource that loads if the user has access, but doesn't load if the user doesn't have access, then you can figure out if a user has access to something or not.

Again, fortunately Facebook actually preloads all scripts and images regardless of whether the user has access to private groups or not.

Now, the same technique applies to search results! You can query the browser and ask:
When the user opens this website: https://www.facebook.com/messages/?qa=indonesia will this script be requested? Or in other words, has the user talked about visiting Indonesia?
Again, fortunately Facebook actually doesn't issue search queries on messages and requires the user to "confirm" the search query :-).


As you can see, some websites have deployed protection against Cross-Site Leaks in the past year, some more effective than others, and I think that this is one of those few attacks that only large websites try to protect against, but most of the internet is still vulnerable to.

So, now that I've described the attack, here's how to do it! The summary of the trick is that it allows you to do two things:
  1. Delete the HTTP Cache for a specific resource or URL
  2. Query the HTTP Cache to see if the browser cached it
To delete the HTTP Cache, you just have to either issue a POST request to the resource, or use the fetch API with cache: "reload" in a way that returns an error on the server (eg, by setting an overlong HTTP referrer), which will lead to the browser not caching the response, and invalidating the previous cached response.

Then after you navigate the user to the site you want to query (either through link rel=prerender, or by navigating another window or frame), you check if the resource was cached or not. You can check if the resource was cached or not by doing the same trick with the overlong HTTP referrer, because if the resource was cached, it will load successfully, and if not, it'll fail. You can see an example of this attack here (that terjanq was nice enough to setup :-).

Here's a nicer explanation of the attack (with a happy cloud and everything):

Depending on who you are, you might either be excited about this, depressed, angry, or ambivalent, here are some suggestions on how to deal with this:
  1. If you are a website author, you might want to think about whether you have any of these leaks, and maybe try to protect against them (see defenses below).
  2. If you are a security researcher, you might want to check the websites you use to see if they are vulnerable (check caveats below).
  3. If you are a browser vendor, you might want to consider implementing double keyed cache (see caveats below).
You probably are wondering what are these caveats that I've been talking about, so here we go:
  • The attack is "complicated" for Safari users, the reason is because Safari has this thing called "Verified Partitioned Cache", which is a technique for preventing user tracking, but that also accidentally helps with this. The way it works is by keying the cache entries to their origin and to the site that loaded the resource. The attack is still possible (because the caching behavior is based on heuristics), but the details of this are probably worth a different blog post.
  • Chrome will hopefully not be vulnerable anymore - the reason is because Chrome is experimenting with "Split Disk Cache", which is somewhat different from Safari's, but has the side-effect of protecting against this attack. Note that this feature is currently behind a flag in Chrome (--enable-features=SplitCacheByTopFrameOrigin), so test it out and send feedback to Chrome =).
  • Firefox users are vulnerable, but they have a preference they can enable to get similar behavior - this is called "First Party Isolation", and is available as an add-on and as a pref (privacy.firstparty.isolate=true). It takes a similar approach to the one implemented in Chrome a few steps further, and splits not only cache but several other things (such as permissions!), so test it out too, and send feedback to Firefox.
And if you are a web developer and are thinking about ways to defend against this, well, I have good news and bad news:
  • You can just disable HTTP cache. This has some bandwidth and performance consequences, though, so maybe don't do that.
  • You can add CSRF tokens to everything. This breaks all bookmarks that your users might have set, so maybe don't do that.
  • You can use SameSite=strict cookies to authenticate users. This is actually quite surprisingly very well supported across browsers, and doesn't break bookmarks. Note, however, that there are some known bypasses (eg, if the site has some types of open redirects, as well as browser implementation bugs).
  • You can use COOP to slow down attackers (so every attack requires a click). Note however that it is only implemented in Firefox, and even in Firefox is behind a pref (browser.tabs.remote.useCrossOriginOpenerPolicy), so test it out and send feedback to Firefox =).
  • You can do all the crazy things that Facebook apparently tries to do to protect against this! Or take a look at this page with some more ideas.
🌐💧💥

I want to end this blog post by saying that HTTP cache is not the only leak there is, there are a ton more! So protecting against Cache is not enough, you can also detect the length of the page, the JS execution cycles, the content-type, the number of TCP sockets, and many more. if you are a security researcher, please contribute! The XSLeaks wiki is a joint effort among several security researchers (terjanq, lbherrera_, ronmasas, _tsuro, arthursaftnes, kkotowicz, and me) trying to explore the limits of cross-site leaks, and hopefully by working together we can come up with better attacks :-). Contact me on twitter if you want to contribute (DMs are open).

Thanks for reading!

Wednesday, February 08, 2017

🤷 Unpatched (0day) jQuery Mobile XSS

TL;DR - Any website that uses jQuery Mobile and has an open redirect is now vulnerable to XSS - and there's nothing you can do about it, there's not even patch  ¯\_(ツ)_/¯ .

jQuery Mobile is a cool jQuery UI system that makes building mobile apps easier. It does some part of what other frameworks like Ember and Angular do for routing. Pretty cool, and useful. Also vulnerable to XSS.

While researching CSP bypasses a few months ago, I noticed that jQuery Mobile had this funky behavior in which it would fetch any URL in the location.hash and put it in innerHTML. I thought that was pretty weird, so decided to see if it was vulnerable to XSS.

Turns out it is!

The bug


The summary is:

  1. jQuery Mobile checks if you have anything in location.hash.
  2. If your location.hash looks like a URL, it will try to set history.pushState on it, then it will do an XMLHttpRequest to it.
  3. Then it will just innerHTML the response.
As a strange saving grace, actually, the fact that it tries to call history.pushState first, makes the attack a little bit harder to accomplish, since you can't set history.pushState to cross-origin URLs, so in theory this should be safe.

But it isn't, because if you have any open redirect you suddenly are vulnerable to XSS. Since the open redirect would be same origin as far as history.pushState is concerned.

So.. you want to see a demo, I'm sure. Here we go:
http://jquery-mobile-xss.appspot.com/#/redirect?url=http://sirdarckcat.github.io/xss/img-src.html
The code is here (super simple).

The disclosure


Fairly simple bug, super easy to find! I wouldn't be surprised if other people had found out about this already. But I contacted jQuery Mobile, and told them about this, and explained the risk.

The jQuery Mobile team explained that they consider the Open Redirect to be the vulnerability, and not their behavior of fetching and inlining, and that they wouldn't want to make a change because that might break existing applications. This means that there won't be a patch as far as I have been informed. The jQuery mobile team suggests to 403 all requests made from XHR that might result in a redirect.

This means that every website that uses jQuery Mobile, and has any open redirect anywhere is vulnerable to XSS.

Also, as a bonus, even if you use a CSP policy with nonces, the bug is still exploitable today by stealing the nonce first. The only type of CSP policy that is safe is one that uses hashes or whitelists alone.

The victim

jQuery Mobile is actually pretty popular! Here's a graph of Stack Overflow questions, over time.

And here's a graph of jQuery Mobile usage statistics over time as well:

You can recreate these graphs here and here. So, we can say we are likely to see this in around 1 or 2 websites that we visit every week. Pretty neat, IMHO.

I don't know how common are open redirects, but I know that most websites have them. Google doesn't consider them vulnerabilities (disclaimer, I work in Google - but this is a personal blog post), but OWASP does (disclaimer, I also considered them to be vulnerabilities in 2013). So, in a way, I don't think jQuery Mobile is completely wrong here on ignoring this.

Now, I anyway wanted to quantify how common it is to have an open redirect, so I decided to go to Alexa and list an open redirect for some of the top websites. Note that open redirect in this context includes "signed" redirects, since those can be used for XSS.

Here's a list from Alexa:
  1. Google
  2. YouTube
  3. Facebook
  4. Baidu
  5. Yahoo
I also thought it would be interesting to find an open redirect on jQuery's website, to see if a random site and not just the top might have one, and while I did find they use Trac which has an Open Redirect, I wasn't able to test it because I don't have access to their bug tracker =(.

Conclusion

One opportunity for further research, if you have time in your hands is to try to find a way to make this bug work without the need of an Open Redirect. I tried to make it work, but it didn't work out.

In my experience, Open Redirects are very common, and they are also a common source of bugs (some of them cool). Perhaps we should start fixing Open Redirects. Or perhaps we should be more consistent on not treating them as vulnerabilities. Either way, for as long as we have this disagreement in our industry, we at least get to enjoy some XSS bugs.

If you feel motivated to do something about this, the jQuery team suggested to send a pull request to their documentation to warn developers of this behavior, so I encourage you to do that! Or just help me out spread the word of this bug

Thanks for reading, and I hope you liked this! If you have any comments please comment below or on Twitter. =)

Wednesday, January 25, 2017

Fighting XSS with 🛡 Isolated Scripts

TL;DR: Here's a proposal for a new way to fight Cross-Site Scripting vulnerabilities called Isolated Scripts. You have an open-source prototype to play with the idea. Please let me know what you think!

Summary

In the aftermath of all the Christmas' CSP bypasses, a discussion came up with @mikewest and @fgrx on the merits of using the Isolated Worlds concept (explained below) as a security mitigation to fight XSS. It seemed like an interesting problem, so I spent some time looking into it.

The design described below would allow you (a web developer) to defend some of your endpoints against XSS with just two simple steps:
  1. Set a new cookie flag (isolatedScript=true).
  2. Set a single HTTP header (Isolated-Script: true).
And that's it. With those two things you could defend your applications against XSS. It is similar to Suborigins but Isolated Scripts defends against XSS even within the same page!

And that's not it, another advantage is that it also mitigates third-party JavaScript code (such as Ads, Analytics, etcetera). Neat, isn't it?

Design

The design of Isolated Scripts consists of three components that work together to deliver the Isolated Scripts proposal.
  1. Isolated Worlds
  2. Isolated Cookies
  3. Secret HTML
I describe each one of them below and then show you the demo.

🌍 Isolated Worlds


Isolated Worlds is a concept most prominently used today in Chrome Extensions user scripts, and Greasemonkey scripts in Firefox - essentially, they allow JavaScript code to get a "view" over the document's DOM but in an isolated JavaScript runtime. Let me explain:

Let's say that there is a website with the following code:


The Isolated World will have access to document.getElementById("text") but it will not have access to window.variable. That is, the only thing that both scripts share is an independent view of the HTML's DOM. This isolation is very important, because user scripts have elevated privileges, for example, they can trigger XMLHttpRequests requests to any website and read their responses.

If it wasn't for the Isolated World, then the page could do something like this to execute code in the user script, and attack the user:
document.getElementById = function() {
    arguments.callee.caller.constructor("attack()")();
};
In this way, the Isolated World allows us to defend our privileged execution context from the hostile user execution context. Now, the question is: Can we use the same design to protect trusted scripts from Cross-Site Scripting vulnerabilities?

That is, instead of focusing on preventing script execution as a mitigation (which we've found out to be somewhat error prone), why don't we instead focus on differentiating trusted scripts from untrusted scripts?

The idea of having privileged and unprivileged scripts running in the same DOM is not new, in fact, there are a few implementations out there (such as Mario Heiderich's Iceshield, and Antoine Delignat-Lavaud Defensive JS), but their implementation required rewriting code to overcome the hostile attacker. In Isolated Worlds, normal JavaScript code just works.

So, that is the idea behind Isolated Scripts - provide a mechanism for a web author to mark a specific script as trusted, which the browser will then run in an Isolated World.
An easy way to implement this in a demo is by actually reusing the Isolated Worlds implementation in Chrome Extensions, and simply install a user script for every script with the right response header.

🍪 Isolated Cookies


Now that we have a script running in a trusted execution context, we need a way for the web server to identify requests coming from it. This is needed because the server might only want to expose some sensitive data to Isolated Scripts.

The easiest way to do so would be simply by adding a new HTTP request header similar to the Origin header (we could use Isolated-Script: foo.js for example). Another alternative is to create a new type of cookie that is only sent when the request comes from a Isolated Script. This alternative is superior to the HTTP header for two reasons:
  1. It is backwards compatible, browsers that don't implement it will just send the cookie as usual.
  2. It can work in conjunction with Same-site cookies (which mitigates CSRF as well).
To clarify, the web server would do this:
Set-Cookie: SID=XXXX; httpOnly; secure; SameSite=Strict; isolatedScript

And the browser will then process the cookie as usual, except that it will only include it in requests if they are made by the Isolated Script. And browsers that don't understand the new flag will always include them.

An easy way to implement this in a demo is to instead of using flags, using a special name in the cookie, and refuse to send the cookie except for cases when the request comes from the isolated script.
One idea that Devdatta proposed was to make use of cookie prefixes, which could also protect the cookies from clobbering attacks.

🙈 Secret HTML


What we have now is a mechanism for a script to have extra privileges, and be protected from hostile scripts by running in an isolated execution context, however, the script will, of course want to display the data to the user, and if the script writes it to the DOM, the malicious script would immediately be able to read it. So, for that, we need a way for the Isolated Script to write HTML that the hostile scripts can't read.

While this primitive might sound new, it actually already exists today! It's already possible for JavaScript code to write HTML that is visible to the user, but unavailable to the DOM. There are at least two ways to do this:
  1. Creating an iframe and then navigating the iframe to a data:text/html URL (it doesn't work in Firefox because they treat data: URLs as same-origin).
  2. Creating an iframe with a sandbox attribute without the allow-same-origin flag (works in Chrome and Firefox).
So, to recap, we can already create HTML nodes that are inaccessible to the parent page. The only issue left is perhaps how to make it easily backwards compatible. So we have two problems left:
  • CSS wouldn't be propagated down to the iframe, but to solve this problem we can propagate the calculated style down to the iframe's body, which will allow us to ensure that the text would look the same as if it was in the parent page (note, however that selectors wouldn't work inside it).
  • Events wouldn't propagate to the parent page, but to solve that problem we could just install a simple script that forwards all events from the iframe up to the parent document.
With this, the behavior would be fairly similar to the secret HTML but without providing a significant information leak on to the hostile script. 
An easy way to implement this in a demo is to create a setter function on innerHTML, and whenever the isolated script tries to set innerHTML we instead create a sandboxed iframe with the content, to which we postMessage the CSS and the HTML and a message channel that can be used to propagate events up the iframe. To avoid confusing other code dependencies, we could create this iframe inside of a closed Shadow DOM.
One potential concern for the design of this feature that Malte brought up was that depending on the implementation this could potentially mess up with developer experience, as some scripts most likely assume that code in the DOM is reachable (eg, via querySelector, getElementsByTagName or otherwise). This is very important, and possibly the most valuable lesson to take - rather than having security folks like me design APIs with weird restrictions, we should also be asking authors what they need to do their work.

Demo

Alright! So now that I explained the concept and how it was implemented in the demo, it's time for you to see it in action.

First of all, to emulate the browser behavior you need to install a chrome extension (don't worry, no scary permissions), and then just go to the "vulnerable website" and try to steal the contents of the XHR response! If you can steal them, you win a special Isolated Scripts 👕 T-Shirt my eternal gratitude 🙏 (and, of course 🎤 fame & glory).

So, let's do it!
  1. Install Chrome Extension
  2. Go to Proof of Concept website
There are XSS everywhere on the page (both DOM and reflected), and I disabled the XSS filter to make your life easier. I would be super interested to learn about different ways to break this!

Note that there are probably implementation bugs that wouldn't be an issue in a real browser implementation, but please let me know about them anyway (either on twitter or on the comments below), as while they don't negate the design, they are nevertheless something we should keep in mind and fix for the purpose of making this as close to reality as possible.

In case you need it, the source code of the extension is here and the changes required to disable CSP and enable Isolated Scripts instead is here.

🔍 Analysis

Now, I'm obviously biased on this, since I already invested some time on this idea, but I think it's at least promising and has some potential. That said, I want to do an analysis on it's value and impact to avoid over-promising. I will use the framework to measure web security mitigations that I described in my previous blog post (but if you haven't read it, don't worry, I explain this below).

Moderation

Moderation stands for: How much are we limiting the impact of the problem?

In this case, the impact is extremely easy to measure (modulo implementation flaws).
Any secret data that is protected with Isolated Cookies is only exposed to Isolated Scripts. And any data touched by Isolated Scripts is hidden from XSS. So, the web author gets to decide the degree of moderation it requires.
One interesting caveat to this that Mario brought up, is that conducting phishing attacks using XSS would still be possible, and is very likely to result in compromise with a tiny bit of social engineering.

Minimization

Minimization stands for: How much are we minimizing the number of problems?

We can also measure this. Most XSS, including content sniffing and plugin-based SOP bypasses (even some types of universal XSS bugs!) can be mitigated with Isolated Scripts, but some types of DOM XSS aren't.
The DOM XSS that are not mitigated by Isolated Scripts, are, for example, Angular Template Injections, and eval()-based XSS - that is because they still inherit the capabilities of the Isolated Script.
I would love to hear of any other types of bugs that wouldn't be mitigated by Isolated Scripts.

Substitution

Substitution stands for: How much are we replacing risks with safer alternatives?

We can also quickly measure how much we are replacing the current risks with Isolated Scripts. In particular, adoption seems very easy although it has some problems:
  1. JavaScript code hosted in CDNs can't run in the Isolated World. This is working as intended, but also might limit the ease of deployment. One easy way to fix this is to use ES6 import statements.
  2. Code that expects to have access to "secret content" (like advertising, for example) won't be able to do so anymore and might fail in unexpected ways (note, however that in a browser implementation it might make sense to actually give access to the secret HTML to the Isolated script, if possible).
I would be super interested to hear any other problems you think developers might find in adoption.

Simplification

Simplification stands for: How much are we removing problems, rather than adding complexity?

Generally, we are adding complexity and removing complexity, and it's somewhat subjective to decide whether they cancel each other out or not.
  • On one hand, we are removing complexity by requiring all interactions with secret data to happen through a single funnel.
  • On the other hand, by adding yet another cookie flag and a new condition for script execution, and a new type of DOM isolation we are making the web platform more difficult to understand.
I honestly think that we are making this a bit more complicated. Not as much as other mitigations, but slightly so. I would be interested to hear any arguments on how "bad" this complexity would be for the web platform.

Conclusion

Thank you so much for reading so far, and I hope you found reading this as interesting as I found writing it.

I think a good next step from here is to hear more feedback on the proposal (from both, authors, browsers and hackers!), and perhaps identify ways we can simplify it further, ways to break it, and ways to make it more developer friendly.

If you have any ideas, please leave comments below or on twitter (or you can also email me).

Until next time 😸

Wednesday, May 27, 2015

[Service Workers] Secure Open Redirect becomes XSS Demo

This is the shortest delay between blog posts I've had in a while, but I figured that since my  last post had some confusing stuff, it might make sense to add a short demo. The demo application has three things that enable the attack:

  1. An open redirect. Available at /cgi-bin/redirect?continue=.
  2. A Cache Service Worker. Available at /sw.js.
  3. A page that embeds images via <img crossorigin="anonymous" src="" />.
And the attacker's site has one thing:
  1. A CORS enabled attack page. Available at /cgi-bin/attack.
Let's do the attack then! For it to work we need two things to happen:
  • The service worker must be installed.
  • A request to our attack page must be cached with mode=cors.
Our index.html page will do both of those things for us.
Poison Cache

Image URL:



When you click submit above the following things will happen:
  1. A service worker will be installed for the whole origin.
  2. An image tag pointing to the open redirect will be created.
  3. The service worker will cache the request with the CORS response of the attacker.
If all went well, the page above should be "poisoned", and forever (or until the SW updates it's cache) you will get the XSS payload. Try it (note that you must have first "poisoned" the response above, if you click the button below here before poisoning the cache first, you will ruin the demo for ever since an opaque response will get cached ;):
If the demo doesn't work for you, that probably means you navigated to the attack page before caching the CORS response. If that's the case, to clear the cache.

Note you need Chrome 43 or later for this to work (or a nightly version of Firefox with the flags flipped). Hope this clarifies things a bit!

Monday, May 25, 2015

[Service Workers] New APIs = New Vulns = Fun++


Just came back from another great HackPra Allstars, this time in the beautiful city of Amsterdam. Mario was kind enough to invite me to ramble about random security stuff I had in mind (and this year it was Service Workers). The presentation went OK and it was super fun to meet up with so many people, and watch all those great presentations.

In any case, I promised to write a blog post to repeat the presentation but in a written form, however I was hoping to watch the video to see what mistakes I might have made and correct them in the blog post, and the video is not online yet. Anyway, so, until then, today this post is about something that wasn't mentioned in the talk.

This is about what type of vulnerabilities applications using Service Workers are likely to create. To start, I just want to say that I totally love Service Workers! It's APIs are easy to use and understand and the debugging tools in Chrome are beautiful and well done (very important since tools such as Burp or TamperChrome wouldn't work as well with offline apps as no requests are actually done.) You can clearly see that a lot of people have thought a lot about this problem, and there is some serious effort around making offline application development possible.

The reason this blog post content wasn't part of the presentation is because I thought it already had too much content, but if you don't know how Service Workers work, you might want to see the first section of the slides (as it's an introduction) or you can read this article or this video.

Anyway, so back to biz.. Given that there aren't many "real" web applications using Service Workers, I'll be using the service-worker samples from the w3c-webmob, but as soon as you start finding real service worker usage take a look and let me know if you find these problems too!

There are two main categories of potential problems I can see:
  1. Response and Caching Issues
  2. Web JavaScript Web Development

Response and Caching Issues

The first "category" or "family" of problems are response and caching issues. These are issues that are present because of the way responses are likely to be handled by applications.

Forever XSS

The first problem, that is probably kind-of bad, is the possibility of making a reflected XSS into a persistent XSS. We've already seen this type of problems based on APIs like localStorage before, but what makes this difference is that the Cache API is actually designed to do exactly that!

When the site is about to request a page from the web, the service worker is consulted first. The service worker at this point can decide to respond with a cache response (it's logic is totally delegated to the Service Worker). One of the main use-cases for service workers is to serve a cached-up copy of the request. This cache is programmatic and totally up-to the application to decide how to handle it.

One coding pattern for Service Workers is to respond with whatever is in the cache, if available or, to make a request if not (and cache the response). In other words, in those cases, no request is ever made to the server if the request matches a cached response. Since this Cache database is accessible to any JS code running in the same origin, an attacker can pollute the Cache with whatever it wants, and let the client serve the malicious XSS for ever!

If you want to try this out, go here:

Then fake an XSS by typing this to the console:
caches.open('prefetch-cache-v1').then(function(cache){cache.put(new Request('index.html', {mode: 'no-cors'}), new Response('\x3cscript>alert(1)\x3c/script>', {headers: {'content-type': 'text/html'}}))})

Now whenever you visit that page you will see an alert instead! You can use Control+Shift+R to undo it, but when you visit the page again, the XSS will come back. It's actually quite difficult to get rid of this XSS. You have to manually delete the cache :(.

This is likely to be an anti-pattern we'll see in new Service-Workers enabled applications often. To prevent this, one of the ideas from Jake Archibald is to simply check the "url" property of the response. If the URL is not the same as the request, then simply not use it, and discard it from the cache. This idea is quite important actually, and I'll explain why below.

When Secure Open Redirects become XSS

A couple years ago in a presentation with thornmaker we explained how open redirects can result in XSS and information leaks, and they were mostly limited to things such as redirecting to insecure protocols (like javascript: or data:). Service Workers, however, and in specific some of the ways that the Cache API is likely to be used, introduce yet another possibility, and one that is quite bad too.

As explained before, the way people seem to be coding over the Cache API is by reading the Request, fetching it, and then caching the Response. This is tricky, because when a service worker renders a response, it renders it from the same origin it was loaded from.

In other words.. let's say you have a website that has an open redirect..
  • https://www.example.com/logout?next=https://www.othersite.com/
And you have a service worker like this one:
  • https://googlechrome.github.io/samples/service-worker/read-through-caching/service-worker.js
What that service worker does, as the previous one, is simply cache all requests that go through by refetching and the service worker will cache the response from evilwebsite.com for the other request!

That means that next time the user goes to:
  • https://www.example.com/logout?next=https://www.evilwebsite.com
The request will be cached and instead of getting a 302 redirect, they will get the contents of evilwebsite.com.

Note that for this to work, evilwebsite.com must include Access-Control-Allow-Origin: * as a response header, as otherwise the request won't be accepted. And you need to make the request be cors enabled (with a cross origin image, or embed request for example).

This means that an open redirect can be converted to a persistent XSS and is another reason why checking the url property of the Response is so important before rendering it (both from Cache and on code like event.respondWith(fetch(event.request))). Because even if you have never had an XSS, you can introduce one by accident. If I had to guess, almost all usages of Service Workers will be vulnerable to one or another variation of these types of attacks if they don't implement the response.url check.

There's a bunch of other interesting things you can do with Service Workers and Requests / Responses, mentioned in the talk and that I'll try to blog about later. For now though, let's change subject.

Web JavaScript Web Development

This will sound weird to some, but the JavaScript APIs in the browser don't actually include any good web APIs for building responses. What this means is that because Service Workers will now be kind-of like Web Servers, but are being put there without having any APIs for secure web development, then it's likely they will introduce a bunch of cool bugs.

JavaScript Web APIs aren't Web Service APIs


For starters, the default concerns are things that affect existing JS-based server applications (like Node.js). Things from RegExp bugs because the JS APIs aren't secure by default to things like the JSON API lack of encoding.

But another more interesting problem is that the lack of templating APIs in Service Workers means people will write code like this which of course means it's gonna be full of XSS. And while you could import a templating library with strict contextual auto-escaping, the lack of a default library means people are just likely to use insecure alternatives or just default to string concatenation (note things like angular and handlebars won't work here, because they work at the DOM level and the Service Workers don't have access to the DOM as they run way before the DOM is created).

It's also worth noting that thanks to the asynchronous nature of Service Workers (event-driven and promise-based) mixed with developers that aren't used to this, is extremely likely to introduce concurrency vulnerabilities in JS if people abuse the global scope, which while isn't the case today, is very likely to be the case soon.

Cross Site Request Forgery


Another concern is that CSRF protection will have to behave differently for Service Workers. In specific, Service Workers will most likely have to depend more on referrer/origin based checks. This isn't bad in and off itself, but mixing this with online web applications will most likely pose a challenge to web applications. Funnily, none of the demos I found online have CSRF protection, which remarks the problem, but also makes it hard for me to give you examples on why it's gonna be hard.

To give a specific example, if a web application is meant to work while offline, how would such application keep track of CSRF tokens? Once the user comes online, it's possible the CSRF tokens won't be valid anymore, and the offline application will have to handle that gracefully. While handling the fallback correctly is possible by simply doing different checks depending on the online state of the application, it's likely more people will get it wrong than right.

Conclusion

I'm really excited to see what type of web applications are developed with Service Workers, and also about what type of vulnerabilities they introduce :). It's too early to know exactly, but some anti-patterns are clearly starting to emerge as a result of the API design, or of the existing mental models on developers' heads.

Another thing I wanted to mention, before closing up is that I'll start experimenting writing some of the defense security tools I mentioned in the talk, if you want to help out as well, please let me know!

Saturday, May 31, 2014

[Matryoshka] - Web Application Timing Attacks (or.. Timing Attacks against JavaScript Applications in Browsers)

Following up on the previous blog post about wrapping overflow leak on frames, this one is also regarding the presentation Matryoshka that I gave in Hamburg during HackPra All Stars 2013 during Appsec Europe 2013.

The subject today is regarding web application timing attacks. The idea behind this attack is a couple of techniques on how to attack another website by using timing attacks. Timing attacks are very popular in cryptography, and usually focus on either remote services, or local applications. Recently, they've become a bit popular, and are being used to attack browser security features.


Today the subject I want to discuss is not attacking the browser, or the operating system, but rather, other websites. In specific, I want to show you how to perform timing attacks cross-domain whenever a value you control is used by another web application.


The impact of successfully exploiting this vulnerability varies depending on the application being attacked. To be clear, what we are attacking are JavaScript heavy applications that handle data somehow controlled by another website, such as cookies, the URL, referrer, postMessage, etc..


Overall though, the attack itself isn't new, and the application of the attack on browsers is probably predictable.. but when discussed it's frequently dismissed as unexploitable, or unrealistic. Now, before I get your hopes up, while I think it's certainly exploitable and realistic, it isn't as straightforward as other attacks.


We know that you can obviously know if your own code runs faster or slower, but, as some of you might want to correct me.. that shouldn't matter. Why? Well, because of the Same Origin Policy. In case you are not aware, the Same Origin Policy dictates that https://puppies.com cant interact with https://kittens.com. It defines a very specific subset of APIs where communication is safe (like, say, navigation, or postMessage, or CORS).

The general wisdom says that to be vulnerable to information leak attacks, you need to opt-in to be vulnerable to them. That is, you have to be doing something stupid for stupid things to happen to you. For example, say you have code that runs a regular expression on some data, like:

onmessage = function(e) {
    if (document.body.innerHTML.match(new RegExp(e.data))) {
        e.source.postMessage('got a match!', e.source);
    }
}

Well, of course it will be possible to leak information.. try it out here:

  • The secret is 0.0
  • The secret is 0.1
  • The secret is 0.2
  • ...
  • The secret is 0.9
Regexp:


You can claim this code is vulnerable and no one would tell you otherwise. "You shouldn't be writing code like this (tm)".

Now, what about this?

onmessage = function(e) {
    if (document.body.innerHTML.match(new RegExp(e.data))) {
        console.log('got a match');
    }
    e.source.postMessage('finished', e.source);
}

Well, now it's tricky, right? In theory, this script isn't leaking any information to the parent page (except of course, how long it took to run the regular expression). Depending on who you ask, they will tell you this is a vulnerability (or.. not).

Now, it clearly is vulnerable, but let's figure out exactly why. There is one piece of information that is leaked, and that is, how long it took to run the regular expression. We control the regular expression, so we get to run any regular expression and we will get to learn how long it took to run.

If you are familiar with regular expressions, you might be familiar with the term "ReDoS", which is a family of regular expressions that perform a Denial of Service attack on the runtime because they run in exponential time. You can read more about it here. Well, the general premise is that you can make a regular expression take for-ever if you want to.

So, with that information, we can do the same attack as before, but this time, we will infer if we had the right match simply by checking if it took over N seconds to respond to our message, or well... if it returned immediately.

Let's give it a try:


  • The secret is 0.0[0-9.]+(.{0,100}){12}XXX
  • The secret is 0.1[0-9.]+(.{0,100}){12}XXX
  • The secret is 0.2[0-9.]+(.{0,100}){12}XXX
  • ...
  • The secret is 0.9[0-9.]+(.{0,100}){12}XXX
Regexp:



Did you notice that the right answer took longer than the others?

If you are an average security curmudgeon you will still say that the fault is at the vulnerable site, since it's opting-in to leak how long it took to run the regular expression. All developers are supposed to take care of these types of problems, so why not JavaScript developers?

Alright then, let's fix up the code:
onmessage = function(e) {
    if (document.body.innerHTML.match(new RegExp(e.data))) {
        console.log('got a match');
    }
}

That clearly has no feedback loop back to the parent page. Can we still attack it?

Well, now it gets tricky. When I told you that the Same Origin Policy isolates kittens.com from puppies.com I wasn't being totally honest. The JavaScript code from both, kittens.com and puppies.com run in the same thread. That is, if you have 3 iframes one pointing to puppies.com, one pointing to kittens.com and one pointing to snakes.com then all of them run in the same thread.

Said in another way, if snakes.com suddenly decides to run into an infinite loop, neither puppies.com nor kittens.com will be able to run in any JavaScript code. Don't believe me? Give it a try!




.

Did you notice that when you looped the "snakes" the counters in the other two iframes stopped? That's because both, the snakes and the puppies and the kittens all run in the same thread, and if one script keeps the thread busy, all the other scripts are paused.

Now, with that piece of information, a whole new world opens upon us. Now, we don't need the other page surrender any information to us. Simply by running in the same thread we do, we can guess how long their code takes to run!

One way to attack this is by simply asking the browser to run a specific piece of code every millisecond, and whenever we don't run, that means there's some other code keeping the interpreter busy at the time. We keep track of these delays and we then learn how long the code took to run.

Alright, but you might feel a bit disappointed, since it's really not common to have code that runs regular expressions on arbitrary content.. so this attack is kinda lame..

Well, not really. Fortunately there are plenty of web applications that will run all sorts of interesting code for us. When I described the attack to Stefano Di Paola (the developer of DOMinator), he told me that he always wanted to figure out what you could do with code that does jQuery(location.hash). This used to be an XSS, but then jQuery fixed it.. so no-more-XSS and if the code starts with a '#', then it is forced as a CSS selector.

He said that perhaps jQuery could leak timing information based on how long it took for it to run a specific selector over the code. I told him that was a great idea, and started looking into it. Turns out, there is a "contains" selector in jQuery that will essentially look at all the text content in the page and try to figure out what nodes "contain" such text.

It essentially finds a node, and then serializes it's text contents, and then searches for such value inside it. In a nutshell it does:

haystack.indexOf(needle);

Which, is interesting, but it doesn't have the ReDoS vector we had with regular expressions. Figuring out if there was a match is significantly more complicated.. We need to know if the "haystack" found something (or not) which sounds, well.. complicated.

Can we get such granularity? Is it possible to detect the difference between "aaaa".indexOf("b") and "aaaa".indexOf("a")? There's clearly a running time difference, but the difference is so small we might not be able to measure it.

There is another, even cooler selector, the "string prefix" selector. Say, you have:

You can match it with:
input[name=csrftoken][value^=X]

But again, this is even more difficult than the indexOf() attack we were doing earlier.. The question now is can we detect string comparisons? Can we get enough accuracy out of performance.now() that we get to know if string comparisons succeed?

To make the question clearer.. can we measure in JavaScript the difference between "wrong" == "right" and "right" == "right"?

In theory, it should be faster to compare "wrong" to "right" because the runtime can stop the comparison on the first character, while to be sure that "right" is equal to "right", then it needs to compare every character in the string to ensure it's exactly the same.

This should be easy to test. In the following iframe we make a quick experiment and measure:

  • Time taken to compare "aaaa.....a" (200k) to "foobar"
  • Time taken to compare "aaaa.....a" to "aaaa.....a"
We make the comparison 100 times to make the timing difference more explicit.

Unless something's gone totally wrong, you should see something like:
aaa..a == foobar : 0ms
aaa..a == aaa..a : 60ms
This strangely looking results mean, mostly, that comparing a very long string takes significantly more time than comparing two obviously different strings. The "obviously", as we will learn, is the tricky part.

What the JavaScript runtimes do is first, they try to take a shortcut. If the lengths are different, then they return false immediately. If they are the same, on the other hand, then they compare char-by-char. As soon as they find a character that doesn't match, it returns false.

But is it significant? And if so, how significant is it? To try and answer that question I ran several experiments. For instance, it's 2-10 times faster to make a "false" comparison, than a "true" comparison with just 18 characters. That means "xxxxxxxxxxxxxxxxxx" == "xxxxxxxxxxxxxxxxxx" runs 2 to 10 times slower than "xxxxxxxxxxxxxxxxxx" == "yyyyyyyyyyyyyyyyy". To be able to detect that, however, we need a lot of comparisons and a lot of samples to reduce noise. In this graph you can see how many thousands of iterations you need to be able to "stabilize" the comparison. What you should see in that graph is something like:

What that graph means is that after 120,000 iterations (that is 120,000 comparisons) the difference between "xxxxxxxxxxxxxxxxxx" == "yyyyyyyyyyyyyyyyy" and "xxxxxxxxxxxxxxxxxx" == "xxxxxxxxxxxxxxxxxx"  is stable and is 6 times slower. And, to be clear, that is 18 characters being different. This means that to be able to notice the 6X difference you would need to bruteforce the 18 characters at once. Even in the most optimistic scenarios, bruteforcing 18 characters is way out the question.

If you try to repeat the experiment with just 2 characters, the results are significantly different (note this graph rounds up results to the closest integer). You won't be able to notice any difference.. at all. The line is flat up to a million iterations. That means that comparing a million times "aa" vs "bb" runs in about the same amount of time (not as impressive as our 6X!).

But.. we don't always need impressive differences to be able to make a decision (see this graph without the rounding). With just two characters the difference looks roughly like this:
Which.. means it usually runs in exactly the same time, sometimes a tiny bit faster, but also many times slightly slower. In essence, it seems to run at about 1.02 times the speed of a false comparison.

Now, the fact this graph looks so messy means our exploit won't be as reliable, and will require significantly more samples to detect any changes. We now have in our hands a probabilistic attack.

To exploit this, we will need to perform the attack a couple hundred or a couple thousand, or a couple million times (this actually depends on the machine, a slow machine has more measurable results, but also has more noise, a fast machine has less measurable results, but we can do more accurate measurements).

With the samples we either average the results, or get the mean of means (which needs a lot of samples) or a chi squared test if we knew the variance or a student t test if you can assume a normal distribution (JavaScript's garbarge collector skews things a bit).


At the end of the day, I ended up creating my own test, and was able to brute force a string in a different origin via code that does:

if (SECRET == userSupplied) {
   // do nothing
}

Here are some results on string comparison on Chrome:
  • It is possible to bruteforce an 18 digit number in about 3 minutes on most machines.
    • Without the timing attack you would need to perform 5 quadrillion comparisons per second. With the timing attack you only need 1 million.
  • It is possible to reliably calculate a 9 character alphanumeric string in about 10 minutes.
    • This is different than the numbers because here we have 37 characters alphabet, and with numbers it's just 10.
If you are interested in playing around with it, and you have patience (a lot of patience!) check this and this. They will literally take 5-10 minutes to run, and it will simply try to order 7 strings according to the time they took to compare. Both of them run about a million string comparisons using window.postMessage as the input, so it takes a while.

If you have even more patience you can run this which will run different combinations of iterations/samples trying to figure out which works best (slow machines work better with less iterations and more samples, faster machines run best with more iterations and less samples).

So, summary!
  • Timing attacks are easy thanks to JavaScript's single-threadness. That might change with things like process isolation, but it will continue to be possible for the time being.
  • String comparison can be measured and attacked, but it's probabilistic and really slow with the existing attacks.
Let me know if you can improve the attack (if you can get it under 1 minute for a 18 digit number you are my hero!), find a problem in the sampling/data, or otherwise are able to generalize the ReDoS exploit to string comparison / indexOf operations :)

Another attack that might be worth discussing is that strings from different origins are both stored in the same hash table. This means that if we were able to measure a hashtable miss from a hashtable match, we could read strings cross-origin.

I spent a lot of time trying to make that to work, but it didn't work out. If you can do it, let me know as well :)

Thanks for reading!

Sunday, September 22, 2013

[Matryoshka] - Wrapping Overflow Leak on Frames

(Update Feb 25, 2018: Google Drive deprecated web hosting, so the PoCs below don't currently work. I'll fix them soon, in the meantime you can see the source code here)

I just came back from a very fun trip around Europe. Among other places, I visited Hamburg, to attend HackPra 2013, which was hosted in AppSec Europe. In there I gave the presentation Matryoshka - titled after the famous Russian dolls.

Today I'm blogging about one of the subjects of that presentation, an information leak introduced by a "side channel" present in iframes. I didn't give much detail in the presentation since I was afraid I was gonna run out of time (this was just one part of the presentation). This blogpost is meant to add more detail, as well as give a couple more details I wasn't sure worked at the time of the presentation.

A quick summary of the problem is that, under certain circumstances, it is possible to know when text inside an iframe wraps to the next line. Text wrapping is when a line is longer than the width of the area it can be displayed into, so it needs to wrap to a second line.

Being able to detect text wrapping is an interesting problem, as it allows us to learn some information about the framed website, which might be particularly dangerous under some circumstances.

To show a small example, the following iframe is hosted in a different domain than this blog post:


We are disallowed to know what the contents are because of the Same Origin Policy. It's important to understand this, we can iframe anything, including third party sites to which you are authenticated to. This might include sites that contain secret information, like your email inbox, or your bank statements.

Now, let's see what we can do:

  • We can, change the width and height of this iframe at our will.
  • We can navigate the inner iframes from that page.
  • We can change the style of your scrollbars or detect their presence.
To clarify, changing the width and height of the iframe will allow us to force some content to wrap on to the next line. To exploit this we need to know whether the content wrapped or not.

Navigating Child Iframes

By navigating child iframes, we can change an otherwise innocuous iframe (such as a Facebook like button), to a domain we control. We can do this by design, even on cross-domain iframes, look:




The reason for this is because the window.frames[] property is exposed cross-origin, and it's allowed by the Same Origin Policy to navigate child frames, even those that are cross domain.

The reason this matters, is because once we control a child iframe in our target page, we can know the position of the iframe relative to the browser/screen. This will let us know if the content of the page wrapped, as we'll discuss later on.

Detecting when the text wraps to the next line is the corner stone of this attack.

Detecting Iframe Screen Coordinates

In Trident and Gecko based browsers, it is possible to detect the position of an iframe relative to the screen. This is interesting, as it allows us to know exactly when an iframe moves down because of text wrapping.

We detect this with one of two properties, either window.screenTop for Trident based browsers, or window.mozInnerScreenY for Gecko based browsers (and mouse events in general). These properties are only readable from within the target iframe, but as we explained before, it is perfectly possible to navigate our target site child iframes, and once we do that, we can reduce the width of our iframe until text doesn't fit in the line anymore, and moves the rest of the line to the next line, displacing our iframe.

Please note this proof of concept doesn't work in all browsers (notably, it only works in firefox/ie), so it might not work for you, but feel free to give it a try.

Detecting Scrollbars Presence

This is interesting, in some browsers, it is possible to apply CSS to the scrollbars of the iframe. This is important because this would leak whether the text wrapped contains a background-image, which would be requested when the scrollbar is shown. In some browsers you need to change the backgroundImage after the creation of the iframe for the image to be requested.

Please note this proof of concept doesn't work in all browsers (notably, this only only works webkit-based browsers), so it might not work for you, but feel free to give it a try.



What will happen when you click that button is:
  • The width of the iframe will be slowly reduced pixel by pixel.
  • When the word "dusk" wraps to the next line, it will display the vertical scrollbar.
  • When the vertical scrollbar is shown, the background image will be requested.
  • We detect when such requests happen by checking document.cookie.

Measuring Word Width

So far we found out a way to find when text wraps to the next line, either by detecting the presence of a scrollbar, or navigating a child iframe and detecting it's position relative to the screen monitor.

To measure a specific word width, we will follow the following steps:
  1. Resize the target iframe to:
    • width: 9999999px
    • height: smallest-without-scrollbar;
  2. Slowly reduce the width until the text wraps. (You can reduce in fraction of pixels).
    • If you are detecting a scrollbar, ensure to increase the height to make the scrollbar disappear.
    • If you are detecting text wrapped from a child iframe, detect changes from the new current position.
  3. Repeat
    • Record the exact width at which text wrapping happened.
We will actually learn things in a bit of an odd order. For this particular example, we will get the length of the following:
  • First wrapping:       hello, my name is bond, james bond.
  • Second wrapping:   the secret is on the island!
  • Third wrapping:      bond, james bond.
  • Fourth wrapping:    name is bond,
  • Fifth wrapping:       bond, james
  • Sixth wrapping:      the secret
  • Seventh wrapping:  my name
  • Eighth wrapping:    secret is
  • Ninth wrapping:      name is
  • Tenth wrapping:      is on
The exact fraction of pixel in which the line wraps (which by practice I've seen some times needs to be as accurate as 1E-10 pixels), tells us the length of the line.

By calculating the difference between different lines, we can also get the length of other sequence of words:
  1. hello, my name is
  2. hello, my james bond.
  3. hello, my name is bond.
  4. hello, is bond, james bond.
  5. hello, my bond, james bond.
  6. ...
And we can also get "bond" by subtracting (1) to (3). We can also get "james" by subtracting (3) to the first wrapping, etc..  We might not always get specific word length, and rather groups of two words, but the attack works to both cases.

The question now is.. how hard is it to go from the width of the word (or sequence of words) to the actual contents of them?

Turns out that usually, all letters have a different width, and as long as such width is unique, it's almost trivial to calculate which letters are in each wrapping (although we don't get the order of such letters).

The solution to this problem is the classic knapsack problem which I won't go into details of. While we won't be able to get the order of the letters, we can get which letters to a reasonable degree of accuracy, which should be sufficient to have a very good guess of the value.

It's unclear what's the best solution to this problem. This information leak isn't a vulnerability per-se in browsers, but rather a well known and understood feature. This blog post will hopefully trigger some discussion around this subject and we can come up with solutions.

One challenge is that if we stopped providing mozInnerScreenY / screenTop, then it would be significantly harder to detect and protect against clickjacking. So whichever solution we come up with needs to take that into consideration.

It's also worth mentioning what happens when our iframe is so small that a word doesn't even fit. Well, the answer is that individual characters wrap to the next line, and we can then extract the width of each individual character (rather than each word). This, unfortunately, doesn't work that well in normal websites, as by default they don't break words (unless overridden by CSS's break-word). If we are able to get this, however, it would be possible for us to obtain the order of the characters in each word (and completely read the contents of the iframe without having to guess).

And that was it! This attack allows you to steal contents from websites that don't use X-Frame-Options (or have a fixed CSS width) in all major browsers with browser standard functionality (no vuln up my sleeve!). I felt inclined to use an acronym for this attack (WOLF) since Thai Duong/Juliano Rizzo's attacks (POET, BEAST, CRIME) sound better than "cross-BLAH-jacking", and I like wolves (although, not as much as cats), specially since it kind of feels like insanity wolf to me.