High Performance Web Brute-Forcing 🕸🐏

Finding and exploiting unique attacks on web applications is, of course, satisfying. But I find that performing even the most basic of attacks, as efficiently and effectively as possible, can pose a decent mental challenge that’s equally rewarding.

In this short post I’ll show you how writing just a few lines of code can bring immense gains to web brute-force attacks, versus using the tools you would probably reach for right now (let’s be honest, it’s Burp).

The task has a huge amount in common with offline password cracking, where performance and strategy are everything. Much like many of my colleagues who are totally hooked on password cracking, I find the problem of effective web brute-forcing a seriously under-appreciated art.

As a rather contrived example, let’s say we wanted to brute-force Wikipedia pages looking for the word ‘Luftballons’.

We’ll start with our base URL of https://en.wikipedia.org/wiki/0 (that’s a zero), and increment the 0 until we find ‘Luftballons’, on page 99.

Let’s see that attack in Python using the Requests module:
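A minimal sketch (the exact loop details, like the 0–99 page range and the timing printout, are my assumptions); each iteration calls requests.get() directly, so each request gets a brand-new TCP connection:

```python
# Naive version: every requests.get() opens (and tears down) its own
# TCP connection before fetching the page.
import time
import requests

BASE_URL = 'https://en.wikipedia.org/wiki/'

start = time.time()
for i in range(100):
    response = requests.get(BASE_URL + str(i))
    if 'Luftballons' in response.text:
        print('Found it on page %d' % i)
        break
print('Execution time: %s seconds' % (time.time() - start))
```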

Execution time: 13.9255948067 seconds. Horrendously slow.

Now, I know what you might be thinking… is Requests too high-level an API to work at speed? Is it bloated and slow compared to, say, raw sockets or something from the standard library? Absolutely not. For a start, Requests is built on the speedy urllib3, and comes with a bunch of smart benefits we’re already taking advantage of without realising:

  • The gzip and deflate transfer-encodings are supported, so we can receive compressed server responses. This means there is less data on the wire, and we can move more of it in the same amount of time. The benefit far outweighs the processing time required to pack and unpack the server responses (the snippet after this list shows Requests doing this out of the box).
  • Persistent DNS. Contrary to what I have read on StackOverflow, using Requests with a single TCP connection does not appear to trigger DNS resolution on each request; it seems to do it once. Imagine having to do a full DNS resolution for each request, as some libraries might; the performance hit would be significant.
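A quick way to see the first point for yourself; a sketch showing that a stock Session already advertises compression support, and that the body we read back has been transparently decompressed:

```python
# Requests sends 'Accept-Encoding: gzip, deflate' by default and
# decompresses the response body before handing it to us.
import requests

session = requests.Session()
print(session.headers['Accept-Encoding'])        # gzip, deflate

response = session.get('https://en.wikipedia.org/wiki/99')
print(response.headers.get('Content-Encoding'))  # e.g. 'gzip' (server-dependent)
print(len(response.text))                        # already-decompressed HTML
```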

The problem, then, is that we are just using Requests really inefficiently.

It doesn’t seem to be common knowledge, but Burp opens a new TCP connection for every single Intruder request, which adds a huge overhead to long brute-force attacks. This is what our script was doing too. Let’s see what happens if we modify it to reuse the same connection:
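The only change is swapping the bare requests.get() calls for a requests.Session(), which keeps the underlying TCP connection alive between requests; a minimal sketch:

```python
# Connection-reuse version: a Session pools the TCP connection, so the
# handshake cost is paid once rather than once per request.
import time
import requests

BASE_URL = 'https://en.wikipedia.org/wiki/'

session = requests.Session()
start = time.time()
for i in range(100):
    response = session.get(BASE_URL + str(i))
    if 'Luftballons' in response.text:
        print('Found it on page %d' % i)
        break
print('Execution time: %s seconds' % (time.time() - start))
```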

Execution time: 3.16235017776 seconds. Much, much faster.

Now if we repeat this attack in Burp, it’ll still have a considerable edge… why? Because of threads.

For a short attack like this, Burp’s default of 5 threads keeps it in line with even highly efficient code. But the longer the attack runs, the more time is wasted establishing new TCP connections; a few hours into an attack, that overhead really adds up.

When Burp says it has 5 threads, what it means is that it can make 5 simultaneous requests, each via its own connection. But we only have one connection, so let’s implement 5 threads that reuse that one connection in our example:
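A sketch of that idea: five workers pulling page numbers from a shared queue and firing them down one shared Session. By default a Session used from several threads will pool extra connections, so the HTTPAdapter mount below is my assumption for pinning it to a single connection:

```python
# Threaded version: 5 workers share ONE Session, pinned to a single
# connection via pool_maxsize=1 (pool_block=True makes workers wait for
# the connection rather than opening extra ones).
import time
import requests
from requests.adapters import HTTPAdapter
from threading import Thread
try:
    from queue import Queue   # Python 3
except ImportError:
    from Queue import Queue   # Python 2

BASE_URL = 'https://en.wikipedia.org/wiki/'
NUM_THREADS = 5

session = requests.Session()
session.mount('https://', HTTPAdapter(pool_maxsize=1, pool_block=True))
q = Queue()

def worker():
    while True:
        i = q.get()
        try:
            if 'Luftballons' in session.get(BASE_URL + str(i)).text:
                print('Found it on page %d' % i)
        finally:
            q.task_done()  # always mark done so q.join() can't hang

for _ in range(NUM_THREADS):
    t = Thread(target=worker)
    t.daemon = True
    t.start()

start = time.time()
for i in range(100):
    q.put(i)
q.join()  # for simplicity we drain the whole queue rather than exit early
print('Execution time: %s seconds' % (time.time() - start))
```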

Execution time: 0.93794298172 seconds. Very fast. Under the same conditions, this will stomp all over Burp, and pretty much anything else you could put together without considerable effort.

Room for improvement? Sure!

So the main problem with Requests, and almost all HTTP libraries, is that they don’t support HTTP pipelining. HTTP pipelining is the idea of firing multiple requests through a single TCP connection, without having to wait synchronously for each response. If you look at our last code snippet, it looks like that’s exactly what we are doing, but unfortunately we’re not. The Requests library actually locks a TCP connection until it has fully read the response content from the last request. The main reason we get such a big performance boost from threads is that we already have our next request queued up on the connection, ready to fire the moment the connection becomes available to the next worker thread. We’ve effectively just minimised the delay this connection sharing was causing us.

Pipelining has its own issues, though: for example, it’s not supported by all web servers, and connection failures are much harder to deal with when bits of multiple requests are already in transit.

To get around these limitations but still reap the performance benefits of asynchronous requests, we can do one obvious thing: increase the number of connections.

We can wrap our last code snippet in 5 threads of its own. This gives us 5 TCP connections, each working as fast as possible to synchronously fire out requests. This is as close as we can easily get to HTTP pipelining, but is arguably a far more stable attack.
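A sketch of that final version, assuming five single-connection Sessions (as in the previous snippet), each kept busy by its own group of worker threads feeding from one shared queue:

```python
# Multi-connection version: 5 independent Sessions (one TCP connection
# each), each saturated by its own pool of worker threads.
import time
import requests
from requests.adapters import HTTPAdapter
from threading import Thread
try:
    from queue import Queue   # Python 3
except ImportError:
    from Queue import Queue   # Python 2

BASE_URL = 'https://en.wikipedia.org/wiki/'
NUM_CONNECTIONS = 5
THREADS_PER_CONNECTION = 5

q = Queue()

def worker(session):
    while True:
        i = q.get()
        try:
            if 'Luftballons' in session.get(BASE_URL + str(i)).text:
                print('Found it on page %d' % i)
        finally:
            q.task_done()

for _ in range(NUM_CONNECTIONS):
    session = requests.Session()
    session.mount('https://', HTTPAdapter(pool_maxsize=1, pool_block=True))
    for _ in range(THREADS_PER_CONNECTION):
        t = Thread(target=worker, args=(session,))
        t.daemon = True
        t.start()

start = time.time()
for i in range(100):
    q.put(i)
q.join()
print('Execution time: %s seconds' % (time.time() - start))
```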

If you really want to play with true pipelining, take a look at Ruby’s em-http-request.

Hopefully this gives you some ideas for scripting basic, yet efficient, brute-force attacks. Don’t assume that because a tool already exists for a job, it does that job best. As pen-testers, our time is precious and we need to spend it wisely.

-Hiburn8

Note: Burp has no time-measurement feature in Intruder, so I created a hack to figure out roughly how fast Burp makes requests. Essentially, I created a Jython plugin which registers an extension-generated payload for use in Intruder. When this plugin is called upon to create a payload, it returns an empty string payload, but logs the current time in microseconds to the plugin console. This doesn’t give us the exact times that requests were issued or completed… but it does help us figure out how fast Burp generates the requests it is about to send, which, alone, was twice as slow as the last example here in all of my test cases.
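For reference, the skeleton of such a plugin might look something like this; a sketch against Burp’s Extender API, with the attack length and log format as placeholders:

```python
# Jython sketch: register an extension-generated payload source that logs
# a timestamp every time Intruder asks for the next payload.
from burp import (IBurpExtender, IIntruderPayloadGeneratorFactory,
                  IIntruderPayloadGenerator)
import time

class BurpExtender(IBurpExtender, IIntruderPayloadGeneratorFactory):
    def registerExtenderCallbacks(self, callbacks):
        callbacks.setExtensionName('Intruder timing hack')
        callbacks.registerIntruderPayloadGeneratorFactory(self)

    def getGeneratorName(self):
        return 'Timing payloads'

    def createNewInstance(self, attack):
        return TimingPayloadGenerator()

class TimingPayloadGenerator(IIntruderPayloadGenerator):
    def __init__(self):
        self._count = 0

    def hasMorePayloads(self):
        return self._count < 100  # placeholder attack length

    def getNextPayload(self, baseValue):
        self._count += 1
        # Log when Burp asked for this payload; this approximates its
        # request-generation rate (time.time() has microsecond resolution).
        print('%f' % time.time())
        return ''  # empty payload, so requests go out unmodified

    def reset(self):
        self._count = 0
```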