McTrain · 2014/09/27 13:02

0x00 Preface


This is my first post on WooYun, so I am not sure whether it will suit readers' tastes.

This post is reposted from my blog and introduces an attack called BROP. It covers the principles only; a reproduction of the attack can be found in another blog post.

The BROP attack is based on Hacking Blind, an Oakland 2014 paper by Andrea Bittau of Stanford. Here are links to the paper and slides:

paper

Slide.

And BROP’s original website address:

Blind Return Oriented Programming (BROP) Website

I can say that this is the most exciting paper I have read this year, bar none. If I had to describe it in a few words, only "it couldn't be cooler" would express how much I like it!

This article assumes that you already know the basic concepts of Return-Oriented Programming (ROP), so it only introduces the implementation principles of BROP.

The implementation of BROP is really “cool” and “smart,” and I hope to make it clear in this article.

0x01 BROP Attack target and prerequisites


Goal: use ROP to attack an application remotely and hijack its control flow, without knowing the application's source code or having any of its binary code. The application may be protected by existing mechanisms such as NX, ASLR, PIE, and stack canaries, and may run on a 32-bit or 64-bit system.

At first glance, this goal seems particularly difficult to achieve. In fact, this attack has two preconditions:

  • There must be a known stack overflow vulnerability, and the attacker must know how to trigger it.
  • The server process must be restarted after a crash, and the restarted process must not be re-randomized (that is, it keeps the same address layout as the previous process even though ASLR is enabled). This requirement is actually reasonable, because current server applications such as Nginx, MySQL, Apache, OpenSSH, and Samba all behave this way.

0x10 Attack Flow of BROP 1 – Remote memory dump


Since we do not know the memory layout of the attacked program, the first thing to do is somehow dump the program's memory from the remote server to our local machine. To do this we need to invoke the write system call, passing in the socket file descriptor, as shown below:

write(int sock, void *buf, int len)

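At the assembly level this call breaks down into four steps: load the socket descriptor into %rdi (pop %rdi; ret), load the buffer address into %rsi (pop %rsi; ret), load the length into %rdx (pop %rdx; ret), and finally call write.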

So from the perspective of a ROP attack, we just need to find four gadgets, place their addresses on the stack in order, and let them execute one after another.
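To make the four-gadget idea concrete, here is a minimal sketch of how such a chain could be laid out on the stack. Every address below is a made-up placeholder (discovering the real ones is what the rest of this article is about), and the helper name `build_write_chain` is an assumption for illustration only.

```python
import struct

# Hypothetical placeholder addresses; a real attack has to discover these.
POP_RDI_RET = 0x400001  # pop %rdi; ret  -> 1st argument (socket)
POP_RSI_RET = 0x400002  # pop %rsi; ret  -> 2nd argument (buffer)
POP_RDX_RET = 0x400003  # pop %rdx; ret  -> 3rd argument (length)
CALL_WRITE = 0x400004   # code that calls write


def build_write_chain(sock, buf, length):
    """Lay out the words that start at the overwritten return address."""
    words = [
        POP_RDI_RET, sock,
        POP_RSI_RET, buf,
        POP_RDX_RET, length,
        CALL_WRITE,
    ]
    # Each word is a little-endian 64-bit value on x86-64.
    return b"".join(struct.pack("<Q", w) for w in words)


chain = build_write_chain(sock=4, buf=0x400000, length=0x1000)
print(len(chain))  # 7 words of 8 bytes each
```

Each gadget address is followed by the value it pops, so every `ret` lands on the next gadget.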

The problem is that we do not even know the memory layout, so how do we find these four gadgets in memory? This looks even harder when protection mechanisms such as ASLR and stack canaries are deployed.

So let’s put that on the back burner, keep that goal in mind, and do some preparatory work.

Defeating stack canaries

If you are not familiar with stack canaries: a randomly generated value (the canary) is placed on the stack just below the return address and checked when the function returns. If the canary has been modified (for example, overwritten by a buffer overflow), the program reports an error and aborts.

So how do we get past this protection? One option is to brute-force the whole value, but that is inefficient. The authors instead propose a method called stack reading:

Suppose the stack we want to overflow is laid out like this: a 4096-byte buffer, followed by the 8-byte canary, the saved frame pointer, and the return address.

We first probe with increasing payload lengths until the process crashes because the canary has been corrupted, which gives us the overflow length (here the buffer is 4096 bytes and the 8-byte canary ends at 4096+8=4104 bytes). We then fill the first 4096 bytes with arbitrary values and recover the real canary byte by byte. For example, we set the 4097th byte to x: if x equals the first byte of the original canary, the process does not crash; otherwise we try the next value of x. Since a byte has only 256 possible values, at most 256 attempts are needed per byte, so recovering all eight canary bytes takes at most 8 × 256 = 2048 attempts.

We can also use this method to recover the saved frame pointer and the return address.
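The byte-by-byte search can be sketched with a local simulation. The `server_survives` function below is a toy oracle standing in for "send the payload and see whether the connection stays open"; it, the buffer size, and the helper name `read_stack` are assumptions for illustration, not code from the paper.

```python
import os

BUF_LEN = 4096
SECRET_CANARY = os.urandom(8)  # stands in for the server's canary; unknown to the attacker


def server_survives(payload):
    """Toy oracle: True if the bytes written past the buffer match the canary."""
    overflow = payload[BUF_LEN:]
    return SECRET_CANARY[:len(overflow)] == overflow


def read_stack(oracle, prefix_len, num_bytes):
    """Recover num_bytes that follow the buffer, one byte at a time."""
    padding = b"A" * prefix_len
    recovered = b""
    tries = 0
    for _ in range(num_bytes):
        for guess in range(256):
            tries += 1
            # Overwrite exactly one more byte than we already know.
            if oracle(padding + recovered + bytes([guess])):
                recovered += bytes([guess])
                break
        else:
            raise RuntimeError("no byte value survived; oracle is unreliable")
    return recovered, tries


canary, tries = read_stack(server_survives, BUF_LEN, 8)
print(canary == SECRET_CANARY, tries)
```

The number of probes grows linearly with the number of bytes (at most 256 per byte) rather than exponentially, which is what makes stack reading practical.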

Finding a stop gadget

So far, we have obtained the correct canary and can get around the stack canary protection. The next goal is to find the four gadgets mentioned earlier.

Before we look for those specific gadgets, we need to introduce a special type of gadget: the stop gadget.

In general, if we overwrite the return address on the stack with some random address, the program will most likely crash (for example, the address points into code that dereferences a null pointer), and the attacker's connection is closed. There is another case, however: the return address points into code that does not crash but enters an infinite loop. The program then simply hangs and the attacker's connection stays open. We call this kind of gadget a stop gadget, and it is crucial for finding other gadgets.

Find useful gadgets

Suppose we have now found a stop gadget that makes the process block, such as an infinite loop or a blocking system call (sleep). How do we find other useful gadgets? (By "useful" I mean gadgets that do something, as opposed to gadgets that just crash.)

So far we can only manipulate the stack, and we can only steer execution by overwriting the return address. Suppose we happen to guess a useful gadget, such as pop %rdi; ret. After executing it, the process jumps to the next address on the stack, and if that address is invalid, the process crashes. The attacker cannot tell that a useful gadget was executed (all the attacker sees is a crash), so the attacker concludes that nothing useful was found and discards the address.

With a stop gadget, however, the picture changes completely: we fill the stack after the candidate return address with enough copies of the stop gadget's address.

Any gadget that crashes will still crash the process, while a useful gadget now ends up blocking. There is one special case: the candidate may itself be a stop gadget, in which case it is also identified as a useful gadget. This does not matter, though, because later we still need to check whether a useful gadget is the one we want.
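The effect of padding with stop gadgets can be illustrated with a toy model. The address space, the behaviours in `CODE`, and the `run`/`scan` helpers below are invented for the illustration; a real scan would observe whether the remote connection stays open instead of calling `run`.

```python
# Toy address space: what happens when execution "returns" to each address.
CODE = {
    0x1000: "crash",  # e.g. dereferences a bad pointer
    0x1001: "ret",    # harmless gadget: just returns
    0x1002: "pop",    # pop <reg>; ret  (consumes one stack word)
    0x1003: "crash",
    0x1004: "stop",   # infinite loop / blocking call
}
STOP = 0x1004


def run(stack):
    """Simulate returning through the words on the stack.

    Returns "hang" (connection stays open) or "crash" (connection closes).
    """
    stack = list(stack)
    while stack:
        kind = CODE.get(stack.pop(0), "crash")  # unmapped address -> crash
        if kind == "stop":
            return "hang"
        if kind == "crash":
            return "crash"
        if kind == "pop" and stack:
            stack.pop(0)  # the popped word is data, not a return address
    return "crash"  # ran off the end of our payload into garbage


def scan(candidates, padding=4):
    """Return candidates that do not crash when followed by stop gadgets."""
    return [a for a in candidates if run([a] + [STOP] * padding) == "hang"]


without_stop = [a for a in CODE if run([a, 0xDEAD]) == "hang"]
with_stop = scan(CODE)
print([hex(a) for a in without_stop], [hex(a) for a in with_stop])
```

Without the padding only the stop gadget itself is visible; with it, the harmless `ret` and `pop` gadgets show up as well, together with the stop gadget, as noted above.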

Last step: Remotely dump memory

By now all the preparations seem to be in place. We can bypass the canary and obtain many "potentially useful gadgets" that do not crash. So how do we find the four gadgets mentioned earlier?

To find the first two gadgets, pop %rsi; ret and pop %rdi; ret, we only need to find a so-called BROP gadget, which is quite common: all it does is restore the callee-saved registers. Jumping into it at the right offsets yields gadgets that pop %rdi and %rsi.
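Why an offset into the BROP gadget produces different instructions can be seen from the machine code. The sketch below assumes the common x86-64 encoding of the register-restoring sequence (pop %rbx, %rbp, %r12 to %r15, then ret); an actual binary may differ, and the variable names are only for illustration.

```python
# Machine code of the register-restoring epilogue (x86-64):
#   pop %rbx; pop %rbp; pop %r12; pop %r13; pop %r14; pop %r15; ret
BROP_GADGET = bytes.fromhex("5b5d415c415d415e415fc3")

# Jumping into the middle skips the REX prefix byte (0x41), which turns
# "pop %r14" (41 5e) into "pop %rsi" (5e) and "pop %r15" (41 5f) into
# "pop %rdi" (5f).
at_offset_7 = BROP_GADGET[7:]  # 5e 41 5f c3 -> pop %rsi; pop %r15; ret
at_offset_9 = BROP_GADGET[9:]  # 5f c3       -> pop %rdi; ret

print(at_offset_7.hex(), at_offset_9.hex())
```

Note that with this encoding the %rsi variant also pops %r15, so a chain that uses it has to supply one extra filler word on the stack.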

Unfortunately, a pop %rdx; ret gadget is hard to find because it rarely appears in compiled code. So instead of searching for a pop %rdx instruction, the authors suggest calling strcmp, which leaves the length of the compared string in %rdx and thus achieves the same effect. Both strcmp and write can be found in the program's Procedure Linkage Table (PLT).

So the next task is:

  • Find the so-called BROP gadget;
  • Find the corresponding PLT entries.

Finding the BROP gadget

What makes the BROP gadget special is that it pops six values off the stack in sequence and then executes ret. With the stop gadget technique, this kind of gadget is easy to find: we just place six addresses that would crash if executed between the candidate and the stop gadget.

If a useful gadget meets this requirement and does not crash, it is almost certainly a BROP gadget.
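The six-trap probe can be sketched with the same kind of toy model. Everything in `CODE` is invented for the illustration, and the second check in `looks_like_brop` (rejecting addresses that hang even when only traps follow) is an extra filter added here to weed out stop gadgets; it is not spelled out in the text above.

```python
# Toy model: each address maps to the number of words it pops before ret,
# or to "stop" / "crash". Addresses and behaviours are invented.
CODE = {
    0x2000: 0,        # plain ret
    0x2001: 1,        # pop; ret
    0x2002: "crash",
    0x2003: 6,        # six pops then ret: the shape of a BROP gadget
    0x2004: "stop",
}
STOP = 0x2004
TRAP = 0x0  # unmapped address: returning here crashes


def run(stack):
    """Return "hang" if execution reaches a stop gadget, else "crash"."""
    stack = list(stack)
    while stack:
        kind = CODE.get(stack.pop(0), "crash")
        if kind in ("stop", "crash"):
            return "hang" if kind == "stop" else "crash"
        del stack[:kind]  # popped words are data and never executed
    return "crash"


def looks_like_brop(addr):
    # Candidate, six traps, then the stop gadget: in this model only a gadget
    # that pops exactly six words skips every trap and reaches the stop gadget.
    skips_six = run([addr] + [TRAP] * 6 + [STOP]) == "hang"
    # A stop gadget hangs no matter what follows, so require a crash here.
    not_a_stop = run([addr] + [TRAP] * 7) == "crash"
    return skips_six and not_a_stop


found = [a for a in CODE if looks_like_brop(a)]
print([hex(a) for a in found])
```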

Looking for PLT entries

The PLT is a jump table, usually located near the beginning of the executable, through which the application calls external functions (from libc and so on). See the Wiki for details. It has a very distinctive signature: each entry is 16 bytes long and 16-byte aligned, the fast path for the entry's function starts at byte 0, and the slow path starts at byte 6.

In addition, most PLT entries do not crash regardless of the arguments passed in, because most of them are system calls that check their arguments and return EFAULT on error instead of crashing the process. An attacker can therefore find the PLT as follows: if several consecutive 16-byte-aligned addresses do not crash the process, and those addresses plus 6 do not crash it either, they are very likely PLT entries.
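The scanning rule just described can be sketched as follows. The `survives` oracle, the addresses, and the `min_run` threshold are assumptions for illustration; in a real scan `survives` would mean "returning to this address, followed by stop gadgets, does not close the connection".

```python
# Toy oracle data: a "PLT" with four 16-byte entries at 0x400500,
# plus one stray address that happens to survive on its own.
PLT_START, PLT_ENTRIES = 0x400500, 4
SURVIVORS = {0x400320}
for i in range(PLT_ENTRIES):
    entry = PLT_START + 16 * i
    SURVIVORS.update({entry, entry + 6})  # fast path and slow path


def survives(addr):
    """Stands in for: return to addr (followed by stop gadgets), no crash."""
    return addr in SURVIVORS


def find_plt(start, end, min_run=3):
    """Return the first address of a run of >= min_run PLT-like entries."""
    run_start, run_len = None, 0
    for addr in range(start, end, 16):  # only 16-byte-aligned candidates
        if survives(addr) and survives(addr + 6):
            if run_len == 0:
                run_start = addr
            run_len += 1
            if run_len >= min_run:
                return run_start
        else:
            run_len = 0  # the run of consecutive entries is broken
    return None


plt = find_plt(0x400000, 0x401000)
print(hex(plt))
```

Requiring several consecutive hits filters out isolated addresses such as the stray survivor, which passes at offset 0 but not at offset 6.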

Once we have a PLT entry, how do we determine whether it is strcmp or write?

For strcmp, the authors propose calling the entry with different argument combinations and observing the outcome. Thanks to the BROP gadget, we can easily control the first two arguments. strcmp behaves as follows:

arg1 | arg2 | result
:--: | :--: | :--:
readable | 0x0 | crash
0x0 | readable | crash
0x0 | 0x0 | crash
readable | readable | no crash

With this signature, we can most likely identify the PLT entry for strcmp.
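Matching a PLT entry against the table can be sketched like this. The toy `PLT` dictionary, its three entries, and the `READABLE` address are invented for the illustration; in a real attack each call to `crashes(a, b)` would be one remote probe with %rdi and %rsi set through the BROP gadget.

```python
READABLE = 0x400000  # hypothetical address known to be mapped (e.g. in .text)
NULL = 0x0


def is_readable(ptr):
    return ptr != NULL  # toy memory model: everything but NULL is readable


# Toy PLT: each entry reports whether calling it with (arg1, arg2) crashes.
PLT = {
    0: lambda a, b: False,                                    # ignores its arguments
    1: lambda a, b: not is_readable(a),                       # dereferences arg1 only
    2: lambda a, b: not (is_readable(a) and is_readable(b)),  # strcmp-like
}

# The signature from the table: (arg1, arg2) -> expected to crash?
SIGNATURE = [
    (READABLE, NULL, True),
    (NULL, READABLE, True),
    (NULL, NULL, True),
    (READABLE, READABLE, False),
]


def matches_strcmp(crashes):
    """True if the entry's crash behaviour agrees with every table row."""
    return all(crashes(a, b) == expected for a, b, expected in SIGNATURE)


candidates = [idx for idx, fn in PLT.items() if matches_strcmp(fn)]
print(candidates)
```

Any function that dereferences both of its first two arguments fits the same pattern, which is why the signature only makes the entry a likely strcmp rather than a certain one.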

For write there is no such signature, but we can test every PLT entry by invoking it so that it would write data to the socket. If the entry is write, the attacker sees the written data arrive locally.

The last step is to determine the socket's file descriptor number to pass to write. There are two approaches: 1. chain several write calls together in a single payload, each with a different file descriptor number; 2. open multiple connections at the same time and use a relatively high file descriptor number to increase the likelihood of a match.

At this point, the attacker can write the entire .text segment from memory to the socket and receive it locally, disassemble it to find more gadgets, and dump the symbol table to find other PLT entries such as dup2 and execve.

0x11 BROP Attack Flow 2 – Attack execution


So far, the most challenging part has been solved: we can now obtain the entire memory space of the attacked process. What remains is routine:

  • Redirect the socket to standard input/output. The attacker can use dup2, or close followed by dup or fcntl(F_DUPFD). These can generally be found in the PLT.
  • Find /bin/sh in memory. An effective approach is to locate a writable memory region from the symbol table, such as environ, and then read /bin/sh into it from the attacker over the socket.
  • execve the shell. If execve is not in the PLT, the attacker needs more attempts to find pop rax; ret and syscall gadgets.

To summarize, the entire process of a BROP attack looks like this:

  • Use a known stack overflow vulnerability and stack reading to bypass the stack canary protection and find a return address that works.
  • Find a stop gadget: this is usually the address of a blocking system call in the PLT (sleep, etc.). In this step the attacker also finds valid PLT entries.
  • Find the BROP gadget: after this step the attacker controls the first two arguments of the write system call.
  • Use the signature to find the strcmp entry in the PLT, then set %rdx by controlling the length of the string; after this the attacker controls the third argument of write.
  • Find the write entry in the PLT: after this step the attacker can dump the entire memory from the remote side to the local side and find more gadgets.
  • With this information in hand, build the shellcode to carry out the attack.

0x100 Afterword


That is how the BROP attack works; for those interested, it is reproduced in this blog post.

In fact, the coolest part of the whole attack is the first stage, dumping memory; what follows is really just a traditional ROP attack. Once you understand the principle, the best way to deepen that understanding is to read the source code, which helps a great deal in understanding BROP as a whole.