Wildsator · 2015/12/09

Ke Sun

Original: http://en.wooyun.io/2015/12/08/33.html

0x00 Abstract


Obfuscation is a common technique for raising the difficulty and cost of binary analysis and reverse engineering. Mainstream obfuscation techniques work with machine code native to the target CPU, executed in a single processor mode, to hide code and control flow. This article introduces a new obfuscation method that uses cross-mode (32-bit/64-bit) code on 64-bit Windows. Case studies with static and dynamic analysis tools show that this obfuscation technique is simple but effective.

0x01 Mode Conversion in 64-bit Windows


All 64-bit Windows operating systems are backward compatible with unmodified 32-bit applications through the WoW64 (Windows 32-bit on Windows 64-bit) subsystem. WoW64 uses three dynamic-link libraries (Wow64.dll, Wow64win.dll, and Wow64cpu.dll) to provide a compatibility layer between the 64-bit kernel and 32-bit applications {1}, within which the processor mode can be switched dynamically between 32-bit and 64-bit.

At any given time, the processor mode is determined by control bits in the descriptor of the current code segment (CS). Figure 1 is a schematic of a segment descriptor. The L bit (bit 21 of the descriptor's second doubleword) is the “64-bit code segment” flag: “A value of 1 indicates that instructions in this code segment will be executed in 64-bit mode, and a value of 0 indicates that instructions in this code segment will be executed in compatibility (32-bit) mode.” {2}
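As a rough illustration of how the L bit is read, the sketch below extracts the L and D/B flags from a raw 8-byte descriptor. The function name and the two sample descriptors are assumptions made for this example (typical flat user-mode code segments), not values dumped from a real Windows GDT.

```python
# Sketch: classify a code-segment descriptor by its L and D/B bits.
# The sample descriptors are illustrative, not taken from Windows.

def code_segment_mode(descriptor: int) -> str:
    """Return the execution mode implied by an 8-byte code-segment descriptor."""
    high = (descriptor >> 32) & 0xFFFFFFFF   # second doubleword of the descriptor
    l_bit = (high >> 21) & 1                 # L: 64-bit code segment flag
    db_bit = (high >> 22) & 1                # D/B: default operand size flag
    if l_bit:
        return "64-bit mode"
    if db_bit:
        return "compatibility mode (32-bit)"
    return "compatibility mode (16-bit)"

COMPAT_CODE = 0x00CFFB000000FFFF   # hypothetical flat 32-bit code segment: L=0, D=1
LONG_CODE = 0x0020FB0000000000     # hypothetical 64-bit code segment:      L=1, D=0

print(code_segment_mode(COMPAT_CODE))
print(code_segment_mode(LONG_CODE))
```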

Figure 1- Schematic diagram of segment descriptors {2}

Any application on 64-bit Windows, whether 32-bit or 64-bit, has at least two code segments available once it is loaded into memory. The two code segments map exactly the same 32-bit address space, but their segment selectors differ: one selector is 0x23, with L set to 0 (32-bit code segment); the other is 0x33, with L set to 1 (64-bit code segment). The “Machine” field in the application’s PE (Portable Executable) header marks the image as 32-bit or 64-bit and thereby determines the default code segment for that application.
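To make the two selector values less opaque, here is a small sketch that splits a selector into its architectural fields (descriptor table index, table indicator, and requested privilege level). The helper name is an assumption for this example.

```python
# Sketch: split an x86 segment selector into index / table / RPL.

def decode_selector(selector: int) -> dict:
    """Decode a 16-bit segment selector into its three fields."""
    return {
        "index": selector >> 3,                             # entry number in the table
        "table": "LDT" if (selector >> 2) & 1 else "GDT",   # TI bit
        "rpl": selector & 3,                                # requested privilege level
    }

for sel in (0x23, 0x33):
    print(hex(sel), decode_selector(sel))
```

Both selectors refer to GDT entries at ring 3; they differ only in the index (4 versus 6), that is, in which descriptor, and therefore which L bit, they select.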

A processor mode switch is triggered when a far-branch instruction transfers control flow from one code segment to another with a different L flag. For example, when a 32-bit application runs on 64-bit Windows, the default CS selector is 0x23 and the processor is in 32-bit mode. If a far jump targets a code segment whose selector is 0x33, the processor switches to 64-bit mode once the instruction has executed. The same applies in the other direction, when switching from 64-bit to 32-bit.
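As a minimal sketch of what such a far jump looks like at the byte level, the helper below builds the direct `jmp far ptr16:32` encoding (opcode EA) used from 32-bit code. The function name and the target offset 0x00401000 are assumptions for illustration only.

```python
import struct

# Sketch: encode a direct far jump (opcode EA) as seen by 32-bit code.

def far_jump_ptr16_32(selector: int, offset: int) -> bytes:
    """Encode `jmp far selector:offset` for 32-bit mode."""
    # Layout: EA, then the 4-byte little-endian offset, then the 2-byte selector.
    return b"\xEA" + struct.pack("<IH", offset, selector)

# Hypothetical target address; 0x33 selects the 64-bit code segment.
gate = far_jump_ptr16_32(0x33, 0x00401000)
print(gate.hex())
```

This direct form is not encodable in 64-bit mode, so the switch back to 32-bit has to use one of the other far branches, such as a far return or an indirect far jump.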

Several far-branch instructions can trigger a mode transition, including (1) far jump, (2) far call, (3) far return, and (4) interrupt return. These instructions act like portals between parallel worlds, allowing travel from one to the other whenever needed. Because the mode switch is this easy to control, obfuscation techniques can make use of cross-mode code, and this works in both 32-bit and 64-bit applications.
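A far return can serve the same purpose as a far jump: the code pushes the target selector and offset, then executes `retf`. The sketch below assembles that three-instruction sequence for 32-bit mode; the helper name and the example offset are assumptions, and the exact sequence used by the author's sample is not shown in the article.

```python
import struct

# Sketch: bytes for `push selector; push offset; retf`, assembled for 32-bit mode.

def far_return_switch(selector: int, offset: int) -> bytes:
    """Build a push/push/retf sequence that transfers to selector:offset."""
    if not 0 <= selector <= 0x7F:
        raise ValueError("the push imm8 form used here only covers small selectors")
    return (
        bytes([0x6A, selector])                 # push imm8  -> popped into CS
        + b"\x68" + struct.pack("<I", offset)   # push imm32 -> popped into EIP
        + b"\xCB"                               # retf: pops EIP first, then CS
    )

# Hypothetical target address in the 64-bit code segment.
stub = far_return_switch(0x33, 0x00401000)
print(stub.hex())
```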

The one caveat concerns 64-bit applications: the memory addresses used in 64-bit mode need to stay below 2 GB; otherwise, addressing conflicts will occur after switching to 32-bit mode.

0x02 The Concept of Cross-Mode Obfuscation


All IA instructions can be divided into two categories: cross-mode compatible instructions and cross-mode incompatible instructions. An instruction is cross-mode compatible if its machine code is valid in both 32-bit and 64-bit mode and has exactly the same meaning in both; otherwise it is cross-mode incompatible.

Intel’s Software Developer’s Manual lists the details of each instruction and the modes in which it is valid. If both the “64-bit Mode” field and the “Compat/Leg Mode” (32-bit) field in the instruction table read “Valid” (as for “MOV r32, imm32” in Figure 2), the instruction is compatible. If either field is not “Valid,” the instruction is incompatible. Both types of instructions can be used for cross-mode obfuscation.

Figure 2- Example of compatible and incompatible instructions

2.1 Compatible Instructions

Although compatible instructions are encoded and interpreted the same way in 32-bit and 64-bit mode, their results can still differ because of the differences between the two modes. A simple example is shown in Figure 3. As the figure shows, disassembling the same sequence of instructions at run time yields exactly the same output in 32-bit and 64-bit mode.

Figure 3- Example execution of compatible instructions in different modes

The “call” simply targets the next instruction, “mov ebx, esp,” so the code block only measures how many bytes the stack pointer moved across the “call.” Because the return address pushed by “call” differs in size, the result in EAX is 4 in 32-bit mode and 8 in 64-bit mode. This shows that the result of the same instruction sequence depends on the mode, which means such sequences can be used to obfuscate data or control flow against analysis tools that assume a single mode.
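The effect can be modelled in a few lines. The sketch below is not the code from Figure 3; it assumes a sequence of the form `mov eax, esp; call next; next: mov ebx, esp; sub eax, ebx`, which matches the description above, and simulates only the stack pointer. The function name and the starting stack address are made up for the example.

```python
# Sketch: model the stack-pointer probe in each mode (assumed instruction sequence).

RETURN_ADDRESS_SIZE = {32: 4, 64: 8}   # bytes pushed by a near call in each mode

def eax_after_probe(mode: int, stack_pointer: int = 0x0012FF00) -> int:
    """Return the value left in EAX by the probe sequence in the given mode."""
    eax = stack_pointer                          # mov eax, esp
    stack_pointer -= RETURN_ADDRESS_SIZE[mode]   # call next (pushes the return address)
    ebx = stack_pointer                          # next: mov ebx, esp
    return (eax - ebx) & 0xFFFFFFFF              # sub eax, ebx

print(eax_after_probe(32), eax_after_probe(64))
```

A program can branch on this value, or fold it into a decryption key, so that a tool which assumes the wrong mode follows the wrong path or derives the wrong data.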

2.2 Incompatible Instructions

Obfuscation with incompatible instructions is more straightforward, since such instructions are valid in only one mode. An incompatible 64-bit instruction is either treated as an invalid instruction or interpreted as a completely different instruction in 32-bit mode, and vice versa (Figure 4), which causes problems for analysis tools that do not handle dynamic mode transitions.
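A classic instance is the byte range 0x40 to 0x4F, which encodes `inc`/`dec` in 32-bit mode but acts as a REX prefix in 64-bit mode. The toy decoder below is a sketch written for this article's example only: it understands just those bytes plus register-to-register `mov` (opcode 89), and its names are assumptions rather than any real disassembler's API.

```python
# Sketch: a deliberately tiny decoder showing one byte string read in two modes.

REGS32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"] + [f"r{i}d" for i in range(8, 16)]
REGS64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"] + [f"r{i}" for i in range(8, 16)]

def toy_disassemble(code: bytes, mode: int) -> list:
    """Decode inc/dec r32, REX prefixes and `mov r/m, r` (register form only)."""
    out, i, rex = [], 0, 0
    while i < len(code):
        op = code[i]
        if 0x40 <= op <= 0x4F:
            if mode == 64:
                rex = op              # REX prefix: applies to the next instruction
                i += 1
                continue
            name = "inc" if op < 0x48 else "dec"
            out.append(f"{name} {REGS32[op & 7]}")
            i += 1
        elif op == 0x89 and i + 1 < len(code) and code[i + 1] >> 6 == 3:
            modrm = code[i + 1]
            reg = ((modrm >> 3) & 7) | (((rex >> 2) & 1) << 3)   # REX.R extends reg
            rm = (modrm & 7) | ((rex & 1) << 3)                  # REX.B extends r/m
            names = REGS64 if rex & 8 else REGS32                # REX.W: 64-bit operands
            out.append(f"mov {names[rm]}, {names[reg]}")
            i += 2
        else:
            raise ValueError(f"byte {op:#04x} is outside this toy decoder")
        rex = 0
    return out

sample = bytes.fromhex("4889e0")
print(toy_disassemble(sample, 32))
print(toy_disassemble(sample, 64))
```

The same three bytes decode as two instructions (`dec eax` followed by `mov eax, esp`) in 32-bit mode and as the single instruction `mov rax, rsp` in 64-bit mode, which is the kind of divergence a single-mode disassembler cannot follow.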

Figure 4- Example execution of incompatible instructions in different modes

0x03 Case Study of Cross-Mode Obfuscation


A very simple example demonstrates the effectiveness of cross-mode obfuscation. Figure 5 is a diagram of the cross-mode code, which begins by executing 32-bit native instructions. The processor is then switched to 64-bit mode by a far jump to CS 0x33 and continues with 64-bit native instructions (which are incompatible with 32-bit mode). The 64-bit code decodes another block of 32-bit native instructions, which run after the processor is switched back to 32-bit mode by a far return. Finally, this 32-bit code decrypts the secret key.

Figure 5- Schematic diagram of an example program using cross-mode code

Figure 6 shows the result of running this cross-mode code on the command line. The segment selector values printed in hexadecimal confirm that the processor did switch from 32-bit to 64-bit, and from 64-bit back to 32-bit.

Figure 6- Cross-mode code sample executed on the command line

3.1 Static Analysis Tools

When the cross-mode code above was analyzed with 32-bit IDA Pro, IDA kept disassembling the 64-bit native code as 32-bit instructions even after the mode switch performed by the far jump, so the disassembly deviated from the actual 64-bit instructions (Figure 7).

Figure 7- Comparing IDA Pro runtime disassembly with static disassembly

With the 64-bit version of IDA Pro, IDA still analyzes this cross-mode code in 32-bit mode, because it automatically identifies the file as a 32-bit application. Once the mode is switched by the far jump, the disassembly breaks down and follows the control flow to the wrong address (Figure 8).

Figure 8- The static disassembly of 64-bit IDA Pro shows the wrong far jump target address

In addition to the widely used IDA, we tested our cross-mode code with another static analysis tool, Radare2. Radare2 is an open source command-line framework that supports reverse engineering on multiple platforms. Like IDA, Radare2 cannot properly disassemble 64-bit native code in 32-bit mode, and cannot properly disassemble 32-bit native code in 64-bit mode.

3.2 Dynamic Analysis Tools

Dynamic binary instrumentation (DBI) is a widely used technique in runtime code analysis. In this study, we chose two commonly used DBI tools, DynamoRIO and Pin, to run the test program.

Neither tool can handle the run-time mode switch from 32-bit to 64-bit, and both crash when the mode switch is performed by the far jump. Figure 9 shows that neither DynamoRIO nor Pin gets as far as printing the value of the segment selector in 64-bit mode.

Figure 9- Cross-mode code executed (a) on the command line, (b) under DynamoRIO, (c) under Pin

DynamoRIO crashed in a similar way when running a 64-bit application that needs to switch to 32-bit mode, while Pin simply stopped with an error message stating that a far return to a different code segment is not supported.

Interestingly, when only compatible instructions are used, a DBI tool may sometimes keep running after the far jump that is supposed to switch modes, but the mode is not actually switched. Figure 10 shows DynamoRIO running the code built from compatible instructions discussed in 2.1. The program runs past the mode switch point, but instead of actually switching to code segment 0x33, it stays in CS 0x23. The return value in EAX is 4 instead of 8, which differs from the result of direct execution and thus creates an opportunity to detect the DBI environment.

Figure 10- Cross-mode compatible code executed (a) on the command line, (b) under DynamoRIO

3.3 Debugging Tools

We also found that cross-mode code causes problems for common debuggers such as OllyDbg and WinDbg when the code being debugged includes run-time mode switching. 32-bit OllyDbg (a 64-bit version is not currently available) cannot continue debugging after the switch from 32-bit to 64-bit mode. 64-bit WinDbg can run the cross-mode code correctly, but single-stepping through the code after a mode switch runs into problems.

0x04 Summary


Cross-mode code can be a very effective obfuscation technique that takes advantage of the backward compatibility of 64-bit Windows with 32-bit programs. Currently, most static analysis tools, DBI tools, and debuggers assume a single processor mode, and they have trouble with code that switches modes at run time. In addition, with a little care at design time, cross-mode code can be used to detect DBI environments.

In theory, the cross-mode problem can be solved if these analysis tools are adapted to handle dynamic mode switching, but doing so requires considerable engineering effort.

0x05 Acknowledgments


Thanks to Xiaoning Li, Asia-Europe for contributing to this study and review.

0x06 References


  • {1}en.wikipedia.org/wiki/WoW64
  • {2}Intel® 64 and IA-32 Architectures Software Developer’s Manual
  • {3}The Performance Cost of Shadow Stacks and Stack Canaries. Thurston H.Y. Dang, Petros Maniatis, David Wagner. ASIACCS 2015
  • {4}Counterfeit Object-oriented Programming: On the Difficulty of Preventing Code Reuse Attacks in C++ Applications. Felix Schuster, Thomas Tendyck, Christopher Liebchen, Lucas Davi, Ahmad-Reza Sadeghi, Thorsten Holz. 36th IEEE Symposium on Security and Privacy (Oakland), May 2015
  • {5}Exploring Control Flow Guard in Windows 10. Jack Tang. Trend Micro Threat Solution Team, 2015
  • {6}ROP is Still Dangerous: Breaking Modern Defenses. Nicholas Carlini and David Wagner. 23rd USENIX Security Symposium, Berkeley 2014
  • {7}Windows 10 Control Flow Guard Internals. MJ0011, POC 2014
  • {8}Write Once, Pwn Anywhere. Yu Yang. Blackhat 2014
  • {9}Hardware-Assisted Fine-Grained Control-Flow Integrity: Towards Efficient Protection of Embedded Systems Against Software Exploitation. Lucas Davi, Patrick Koeberl, and Ahmad-Reza Sadeghi. DAC 2014
  • {10}Transparent ROP Exploit Mitigation Using Indirect Branch Tracing. Vasilis Pappas, Michalis Polychronakis, and Angelos D. Keromytis. Columbia University, 22nd USENIX Security Symposium 2013
  • {11}kBouncer: Efficient and Transparent ROP Mitigation. Vasilis Pappas. Columbia University 2012
  • {12}Security Breaches as PMU Deviation: Detecting and Identifying Security Attacks Using Performance Counters. Liwei Yuan, Weichao Xing, Haibo Chen, Binyu Zang. APSYS 2011
  • {13}Transparent Runtime Shadow Stack: Protection against malicious return address modifications. Saravanan Sinnadurai, Qin Zhao, and Weng-Fai Wong. 2008
  • {14}Control-flow Integrity Principles, Implementations, and Applications. Martin Abadi, Mihai Budiu, Ulfar Erlingsson, Jay Ligatti. CCS2005
  • {15}IROP — Interesting ROP Gadgets, Xiaoning Li/Nicholas Carlini, Source Boston 2015