The application loading process

Unpacking an app is an essential step for many reverse-engineering enthusiasts. For security reasons, Apple encrypts every app distributed through the App Store, so if you disassemble an app downloaded from the App Store, the “source code” you see is unintelligible. To see the real “source code” of an application, you first have to decrypt it, which is known as unpacking (or removing the shell). The goal is to analyze how the application implements certain techniques, or to find vulnerabilities for attack and security testing.

This article is not a tutorial on how to use a particular unpacking tool; it is a brief analysis of how the commonly used unpacking tools work. To understand how unpacking works, you first need to understand how an encrypted (packed) application runs. The following image shows a simplified view of how a packed application is loaded and run:

Unpacking principles and common tools

There are only two ways to unpack a packed application: static unpacking and dynamic unpacking. Static unpacking decrypts the packed application without running it, which requires that you already know and understand the encryption algorithm and logic used by the shell. This is difficult, and once an application has been cracked the encrypting party may switch to more advanced and complex encryption. Dynamic unpacking starts from the executable image of the running process and dumps its contents from memory. This approach is relatively simple to implement, and you do not need to care which encryption technology is used. As the loading process above shows, no matter how the packed program is encrypted, the code image in the process once it is running is always the decrypted binary of the original program. So as long as the code image in a process's memory space can be read, dynamic unpacking is possible. The following two kinds of tools use two different memory-access techniques to implement dynamic unpacking.

First, unpacking through dynamic library injection: dumpdecrypted / frida-ios-dump

dumpdecrypted and frida-ios-dump are both open source projects on GitHub and can be downloaded at github.com/stefanesser… There is plenty of documentation on how to use these two tools for unpacking. An application, in addition to its executable, also links against many dynamic libraries. Once loaded, a dynamic library shares the same process address space as the executable, and code in the library can access any permitted region of that address space, including the region where the executable's image is loaded. Therefore, if the application can be made to load a specific third-party dynamic library, that is, if the library is injected into the application's process, the library can dump the decrypted in-memory image of the executable to a file, which accomplishes the unpacking. On a jailbroken device, there are two main ways to inject a third-party dynamic library:

  1. Set the environment variable DYLD_INSERT_LIBRARIES to the path of the third-party dynamic library, then run the application that you want to unpack. DYLD_INSERT_LIBRARIES is a feature provided by the operating system's dynamic loader (dyld): a program launched with this variable set loads the dynamic library it points to.

  2. Place the third-party dynamic library in the /Library/MobileSubstrate/DynamicLibraries/ directory on the jailbroken device (the directory where tweak plug-ins live), together with a plist file of the same name. Every executable specified in the plist loads the library as soon as it runs.

A third way to inject a dynamic library is to modify the Mach-O executable file directly so that it loads the library (for example, by adding a load command for the library).

With library loading solved, the next problem is when the library's code gets to run. There are four ways to make a dynamic library run a piece of code automatically when it is loaded:

  1. Create a C++ global object and add specific code to the constructor of the class to which the object belongs.

  2. Create an Objective-C class and add specific code to the class’s +load method.

  3. Specify an init entry function when building the dynamic library (the linker's -init option), and add specific code to the entry function.

  4. Define a function declared with __attribute__((constructor)) in the dynamic library and add specific code to the function.
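The fourth approach is the easiest to try in isolation. The following is a minimal sketch, not code from any of the tools discussed here; the names g_initializer_ran, on_image_load and initializer_ran are hypothetical. It relies on the GCC/Clang constructor attribute, which makes the function run when the image containing it is loaded, before main() is entered.

```c
#include <stdio.h>

/* Hypothetical flag: set by the constructor so that later code can
 * check whether the initializer really ran. */
static int g_initializer_ran = 0;

/* Runs automatically when the image containing it is loaded.
 * In an injected library, this is where the dump logic would start. */
__attribute__((constructor))
static void on_image_load(void) {
    g_initializer_ran = 1;
    fprintf(stderr, "[injected] constructor ran at load time\n");
}

/* Hypothetical accessor used to observe the effect of the constructor. */
int initializer_ran(void) {
    return g_initializer_ran;
}
```

Built into a dynamic library and injected with one of the two methods above, on_image_load would execute inside the target process without the target ever calling it explicitly.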

If you want to learn more about how these methods work, please refer to my article: Deconstructing global objects and initialization functions in iOS

The dumpdecrypted tool uses the last approach: it builds a dynamic library named dumpdecrypted.dylib that defines the following function:

__attribute__((constructor))
void dumptofile(int argc, const char **argv, const char **envp, const char **apple, struct ProgramVars *pvars)

This function performs the unpacking. Its general implementation is described below.

Second, unpacking through a parent-child process relationship: Clutch

Clutch is also an open source project on GitHub, at github.com/KJCracks/Cl… There are plenty of tutorials on how to use the tool. On Unix, a parent process can use fork or posix_spawnp to create and run a child process, and both give the parent the child's process ID (PID). On iOS, the task_for_pid function takes a process ID and returns the Mach port (task port) that identifies that process in the Mach kernel subsystem. With the task port in hand, the mach_vm_read_overwrite function can read the contents of any virtual memory region in the target process. So Clutch works like this: it calls posix_spawnp on the path of the executable to be unpacked, which makes that program its child process, and then uses task_for_pid and mach_vm_read_overwrite to read the memory region where the decrypted image of the executable is mapped, which accomplishes the unpacking.
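The sequence described above can be sketched in C. This is an illustration under stated assumptions, not Clutch's actual code: the helper names spawn_child, wait_exit_code and read_child_memory are hypothetical, the Mach part only compiles on Apple platforms, and task_for_pid only succeeds when the caller has sufficient privileges. A real tool would also have to stop the child before it runs too far, which is omitted here.

```c
#include <spawn.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/wait.h>

#ifdef __APPLE__
#include <mach/mach.h>
#include <mach/mach_vm.h>
#endif

extern char **environ;

/* Launch `path` as a child of the current process.
 * Returns the child's PID, or -1 if the spawn failed. */
pid_t spawn_child(const char *path, char *const argv[]) {
    pid_t pid;
    int rc = posix_spawnp(&pid, path, NULL, NULL, argv, environ);
    return rc == 0 ? pid : -1;
}

/* Wait for the child to finish and return its exit code,
 * or -1 if it did not exit normally. */
int wait_exit_code(pid_t pid) {
    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status)) {
        return -1;
    }
    return WEXITSTATUS(status);
}

#ifdef __APPLE__
/* Copy `len` bytes starting at `addr` in the child's address space
 * into `buf`. Returns 0 on success, -1 on failure. */
int read_child_memory(pid_t pid, mach_vm_address_t addr,
                      void *buf, mach_vm_size_t len) {
    mach_port_t task;
    /* PID -> Mach task port for the target process. */
    if (task_for_pid(mach_task_self(), pid, &task) != KERN_SUCCESS) {
        return -1;
    }
    mach_vm_size_t copied = 0;
    kern_return_t kr = mach_vm_read_overwrite(
        task, addr, len, (mach_vm_address_t)(uintptr_t)buf, &copied);
    return (kr == KERN_SUCCESS && copied == len) ? 0 : -1;
}
#endif
```

The design point is that the parent never needs to know how the child was encrypted: it only needs the PID returned by the spawn call and the address range of the mapped image.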

One question to consider: a parent-child relationship may not actually be necessary. Could a privileged program, or one running as root, use the Mach subsystem APIs to read the memory of another process given only that process's PID?

Both approaches, dumpdecrypted and Clutch, end up writing the in-memory image of the decrypted executable to a file, and that file is the unpacked result. If you read the dumptofile implementation in dumpdecrypted and the Dumpers directory in Clutch, you will see that the layout of an executable image mapped in memory is basically the same as the layout of the Mach-O executable file: both consist of a mach_header followed by a series of load_command structures and the data they describe. The dump process therefore amounts to writing these in-memory structures and data to a file intact, which is the core of unpacking. Before reading this code closely, it helps to be familiar with the Mach-O file format.
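To make the header-plus-load-commands layout concrete, here is a minimal sketch that walks the load commands of a 64-bit Mach-O image held in a buffer and looks for the encryption-info command, which records where the encrypted range starts (cryptoff), how long it is (cryptsize) and whether it is still encrypted (cryptid). The struct and constant names (mh64, load_cmd, enc_info64, MH_MAGIC_64_VALUE, LC_ENCRYPTION_INFO_64_VALUE) are hypothetical, hand-written copies that mirror the definitions in <mach-o/loader.h> so the example compiles on any platform. It is not the code of dumptofile or of Clutch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MH_MAGIC_64_VALUE           0xfeedfacfu /* mirrors MH_MAGIC_64 */
#define LC_ENCRYPTION_INFO_64_VALUE 0x2Cu       /* mirrors LC_ENCRYPTION_INFO_64 */

/* Hand-written mirror of struct mach_header_64. */
typedef struct {
    uint32_t magic;
    int32_t  cputype;
    int32_t  cpusubtype;
    uint32_t filetype;
    uint32_t ncmds;      /* number of load commands that follow */
    uint32_t sizeofcmds; /* total size of all load commands */
    uint32_t flags;
    uint32_t reserved;
} mh64;

/* Hand-written mirror of struct load_command. */
typedef struct {
    uint32_t cmd;
    uint32_t cmdsize;
} load_cmd;

/* Hand-written mirror of struct encryption_info_command_64. */
typedef struct {
    uint32_t cmd;
    uint32_t cmdsize;
    uint32_t cryptoff;  /* file offset of the encrypted range */
    uint32_t cryptsize; /* size of the encrypted range */
    uint32_t cryptid;   /* 0 means not encrypted */
    uint32_t pad;
} enc_info64;

/* Walk the load commands of the image in `image`.
 * Returns 0 and fills `out` if an encryption-info command is found,
 * 1 if the image has none, and -1 if the image is malformed. */
int find_encryption_info(const uint8_t *image, size_t size, enc_info64 *out) {
    mh64 header;
    if (size < sizeof header) return -1;
    memcpy(&header, image, sizeof header);
    if (header.magic != MH_MAGIC_64_VALUE) return -1;

    size_t offset = sizeof header; /* load commands start right after the header */
    for (uint32_t i = 0; i < header.ncmds; i++) {
        load_cmd lc;
        if (size - offset < sizeof lc) return -1;
        memcpy(&lc, image + offset, sizeof lc);
        if (lc.cmdsize < sizeof lc || lc.cmdsize > size - offset) return -1;

        if (lc.cmd == LC_ENCRYPTION_INFO_64_VALUE) {
            if (lc.cmdsize < sizeof *out) return -1;
            memcpy(out, image + offset, sizeof *out);
            return 0;
        }
        offset += lc.cmdsize; /* each command states its own size */
    }
    return 1;
}
```

A dumper built on this idea would write the bytes in the range described by cryptoff and cryptsize from the decrypted in-memory image instead of from the encrypted file, and would typically reset cryptid to 0 in the output so that the result is treated as unencrypted.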

Afterword

Once you have looked at these internals, you may find that the principles are quite simple, and chances are you could implement them yourself. But the question is: why do these ideas always occur to others and not to us? Is it related to the way Chinese people think about and solve problems? Our education and working culture lean toward utilitarianism and pragmatism, and few people explore a problem in depth or think about how problems relate to one another. I hope this will change, especially among programmers, who should have a strong desire to explore and learn rather than simply copy and apply.

Finally, I would like to thank Peiqing Liu, author of iOS Application Reverse Engineering and Security. I wrote this article after asking him about some reverse-engineering topics. I recommend the book to anyone interested in reverse engineering.