The original is from my blog

What’s New

  • Last updated November 19, 2020
  • First updated September 13, 2020

Preface

This article walks through the interesting things that happen when an iOS or macOS program calls an external symbol (a function in a dynamic library). It assumes some understanding of the Mach-O format, address offsets, and virtual memory. For background, refer to this article on parsing the Mach-O format.

I’ve also attached the source code, the compiled binaries, and the MachOView tool so that you can follow along step by step. Trust me, you can’t really understand this on paper alone, and most articles on the web stay on paper. That hands-on walkthrough is the point of this article.

#include <stdio.h>
#include <stdlib.h>

const char* str1 = "Hello, World\n";
const char* str2 = "Hello, Boy\n";

static void static_say(void) {
    printf("static hello\n");
}

void say(void) {
    printf("hello\n");
}

int main(int argc, const char * argv[], char **envp, char **apple) {
    // insert code here...
    printf("%s", str1);
    
    char *tiny = malloc(sizeof(int));
    free(tiny);
    
    say();
    static_say();
    
    return 0;
}

The code is simple and exists mainly to exercise the printf function. On iOS and macOS, printf is provided by a dynamic library, so you can observe the symbol binding that dyld performs for the process. The compiled executables and tools are linked at the end of this article.

Concept 1: Stubs

To understand rebase and bind, you must first understand stubs. In the __TEXT segment (the code area), the compiler generates a stub for each dynamic library symbol. Each stub takes 6 bytes (on x86_64 builds only) and contains a single jmpq instruction. When your code calls such a function, the compiled call instruction actually targets that function’s stub.

Set a breakpoint on the source line printf("%s", str1); and then enable the Xcode menu item Debug -> Debug Workflow -> Always Show Disassembly so that you can step through the assembly code. Enter si in LLDB to step one instruction at a time.

As you can see from the figure above, a call to the printf symbol jumps to the stub for printf (the stub acts as an intermediary). Stepping one instruction at a time (si) lands at address f50.

The stub bytes FF25 BA10 disassemble to jmpq *0x10ba(%rip): FF25 is the opcode for an indirect jmpq through a pointer, and BA10 is the start of the displacement, which is stored in little-endian byte order and therefore reads as 0x10ba. While this instruction executes, rip (the instruction pointer) already points to the next instruction, f50+6=f56, and f56+10ba=0x2010. So the instruction jumps to the value stored at address 0x2010 (0x2010 is a memory address that is dereferenced like a pointer). Combining MachOView with the disassembly, you can see that the first call to the _printf symbol jumps to address f7c.

Concept 2: Rebase

This is where rebase comes in. Because of address space layout randomization (ASLR), the actual virtual address would normally not be a fixed base plus f7c. It only looks that way here because Xcode runs the program in debug mode with the process start address fixed at 0x100000000, that is, 0+4GB (the 4GB is the __PAGEZERO trap area). Normally, after dyld loads the executable into memory, it adjusts every pointer that points into the image itself. For example, the address f7c becomes XXXX+f7c, where XXXX is the randomized start of the image. Because the pointer used for _printf cannot be determined until the process is loaded, it is placed in the writable __DATA segment so that it can be modified. This adjustment is called rebase.

Concept 3: Bind

Before bind, the __la_symbol_ptr entry stores an address inside __stub_helper. After bind, the real address of the symbol is written to the __la_symbol_ptr entry.

As you can see from the figure above, before bind, address 0x2010 does hold the value f7c, so when stepping through the stub, the program did jump to f7c.

Continuing to trace the address, you can see that f7c lies in the (__TEXT, __stub_helper) section, which is part of the code area. MachOView shows the assembly code directly.

The code pushes the constant 0x1A onto the stack (this is the offset into Dynamic Loader Info -> Lazy Binding Info -> Actions) and then jumps to address 0xf58, which is the first row of the figure above. If you debug the assembly directly, you can see Xcode’s comments.

Xcode annotates two symbols for us. dyld_stub_binder is a function in a dynamic library whose address has already been bound. Note that it is a non-lazy symbol: dyld looks it up and binds it when it loads the process. The symbol can be found in the (__DATA_CONST, __got) section.

In the debugger, read the memory at address 0x100001000 directly: the content is not 0 but the real address 0x7FFF6EF89578.

As you can see, these lines enter dyld_stub_binder, which binds the _printf symbol from the relevant dynamic library. Note that _printf is a lazily bound symbol, so once dyld_stub_binder has run, the real address is written to the pointer at 0x2010 (in __la_symbol_ptr). The next time _printf is called, the value at 0x2010 is no longer f7c but the actual symbol address.

On the second call to _printf, the stub jumps straight to the symbol’s real address in the dynamic library. The LLDB command memory read 0x100002010 shows the real address now stored there.

At this point, the lazily bound symbol has been successfully bound. To speed up startup while keeping dynamic libraries flexible, the system designers came up with the simple and clever technique of lazy binding.

Other problems

This article leaves the following issues unresolved.

  1. How is the address of a symbol found in a dynamic library from that library’s symbol table?

  2. How does the dyld_stub_binder function work?

To be continued

Download related tools

MachOView