This article analyzes the binary rearrangement solution, based on the Douyin team's blog and other material (hooking all methods, blocks, and functions through Clang instrumentation)

Zero, related concepts

  1. Virtual memory: Virtual memory was introduced to make memory access both efficient and secure. Each process has its own virtual address mapping table, and the real memory address is found through that table.
    • Security: Each process has its own separate virtual address space, so a process cannot reach another process's data by addressing physical memory directly
    • Efficiency: The system maps virtual memory to physical memory page by page, so only the data a process actually uses needs to be loaded into physical memory
  2. Address Space Layout Randomization (ASLR): A random offset is added to the starting address of the process in virtual memory on every launch, which makes the addresses of process data harder to predict and therefore more secure
  3. Paging: Like physical memory, virtual memory is managed in pages, and on iOS each page is 16 KB
  4. Page Fault: When a process accesses a virtual memory page that has not been loaded into physical memory, the operating system blocks the current thread, loads the page into physical memory, and maps it to the virtual page. This blocking process is called a Page Fault.

First, how binary rearrangement optimizes startup

During App startup, a large amount of code and data is loaded into memory, which triggers many Page Faults, and a large number of Page Faults makes startup slower. If the symbols needed during startup are placed next to each other so that they fit on as few pages as possible, fewer pages have to be loaded and the startup time drops. Placing the startup-related symbols together in this way is called binary rearrangement.

How to detect Page Faults

We can check the number of Page Faults using App Launch in Instruments

How to perform binary rearrangement and how to view the rearrangement effect

We can have Xcode generate a Link Map file at build time by setting Write Link Map File = YES in Build Settings

Clean the project, then Build, and open the xxxx.txt file found at Path to Link Map File; this is the Link Map file. In its Symbols section, the symbols are listed in their current link order

This order follows the file order of Compile Sources in Build Phases

We write the last few symbols into an Order file

Point the Order File setting in Xcode at this file so that the binary reordering takes effect

Then compile again to see the new symbol order

As you can see, the symbols are now in the same order as in our Order file. At this point we have implemented binary rearrangement
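For reference, an Order file is a plain text file with one symbol name per line, written the same way the names appear in the Link Map. The symbol names below are hypothetical placeholders, not taken from the test project:

```text
_main
+[AppDelegate load]
-[AppDelegate application:didFinishLaunchingWithOptions:]
-[ViewController viewDidLoad]
```

The linker places these symbols first, in the listed order, and lays out the remaining symbols after them.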

For those who worry that Apple will reject an app that uses an Order file: Apple does allow it. As the objc source code shows, Apple itself uses an Order file, libobjc.order

Four, how to find out which symbols are loaded when the App starts?

We now know how to do binary rearrangement. If we find all the symbols that are loaded when the App starts and write them into the Order file, we can do what the first section described: put the startup symbols together and optimize App startup. So let's look at how to find out which symbols are loaded when the App starts

  1. The ByteDance team's article uses fishhook to hook objc_msgSend. The disadvantage is that this scheme can only hook OC methods, not C functions, blocks, etc.
  2. Here we present a way to achieve full coverage through Clang instrumentation. The ByteDance team also mentioned at the end of their blog that they are trying to reach 100% coverage with this approach

Following the official Clang documentation, we first set up the Clang instrumentation

  1. Add -fsanitize-coverage=trace-pc-guard to the Clang compiler flags in Build Settings

  2. The build now fails with an error; to fix it, copy the two C functions provided in the official documentation into the project
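#include <stdint.h>
#include <stdio.h>

// Called once at startup with the bounds of the guard array.
void __sanitizer_cov_trace_pc_guard_init(uint32_t *start, uint32_t *stop) {
  static uint64_t N;  // Counter for the guards.
  if (start == stop || *start) return;  // Initialize only once.
  printf("INIT: %p %p\n", start, stop);
  for (uint32_t *x = start; x < stop; x++)
    *x = ++N;  // Guards should start from 1.
}

// Called every time an instrumented piece of code runs.
void __sanitizer_cov_trace_pc_guard(uint32_t *guard) {
  if (!*guard) return;  // Skip guards whose value is 0.
  // The official sample fills PcDescr by calling __sanitizer_symbolize_pc();
  // that call is left out here, so PcDescr stays an empty string.
  char PcDescr[1024] = "";
  printf("guard: %p %x PC %s\n", guard, *guard, PcDescr);
}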
  1. When we run the program, we find that both functions are called and their output is printed
  2. printf("INIT: %p %p\n", start, stop); prints the start and end addresses of the guard array. Reading the memory between these two addresses in LLDB lets us verify the number of symbols (C functions, OC methods, blocks, etc.); it is better to create an empty project for this, so that there are fewer symbols
  3. printf("guard: %p %x PC %s\n", guard, *guard, PcDescr); prints the address and the value of the guard that was hit
  1. We put a breakpoint on main and, looking at the assembly, we see that when main executes it jumps to the __sanitizer_cov_trace_pc_guard function.

  1. When __sanitizer_cov_trace_pc_guard finishes executing, it jumps back to main using the B instruction

  1. __sanitizer_cov_trace_pc_guard is called for every symbol
  2. This approach can hook C functions, OC methods, and blocks. We will not demonstrate each case here; you can add them to your own code and confirm it with breakpoints
  1. Since the __sanitizer_cov_trace_pc_guard function hooks all symbols, we can obtain every symbol inside it, with the help of some assembly knowledge and a system library
  1. LR (X30) register: The LR register stores the return address of the function
  2. __builtin_return_address(0) returns the return address of the current function, which is the value of the LR register
  3. If B is called from A, then __builtin_return_address(0) inside B returns the address B will return to, which lies inside function A
  4. Clang inserts a jump to __sanitizer_cov_trace_pc_guard into every symbol, so inside __sanitizer_cov_trace_pc_guard the return address always lies in the symbol that made the call. Collecting these return addresses therefore covers all symbols
  5. The system provides dladdr(PC, &info), which finds the function symbol that an address belongs to
  6. The result is returned in a Dl_info structure:
     typedef struct dl_info {
         const char *dli_fname;  // Path of the image containing the address
         void       *dli_fbase;  // Base address of that image
         const char *dli_sname;  // Name of the current symbol
         void       *dli_saddr;  // Current symbol address
     } Dl_info;
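  1. We modify the code in __sanitizer_cov_trace_pc_guard to print all the symbol names
#include <dlfcn.h>
#include <stdint.h>
#include <stdio.h>

void __sanitizer_cov_trace_pc_guard(uint32_t *guard) {
    // if (!*guard) return;  // The load method would return here, so this check is commented out for now
    void *PC = __builtin_return_address(0);  // Return address, which lies inside the calling symbol
    Dl_info info;
    // dladdr returns 0 on failure, and dli_sname may be NULL when no symbol matches
    if (dladdr(PC, &info) == 0 || info.dli_sname == NULL) return;
    printf("guard sname=%s\n", info.dli_sname);
}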

Write our printed symbols to the Order file

Use the new Order file and check the Page Faults again. If the Order file contains a wrong entry, the linker ignores that entry automatically, so there is no need to worry about it

As you can see, the number of Page Faults drops noticeably (204 -> 152). My test project does not have many files, so the optimization effect is modest. The ByteDance team reported an improvement of 15% or more.

Six, summary

This completes the App startup optimization based on binary rearrangement. If anything here is wrong, please leave a comment. You are also welcome to share your own data after optimization

My test project is a pure OC project; a mixed OC and Swift project needs additional settings and further research

Attached is a link to the basics: iOS Startup Optimization – Basics