Here’s an article from Noah Martin, who found an unannounced change from WWDC 2021 that will help your app launch faster on iOS 15.

One of the most intriguing features of WWDC21 is buried deep in the Xcode 13 release notes:

All programs and dylibs built with a deployment target of macOS 12 or iOS 15 or later now use the new chained fixups format. This new format uses different load commands and LINKEDIT data, and will not run or load on older versions of the operating system.

There is no documentation or session covering this change, but we can reverse engineer it to see what Apple is doing differently in the new operating system and whether it will help your apps.

First, a little background on the program that controls app startup: dyld.

Meet dyld

The dynamic linker (dyld) is the entry point of every app. It is responsible for getting your code ready to run, so any improvement to dyld improves your app's launch time. Before calling main, running static initializers, or setting up the Objective-C runtime, dyld performs fixups. These consist of rebase and bind operations, which modify pointers in the app binary so that all addresses are valid at run time. To see these operations, you can use the dyldinfo command-line tool.

% xcrun dyldinfo -rebase -bind Snapchat.app/Snapchat
rebase information (from compressed dyld info):
segment section          address     type
__DATA  __got            0x10748C0C8  pointer
...
bind information:
segment section address     type    addend dylib        symbol
__DATA  __const 0x107595A70 pointer 0      libswiftCore _$sSHMp

The output above means that the address 0x10748C0C8 is in __DATA/__got and needs to be shifted by a constant value (known as the slide), while the address 0x107595A70 is in __DATA/__const and should point to the protocol descriptor for Hashable[1] in libswiftCore.dylib.
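As a rough illustration of the two operations, here is a minimal Swift sketch. The `Fixup` enum, the `resolve` function, and the `exports` dictionary are hypothetical names invented for this example; they model the idea only and are not dyld's real data structures.

```swift
/// Simplified model of the two fixup kinds (an assumption for illustration).
enum Fixup {
    case rebase(unslidTarget: UInt64)          // pointer into this binary
    case bind(symbol: String, addend: UInt64)  // pointer into another dylib
}

/// Returns the value that would be written into the pointer slot,
/// or nil if a bound symbol cannot be found.
func resolve(_ fixup: Fixup, slide: UInt64, exports: [String: UInt64]) -> UInt64? {
    switch fixup {
    case .rebase(let unslidTarget):
        // A rebase shifts the stored address by the slide.
        return unslidTarget &+ slide
    case .bind(let symbol, let addend):
        // A bind looks up the symbol exported by a dependency.
        guard let address = exports[symbol] else { return nil }
        return address &+ addend
    }
}
```

With a slide of 0x4000, for example, a rebase of 0x10748C0C8 resolves to 0x10748C0C8 + 0x4000, while a bind of `_$sSHMp` resolves to whatever address the exporting library provides.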

dyld uses the LC_DYLD_INFO load command and the dyld_info_command struct to determine the location and size of the rebases, binds, and exported symbols[2] in a binary. Emerge (disclaimer: Emerge is a tool that analyzes the binary size and layout of your app package 😬) parses this data to give you a visual picture of how much each part contributes to binary size, and suggests linker flags that make it smaller.

A new format

When I first uploaded an app built for iOS 15 to Emerge, it showed no visualization of dyld fixups. This is because the LC_DYLD_INFO_ONLY load command was missing, replaced by LC_DYLD_CHAINED_FIXUPS and LC_DYLD_EXPORTS_TRIE.

% otool -l iOS14Example.app/iOS14Example | grep LC_DYLD
      cmd LC_DYLD_INFO_ONLY
% otool -l iOS15Example.app/iOS15Example | grep LC_DYLD
      cmd LC_DYLD_CHAINED_FIXUPS
      cmd LC_DYLD_EXPORTS_TRIE

The export data is exactly the same as before: a trie in which each node represents part of a symbol name.

The only change in iOS 15 is that the data is now referenced by a linkedit_data_command, which contains the offset of the first node. To verify this, I wrote a short Swift app that parses an iOS 15 binary and prints each symbol.

import Foundation
import MachO

// processLoadComands, readNullTerminatedString and readULEB are helpers that are not shown here.
let bytes = (try! Data(contentsOf: url) as NSData).bytes
bytes.processLoadComands { load_command, pointer in
  if load_command.cmd == LC_DYLD_EXPORTS_TRIE {
    let dataCommand = pointer.load(as: linkedit_data_command.self)
    bytes.advanced(by: Int(dataCommand.dataoff)).readExportTrie()
  }
}

extension UnsafeRawPointer {
  // Walks the trie breadth-first; child offsets are relative to the start of the trie (self)
  func readExportTrie() {
    var frontier = readNode(name: "")
    guard !frontier.isEmpty else { return }

    repeat {
      let (prefix, offset) = frontier.removeFirst()
      let children = advanced(by: Int(offset)).readNode(name: prefix)
      for (suffix, offset) in children {
        frontier.append((prefix + suffix, offset))
      }
    } while !frontier.isEmpty
  }

  // Returns an array of child nodes and their offset
  func readNode(name: String) -> [(String, UInt)] {
    guard load(as: UInt8.self) == 0 else {
      // This is a terminal node
      print("symbol name \(name)")
      return []
    }
    let numberOfBranches = UInt(advanced(by: 1).load(as: UInt8.self))
    var mutablePointer = self.advanced(by: 2)
    var result = [(String, UInt)]()
    for _ in 0..<numberOfBranches {
      result.append(
        (mutablePointer.readNullTerminatedString(),
         mutablePointer.readULEB()))
    }
    return result
  }
}
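The snippet above relies on a readULEB helper that is not shown. Child offsets in the export trie are stored as unsigned LEB128 values, so a decoder along the following lines would be needed. This is only a sketch: `decodeULEB128` is a hypothetical name, and it works on a byte array rather than on the author's UnsafeRawPointer extension.

```swift
/// Decodes one unsigned LEB128 value starting at `offset` in `bytes`.
/// Returns the value and the number of bytes consumed,
/// or nil if the input ends early or the value does not fit in 64 bits.
func decodeULEB128(_ bytes: [UInt8], at offset: Int = 0) -> (value: UInt64, length: Int)? {
    var value: UInt64 = 0
    var shift: UInt64 = 0
    var index = offset
    while index < bytes.count {
        let byte = bytes[index]
        index += 1
        guard shift < 64 else { return nil }
        // The low 7 bits carry data, least significant group first.
        value |= UInt64(byte & 0x7F) << shift
        // A clear high bit marks the last byte of the value.
        if byte & 0x80 == 0 {
            return (value, index - offset)
        }
        shift += 7
    }
    return nil
}
```

For example, the three bytes 0xE5 0x8E 0x26 decode to 624485.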

The chain

The real change is in LC_DYLD_CHAINED_FIXUPS. Prior to iOS 15, rebase, bind, and lazy bind were each stored in a separate table. They are now combined into chains, and the starting point of the chain is included in this new load command.

The app binary is divided into segments, each containing chains of fixups that can be either binds or rebases (there are no longer any lazy binds). Each 64-bit rebase[3] location in the binary now encodes both the offset it points to and the offset to the next fixup, as seen in this struct:

struct dyld_chained_ptr_64_rebase
{
    uint64_t    target    : 36,
                high8     :  8,
                reserved  :  7,    // 0s
                next      : 12,
                bind      :  1;    // Always 0 for a rebase
};

36 bits are used for the pointer target, which is enough for a binary of up to 2³⁶ bytes = 64GB, and 12 bits give the offset to the next fixup with a stride of 4 bytes. A fixup can therefore point to the next one anywhere within 2¹² × 4 = 16KB, which is exactly the size of a page on iOS.
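To make the bit layout concrete, here is a small Swift sketch that unpacks a raw 64-bit value into the fields of the struct. `ChainedRebase64` is a hypothetical type written for this example. It assumes the usual little-endian bitfield layout, where the first declared field (target) occupies the lowest bits.

```swift
/// Hypothetical Swift view of dyld_chained_ptr_64_rebase (illustration only).
struct ChainedRebase64 {
    let target: UInt64   // bits 0..<36
    let high8: UInt8     // bits 36..<44
    let next: UInt16     // bits 51..<63, counted in 4-byte strides
    let isBind: Bool     // bit 63, false for a rebase

    init(raw: UInt64) {
        target = raw & 0xF_FFFF_FFFF               // low 36 bits
        high8  = UInt8((raw >> 36) & 0xFF)
        next   = UInt16((raw >> 51) & 0xFFF)       // skips the 7 reserved bits
        isBind = ((raw >> 63) & 1) == 1
    }

    /// Distance in bytes from this fixup to the next one in the chain.
    var byteOffsetToNext: Int { Int(next) * 4 }
}

let raw: UInt64 = (UInt64(3) << 51) | (UInt64(0xAB) << 36) | 0x1234
let fixup = ChainedRebase64(raw: raw)
print(fixup.target, fixup.high8, fixup.byteOffsetToNext)
```

The largest possible value of next is 0xFFF, so the furthest reachable fixup is 0xFFF × 4 = 16380 bytes away, which stays inside a 16KB page.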

This very compact encoding means the whole chain can be walked using data that already lives in the binary's pointer slots. In my tests, the new format eliminated more than 50% of the dyld fixup metadata and reduced overall binary size, leaving only a small amount of metadata to bootstrap the first fixup on each page. The end result is a size reduction of more than 1MB for large Swift apps.

The source code for this process is in MachOLoaded.cpp, and the binary layout is in /usr/include/mach-o/fixup-chains.h.
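The traversal itself can be sketched in a few lines of Swift. This is not dyld's implementation: `loadUInt64`, `storeUInt64`, and `fixupOffsets` are hypothetical helpers, the page is modelled as a plain byte array, and the offset of the first fixup is assumed to be known already (in a real binary it comes from the metadata in LC_DYLD_CHAINED_FIXUPS).

```swift
/// Reads a little-endian UInt64 from `page` at `offset`.
func loadUInt64(_ page: [UInt8], at offset: Int) -> UInt64 {
    var raw: UInt64 = 0
    for i in 0..<8 {
        raw |= UInt64(page[offset + i]) << UInt64(8 * i)
    }
    return raw
}

/// Writes a little-endian UInt64 into `page` (used only to build sample data).
func storeUInt64(_ value: UInt64, into page: inout [UInt8], at offset: Int) {
    for i in 0..<8 {
        page[offset + i] = UInt8((value >> UInt64(8 * i)) & 0xFF)
    }
}

/// Follows one chain inside a single page and returns the byte offset of each fixup.
func fixupOffsets(inPage page: [UInt8], firstFixupOffset: Int) -> [Int] {
    var offsets: [Int] = []
    var offset = firstFixupOffset
    while offset >= 0 && offset + 8 <= page.count {
        offsets.append(offset)
        let next = Int((loadUInt64(page, at: offset) >> 51) & 0xFFF)
        if next == 0 { break }   // a zero `next` ends the chain
        offset += next * 4       // `next` counts 4-byte strides
    }
    return offsets
}

// Sample data: a 64-byte "page" with fixups at offsets 8, 24 and 40.
var page = [UInt8](repeating: 0, count: 64)
storeUInt64(UInt64(4) << 51 | 0x1000, into: &page, at: 8)   // next = 4 strides = 16 bytes
storeUInt64(UInt64(4) << 51 | 0x2000, into: &page, at: 24)
storeUInt64(0x3000, into: &page, at: 40)                     // next = 0, end of chain
print(fixupOffsets(inPage: page, firstFixupOffset: 8))       // prints [8, 24, 40]
```

Each step reads only the slot it is about to fix up, which is why no separate per-fixup table is needed.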

Order matters

To understand the motivation behind the new fixup format, we must look at one of the most expensive operations during app launch: the page fault. When an app accesses code during launch, that code has to be brought from the file on disk into memory via a page fault. Each 16KB range in the app binary is mapped to a page in memory. Once a page has been modified, it needs to stay in RAM for as long as the app is running (this is called a dirty page). iOS optimizes dirty pages by compressing those that have not been used recently.

Fixups at app launch modify pointers in the app binary's pages, so every page they touch is inevitably marked dirty. Let’s see how many pages need fixups when the app launches.

% xcrun dyldinfo -rebase Snapchat.app/Snapchat > rebases
% ruby -e 'puts IO.read("rebases").split("\n").drop(2).map { |a| a.split(" ")[2].to_i(16) / 16384 }.uniq.count'
1554
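% xcrun dyldinfo -bind Snapchat.app/Snapchat > binds
% ruby -e 'puts IO.read("binds").split("\n").drop(2).map { |a| a.split(" ")[2].to_i(16) / 16384 }.uniq.count'
450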

When fixup data is stored in tables, all rebases have to be processed first, followed by all binds. This means that rebasing causes a large number of page faults and ends up being mostly I/O bound[4]. Binds, on the other hand, touch only about 30% as many pages as rebases do (450 versus 1,554).
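The page counts above come from dividing each fixup address by the 16KB page size and counting the distinct results. The same calculation as a Swift sketch, where `touchedPageCount` and the sample addresses are chosen only for illustration:

```swift
let pageSize: UInt64 = 16 * 1024  // 16KB pages, as on iOS

/// Number of distinct pages that contain at least one of the given addresses.
func touchedPageCount(_ addresses: [UInt64]) -> Int {
    return Set(addresses.map { $0 / pageSize }).count
}

// The first two addresses fall on the same page, the third on another one.
let sample: [UInt64] = [0x10748C0C8, 0x10748C0D0, 0x107595A70]
print(touchedPageCount(sample))  // prints 2
```

Every page counted this way is a page that has to be faulted in and dirtied during launch.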

Now, in iOS 15, chained fixups group all the changes for each memory page together. dyld can process them much faster, performing both rebases and binds in a single pass over memory. This allows operating system features such as the memory compressor to take advantage of the information in chained fixups, without having to go back and decompress old pages during the bind phase. As a result of these changes, the rebase function in dyld becomes a no-op.

Overall, this change mainly affects those who reverse engineer iOS apps or explore the details of the dynamic linker, but it is a good reminder of the low-level memory management that affects your app's performance. While this change only takes effect when you target iOS 15, keep in mind that there is still a lot you can do to optimize your app's startup time:

  • Reduce the number of dynamic frameworks
  • Reduce the size of the app so that it uses fewer memory pages (that’s why the author built Emerge!)
  • Remove code from +load and static initializers
  • Use fewer classes
  • Defer the work until after the first frame is drawn

[1] The symbols printed by dyldinfo are mangled; you can use xcrun swift-demangle '_$sSHMp' to get a human-readable name.

[2] Exports are the other half of binds. A binary binds to symbols exported by its dependencies.

[3] The same is true of binds. A pointer is actually a union of a rebase and a bind (dyld_chained_ptr_64_bind), with a single bit to distinguish the two. Binds also require a table of imported symbol names, which is not discussed here.

[4] asciiwwdc.com/2016/sessio…

How iOS 15 makes your app launch faster