Summary of basic principles of iOS

Dyld overview

  • dyld is short for “the dynamic link editor”;
  • It is Apple’s dynamic linker;
  • It is an important part of Apple’s operating systems;
  • After an application has been compiled and packaged into an executable (that is, a Mach-O file), it is handed over to dyld, which is responsible for linking and loading it.
  • Dyld is involved throughout App startup, including loading the main program and its dependent libraries. Anyone who wants to optimize startup performance will inevitably have to deal with dyld.
  • Dyld is open source, so its source code can be downloaded from Apple’s official site and read.

Dyld 1.0 (1996-2004)

  • dyld 1 was included in NeXTStep 3.3; before that, NeXT used static binaries. It did not work very well.
  • dyld 1 was written before C++ dynamic libraries were widely used on the system. C++ has many features, such as initializers, that work well in a static environment but can degrade performance in a dynamic one. A large C++ dynamic library could therefore cause dyld to do a lot of work and slow down startup.
  • Before the release of macOS 10.0 (Cheetah), another feature was added: prebinding. Prebinding tries to find a fixed address for every dylib on the system and for the app. Dyld tries to load everything at those addresses; if it succeeds, it edits all of the dylib and program binaries to store the precomputed results. The next time everything is loaded at the same addresses, no extra work is needed, which makes startup much faster. However, this also means the binaries are edited on every launch, which is undesirable, at least from a security standpoint.

Dyld 2 (2004-2017)

Dyld 2 has gone through several iterations since its release in 2004, and some of the features we see today, such as ASLR, code signing, and the shared cache, were introduced in dyld 2.

Dyld 2.0 (2004-2007)

  • Dyld 2 was released in macOS Tiger in 2004

  • Dyld 2 is a complete rewrite of dyld 1. It correctly supports C++ initializer semantics, which required extending the Mach-O format and updating dyld, and this made efficient C++ dynamic libraries possible.

  • Dyld 2 has a complete, semantically correct implementation of dlopen and dlsym (used mainly to load libraries and call functions dynamically), so the older APIs were deprecated

    • dlopen: Opens a library and gets a handle
    • dlsym: Finds the value of a symbol in an open library
    • dlclose: Closes the handle.
    • dlerror: Returns a string describing the last error from a call to dlopen, dlsym, or dlclose.
  • Dyld 2 was designed for startup speed, so it performed only limited sanity checks, mainly because there were fewer malicious programs at the time.

  • Dyld 2 also had some security issues, so some features were later reworked to improve dyld’s security on the platform.

  • Because startup speed increased so much, the amount of prebinding could be reduced. Unlike dyld 1, which edited program binaries, dyld 2 prebinds only the system libraries, and only during software updates. That is why you might see a message like “optimizing system performance” during a software update: it is prebinding at update time. Today that message is used for all sorts of optimizations, but at the time prebinding was its purpose. That is dyld 2.
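The dlopen, dlsym, dlclose and dlerror calls listed earlier in this section can be tried out without writing C. The sketch below is only an illustration: it goes through Python’s ctypes module, which on POSIX systems (macOS, Linux) loads libraries with dlopen and resolves names with dlsym. The helper name call_strlen is made up for this example, and it assumes the C library’s strlen is visible from the running process, which is normally the case.

```python
import ctypes


def call_strlen(text: bytes) -> int:
    # CDLL(None) corresponds to dlopen(NULL, ...): a handle covering the
    # images already loaded into this process (POSIX only).
    handle = ctypes.CDLL(None)
    # Attribute access performs the dlsym-style lookup of "strlen".
    strlen = handle.strlen
    # Declare the C signature so ctypes converts arguments correctly.
    strlen.argtypes = [ctypes.c_char_p]
    strlen.restype = ctypes.c_size_t
    return strlen(text)


if __name__ == "__main__":
    print(call_strlen(b"dyld"))
```

Looking up a name that no loaded image exports raises AttributeError in ctypes, which is the Python-level counterpart of dlsym returning NULL and dlerror describing the failure.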

Dyld 2.x (2007-2017)

  • A number of improvements were made between 2004 and 2017, and dyld 2 performance improved significantly

  • First, many architectures and platforms were added.

    • Since dyld 2 was released on PowerPC, support has been added for x86, x86_64, arm, arm64, and many derivative platforms.
    • iOS, tvOS, and watchOS were also launched, all of which required new dyld functionality
  • Security was increased in a number of ways

    • Code signing was added
    • ASLR (Address Space Layout Randomization): each time a library is loaded, it may be at a different address
    • Bounds checking: bounds checks were added for the Mach-O header to avoid the injection of malicious binary data
  • Performance was enhanced

    • Prebinding was eliminated and replaced by the shared cache

ASLR

  • ASLR is a computer security technique that defends against the exploitation of memory corruption vulnerabilities. It randomizes where a process’s key data areas are placed in its address space, so that an attacker cannot reliably jump to a specific location in memory
  • Linux added ASLR in kernel version 2.6.12
  • Apple introduced randomized addresses for some libraries in Mac OS X Leopard 10.5 (released October 2007), but that implementation did not provide the full protection that ASLR defines. Mac OS X Lion 10.7 provides ASLR support for all applications.
  • Apple introduced ASLR on iOS in iOS 4.3.

Bounds checking

  • Bounds checks were added for many parts of the Mach-O header, which helps prevent the injection of malicious binary data
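To make the idea of header bounds checking concrete, here is a minimal sketch, not dyld’s actual code. It assumes a little-endian 64-bit Mach-O image, whose header is eight 32-bit fields followed by load commands that each start with a command type and a size. The function and helper names (check_load_commands, make_sample) are made up for illustration, and a real loader validates far more than this.

```python
import struct

MH_MAGIC_64 = 0xFEEDFACF                    # magic number of a 64-bit Mach-O file
HEADER_FMT = "<8I"                          # mach_header_64: eight 32-bit fields
HEADER_SIZE = struct.calcsize(HEADER_FMT)   # 32 bytes


def check_load_commands(data: bytes) -> int:
    """Return the number of load commands, or raise ValueError if any
    header field points outside the buffer."""
    if len(data) < HEADER_SIZE:
        raise ValueError("file shorter than a Mach-O header")
    fields = struct.unpack_from(HEADER_FMT, data, 0)
    magic, ncmds, sizeofcmds = fields[0], fields[4], fields[5]
    if magic != MH_MAGIC_64:
        raise ValueError("not a little-endian 64-bit Mach-O file")
    end = HEADER_SIZE + sizeofcmds
    if end > len(data):
        raise ValueError("load commands extend past end of file")
    offset = HEADER_SIZE
    for _ in range(ncmds):
        if offset + 8 > end:
            raise ValueError("load command header out of bounds")
        _cmd, cmdsize = struct.unpack_from("<2I", data, offset)
        # A command must at least hold its own two fields and must not
        # claim bytes beyond the declared load-command area.
        if cmdsize < 8 or offset + cmdsize > end:
            raise ValueError("load command size out of bounds")
        offset += cmdsize
    return ncmds


def make_sample(claimed_size: int) -> bytes:
    # Hypothetical test image: a header plus one 16-byte load command
    # whose size field is set to whatever the caller claims.
    header = struct.pack(HEADER_FMT, MH_MAGIC_64, 0, 0, 2, 1, 16, 0, 0)
    command = struct.pack("<2I", 0x1, claimed_size) + b"\0" * 8
    return header + command
```

A well-formed sample such as make_sample(16) passes, while make_sample(4096), whose single command claims to be larger than the file, is rejected before any of its contents are trusted.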

Shared cache

  • The shared cache was first introduced in iOS 3.1 and macOS Snow Leopard as a complete replacement for prebinding

  • The shared cache is a single file that contains most of the system dylibs; because they are merged into a single file, they can be optimized together.

    • It rearranges all of the text segments (__TEXT) and data segments (__DATA) and rewrites the entire symbol table to reduce the file size, so that only a few regions need to be mapped into each process. It also allows binary segments to be packed together, saving a lot of RAM
    • It is essentially a dylib prelinker. The RAM savings are significant: roughly 500 MB to 1 GB of memory when running ordinary iOS applications
    • It can also pregenerate data structures that dyld and Objective-C use at run time, so that work does not have to be done at startup, which saves more RAM and time
  • On macOS the shared cache is generated locally by running the dyld shared cache update, which greatly optimizes system performance

Dyld 2 workflow

Dyld 2 runs purely in-process, that is, inside the application’s own process, which means dyld 2 can only start working once the application has been launched.

The dyld 2 workflow is as follows:

  • 1. Dyld initialization starts in dyldbootstrap::start, which then calls dyld::_main. dyld::_main contains a lot of code and is the core of dyld’s loading logic.

  • 2. Check and prepare the environment: obtain the binary path, check the environment configuration, and parse the image header of the main binary.

  • 3. Instantiate the image loader for the main binary and verify that the main binary and dyld versions match.

  • 4. Check whether the shared cache is mapped. If not, map the shared cache first.

  • 5. Check DYLD_INSERT_LIBRARIES and load the inserted dynamic libraries (instantiating an image loader for each).

  • 6. Link: all dependent dynamic libraries are loaded recursively first (the dependencies are sorted so that a library always comes before the images that depend on it), and rebasing and symbol binding are performed at this stage.

  • 7. Run initializers: Objective-C +load methods and C/C++ constructor functions both execute at this stage.

  • 8. Read the LC_MAIN load command of the Mach-O file to obtain the entry address of the program and call the main function.
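The ordering rule in the link step (dependencies always come before the images that need them) can be sketched as a depth-first, post-order walk of the dependency graph. This is an illustration of the ordering only, not dyld’s implementation; the function name load_order and the simplified library graph in DEPS are assumptions made up for the example.

```python
def load_order(root, deps):
    """Return images in an order where every dependency precedes the
    image that needs it (depth-first, post-order)."""
    order, seen = [], set()

    def visit(image):
        if image in seen:          # already handled, or part of a cycle
            return
        seen.add(image)
        for dep in deps.get(image, []):
            visit(dep)
        order.append(image)        # emitted only after all its dependencies

    visit(root)
    return order


# Hypothetical, heavily simplified dependency graph.
DEPS = {
    "App": ["UIKit", "Foundation"],
    "UIKit": ["Foundation", "libobjc"],
    "Foundation": ["libobjc", "libSystem"],
    "libobjc": ["libSystem"],
    "libSystem": [],
}

if __name__ == "__main__":
    print(load_order("App", DEPS))
```

With this graph the walk yields libSystem, libobjc, Foundation, UIKit, App. Initializers run in the same kind of order, so a library is ready before anything that depends on it starts running.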

Simplified version

  • ① Parse the Mach-O file to find its dependent libraries, then recursively find all of their dependencies to form a dependency graph of dynamic libraries. Most apps on iOS depend on hundreds of dynamic libraries (most of them system libraries), so this step involves a lot of work.
  • ② Map the Mach-O files into the process’s address space
  • ③ Perform symbol lookups
  • ④ Rebase and bind: because the app is loaded at a randomized address (ASLR), all pointers need to be adjusted by the base address
  • ⑤ Run the initializers, and then call the main() function
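The rebase and bind step can be pictured with a few lines of arithmetic. The sketch below is a toy model under stated assumptions: addresses are plain integers, the page size is taken as 0x1000, and the names pick_slide, rebase and bind are made up for illustration rather than taken from dyld.

```python
import random

PAGE = 0x1000  # assumed page size for this illustration


def pick_slide(rng, max_pages=0x1000):
    # ASLR: a random, page-aligned offset chosen for each launch.
    return rng.randrange(1, max_pages) * PAGE


def rebase(pointers, slide):
    # Rebase: internal pointers were written assuming the preferred load
    # address, so each one is shifted by the slide.
    return [p + slide for p in pointers]


def bind(imports, exported):
    # Bind: each imported symbol name is resolved to the address that an
    # already-loaded library exports for it (KeyError if it is missing).
    return {name: exported[name] for name in imports}


if __name__ == "__main__":
    slide = pick_slide(random.Random())
    print(hex(slide), [hex(p) for p in rebase([0x100004000], slide)])
```

For example, with a slide of 0x5000 a stored pointer 0x100004000 becomes 0x100009000. Rebasing fixes pointers into the image itself, while binding fills in pointers to symbols that live in other images.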

Dyld 3 (2017-present)

  • dyld 3 is a new dynamic linker that Apple announced at WWDC 2017. It completely changes the concept of dynamic linking and became the default for most macOS system applications. In the 2017 Apple OS releases, dyld 3 is used by default for all Apple system applications.
  • dyld 3 first appeared in iOS 11 in 2017, where it was used mainly to optimize the system libraries.
  • In iOS 13, the system fully adopts dyld 3 in place of dyld 2. Because dyld 3 is fully compatible with dyld 2 and the API is the same, developers in most cases do not need to do any extra adaptation for a smooth transition.

Why was dyld 2 redesigned as dyld 3?

The redesign of dyld was motivated mainly by the following considerations

  • Performance: improve startup speed as much as possible
  • Security: security features were added to dyld 2, but it was hard to keep up with real-world threats, and despite a lot of work this goal was difficult to achieve
  • Reliability and testability: Apple has released many good testing frameworks, such as XCTest, but these frameworks depend on low-level features of the dynamic linker to insert their libraries into a process, so they cannot be used to test the existing dyld code, which makes dyld’s security and performance hard to test

How was dyld 2 improved and optimized into dyld 3?

Opportunities for improvement and optimization

From the dyld 2 workflow above, we know how dyld 2 executes. It can be improved and optimized in the following two respects:

  • Identify the security-sensitive parts

    • Parsing Mach-O headers and finding dependencies are security-sensitive, and they are among the biggest risks.
    • A maliciously modified Mach-O header can be used to carry out certain attacks;
    • If the App uses @rpath (run-path search paths), an attacker can break the program by maliciously modifying the paths or by inserting libraries at specific locations;
  • Identify the resource-heavy parts (the cacheable parts)

    • Symbol lookup is one of them: a symbol in a particular library is always at the same offset in that library unless there is a software update or the library changes on disk (that is, symbol offsets are fixed);

Dyld 2 improvements and optimizations

This is how dyld 2 was changed to become dyld 3: the security-sensitive and resource-heavy parts are moved upstream, out of the launch path, their results are written to disk as a closure for caching, and the closure is then used inside the program’s process.

Dyld 3 components/workflow

The dyld 3 workflow is divided into three parts, as described below

Part 1: Out-of-process: Mach-O parser

The out-of-process Mach-O parser and compiler is an ordinary background daemon, so it can use the normal testing infrastructure, which improves testability.

The first part mainly does the following work outside the App process:

  • Resolves all search paths, @rpaths, and environment variables that can affect startup
  • Parses the Mach-O binaries
  • Performs the symbol lookups
  • Creates a launch closure from these results

Part 2: In-process: Engine

The in-process engine is resident in the process’s memory. With dyld 3 it can start an application without parsing Mach-O headers or performing symbol lookups; because those are the time-consuming operations, startup speed improves greatly.

The second part mainly does the following work in the App process:

  • Checks that the launch closure is valid
  • Maps in the dylibs, then jumps to the main function

Part 3: Launch closure: Cache

This is the launch closure caching service. Most program launches use a cached closure and never need to call the out-of-process Mach-O parser and compiler. A launch closure is also much simpler than Mach-O: it is a memory-mapped file that does not need complex parsing and can be validated easily and quickly.

  • Launch closures for system applications are built directly into the shared cache
  • For third-party applications, the launch closure is built during application installation or update, and again on system update, because at that point the system libraries have changed
  • By default, on iOS, tvOS and watchOS, all of this is pre-built before you run the app.
  • On macOS, because applications can be side-loaded (installed from outside the App Store), the in-process engine can, if needed, make an RPC (Remote Procedure Call) out to the daemon on first launch; after that, it can use the cached closure.

So all in all, dyld 3 handles many time-consuming lookup, computation, and I/O operations ahead of time, which greatly improves startup speed.

Launch closure

This is a newly introduced concept that refers to all the information an app needs during startup. For example, what dynamic link libraries the app uses, the offsets of each symbol, where the code signature is, and so on.
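The build-once, validate-on-launch idea behind launch closures can be sketched as follows. This is a toy model under stated assumptions, not dyld’s format: a real closure is a compact memory-mapped binary file, whereas this sketch uses JSON, and the field names and functions (fingerprint, build_closure, launch) are hypothetical.

```python
import hashlib
import json


def fingerprint(images):
    """Hash of the inputs a closure was built from (image name -> bytes)."""
    h = hashlib.sha256()
    for name in sorted(images):
        h.update(name.encode())
        h.update(hashlib.sha256(images[name]).digest())
    return h.hexdigest()


def build_closure(images, symbol_offsets):
    # "Out-of-process" step: do the expensive work once and serialize it.
    closure = {
        "dylibs": sorted(images),
        "symbol_offsets": symbol_offsets,
        "fingerprint": fingerprint(images),
    }
    return json.dumps(closure)


def launch(images, cached, rebuild):
    # "In-process" step: validate the cached closure cheaply and fall
    # back to rebuilding only when it is missing or stale.
    if cached is not None:
        closure = json.loads(cached)
        if closure["fingerprint"] == fingerprint(images):
            return closure, "cache hit"
    return json.loads(rebuild(images)), "rebuilt"
```

As long as neither the app nor its libraries change, every launch takes the cheap validation path; an update to any image changes the fingerprint and forces a rebuild, which mirrors why closures are regenerated at install or update time.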

Missing symbols in dyld 3

  • Dyld 2 uses lazy symbol binding by default

  • In dyld 3, the result of symbol resolution is already in the launch closure before the app starts, so lazy symbols are no longer needed.

  • If a symbol is missing, dyld 2 and dyld 3 behave differently

    • In dyld 2, the App crashes when the missing symbol is first called
    • In dyld 3, a missing symbol causes the App to crash as soon as it starts
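The difference in when a missing symbol shows up can be simulated in a few lines. This is a simplified model with hypothetical names (MissingSymbol, launch_lazy, launch_eager); it represents a crash as a Python exception and does not reflect how dyld is actually implemented.

```python
class MissingSymbol(Exception):
    """Stands in for the crash caused by an unresolved symbol."""


def resolve(name, exports):
    if name not in exports:
        raise MissingSymbol(name)
    return exports[name]


def launch_lazy(imports, exports, calls):
    """dyld 2 style: a symbol is bound only the first time it is called."""
    bound, log = {}, ["launched"]
    for name in calls:
        if name not in bound:
            bound[name] = resolve(name, exports)   # may fail here, mid-run
        log.append(f"called {name}")
    return log


def launch_eager(imports, exports, calls):
    """dyld 3 style: every import is resolved before main runs."""
    bound = {name: resolve(name, exports) for name in imports}  # may fail before launch
    log = ["launched"]
    for name in calls:
        _address = bound[name]     # already resolved, nothing to look up
        log.append(f"called {name}")
    return log
```

With imports `_a` and `_gone`, where only `_a` is exported, a run that calls only `_a` completes under launch_lazy but fails immediately under launch_eager. A code path that rarely runs can therefore hide a missing symbol under lazy binding, while eager resolution surfaces it on the first launch.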

Conclusion

  • Dyld 2 workflow

    • Parse the Mach-O headers
    • Find dependent libraries
    • Map the Mach-O files into the address space
    • Perform symbol lookups
    • Rebase and bind, to account for ASLR
    • Run all initializers
    • Execute main
  • Dyld 3 workflow

    • Out-of-process: Mach-O header parsing and symbol lookups are moved from dyld 2’s in-process path to out-of-process execution, and the results are written to disk as a launch closure
    • In-process: validate the launch closure, map in the dylibs, and execute the main function
    • Launch closure caching service
