The two most important days in your life are the day you are born and the day you find out why. – Mark Twain

“Who are you? Where do you come from? Where are you going?” These three philosophical questions are ones that everyone is constantly answering. We can ask the same of our apps: how does it happen that after Code, Build, and Run, a live app pops up on the screen? How much happens between the user tapping the app icon and the execution of main()? Exploring app startup helps us understand app development itself more deeply.

The following figure shows the key nodes of the App startup process:

App Startup Process

Let’s interpret them one by one.

1. Composition of App files

Before going into the startup process in detail, we need to take a look at the executable file of an iOS/OS X app.

An application usually goes through several steps ("compile", "link", "package") to produce something that can run on a platform. Application files exist in different formats on different platforms, such as EXE on Windows, APK on Android, and IPA on iOS.

iOS grew out of OS X, which was itself a fusion of NeXTSTEP and the classic Mac OS. Many features of iOS and OS X therefore derive from NeXTSTEP, such as Objective-C, Cocoa, Mach, and Xcode, as well as the bundle used to package applications and libraries. Apple's documentation describes a bundle as a directory with a standardized hierarchical structure that holds executable code and the resources used by that code. Put simply, it is a package.

The bundle structure of OS X apps is slightly different from that of iOS apps. OS X apps follow a fairly standard hierarchy, while iOS apps are comparatively loose. In addition, unlike on OS X, only Apple's native apps can live in the /Applications directory on iOS; apps purchased from the App Store are installed under /var/mobile/Applications. OS X apps are beyond the scope of this article, so let's look at the bundle hierarchy of an iOS app:

xxx.app is our application bundle. It mainly contains the executable file (xxx.app/xxx, where xxx is the name of the application), NIB files, images, and other resources. Let's focus on the hero of this section: Mach-O.

1.1 Universal Binary

Most of the time, the xxx.app/xxx file is not a plain Mach-O file. Because iOS devices with different CPU architectures must be supported, the compiled and packaged executable is in Universal Binary (fat binary) format. A universal binary is simply several Mach-O files for different architectures packed together, with a fat header at the beginning of the file that lists the architecture, offset, and size of each Mach-O file it contains.

The fat header data structures are defined in the header file <mach-o/fat.h>:

#define FAT_MAGIC   0xcafebabe
#define FAT_CIGAM  0xbebafeca  /* NXSwapLong(FAT_MAGIC) */
struct fat_header {
    uint32_t    magic;        /* FAT_MAGIC */
    uint32_t    nfat_arch;    /* number of structs that follow */
};

struct fat_arch {
    cpu_type_t  cputype;  /* cpu specifier (int) */
    cpu_subtype_t   cpusubtype;   /* machine specifier (int) */
    uint32_t    offset;       /* file offset to this object file */
    uint32_t    size;     /* size of this object file */
    uint32_t    align;        /* alignment as a power of 2 */
};

struct fat_header:

1). The magic field is a magic number (UNIX ELF files have one too) that the loader uses to determine what kind of file it is. For a fat binary it is 0xcafebabe;

2). The nfat_arch field indicates how many Mach-O files of different architectures the fat binary contains.

The fat_header is followed by nfat_arch fat_arch structures, each describing one Mach-O file: its CPU architecture, file offset, size, and alignment.
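To make the layout concrete, here is a minimal sketch that reads a fat header out of a byte buffer. It assumes the buffer holds the first bytes of the file and relies on the fact that the fat header fields are stored big-endian on disk. The names fat_arch_info, read_be32, and parse_fat are hypothetical helpers invented for this example, not part of <mach-o/fat.h>.

```c
#include <stddef.h>
#include <stdint.h>

#define FAT_MAGIC 0xcafebabeu

/* Hypothetical mirror of struct fat_arch, already converted to host byte order. */
struct fat_arch_info {
    int32_t  cputype;
    int32_t  cpusubtype;
    uint32_t offset;
    uint32_t size;
    uint32_t align;
};

/* Read a 32-bit big-endian value; fat headers are big-endian on disk. */
static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/*
 * Parse up to max_out fat_arch entries from buf.
 * Returns the number of entries written, or -1 if buf does not start
 * with a plausible fat header.
 */
int parse_fat(const uint8_t *buf, size_t len,
              struct fat_arch_info *out, int max_out)
{
    /* fat_header is 8 bytes: magic + nfat_arch */
    if (len < 8 || read_be32(buf) != FAT_MAGIC)
        return -1;

    uint32_t nfat_arch = read_be32(buf + 4);

    /* Each fat_arch is 20 bytes; reject counts that overrun the buffer. */
    if (nfat_arch > (len - 8) / 20)
        return -1;

    int count = 0;
    for (uint32_t i = 0; i < nfat_arch && count < max_out; i++) {
        const uint8_t *p = buf + 8 + (size_t)i * 20;
        out[count].cputype    = (int32_t)read_be32(p);
        out[count].cpusubtype = (int32_t)read_be32(p + 4);
        out[count].offset     = read_be32(p + 8);
        out[count].size       = read_be32(p + 12);
        out[count].align      = read_be32(p + 16);
        count++;
    }
    return count;
}
```

Each returned entry tells you where one architecture's Mach-O file starts (offset) and how long it is (size), which is all a loader needs to pick the slice matching the current CPU.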

You can use the file command to view information about an executable file. Take Sina Weibo as an example:

PS: It is only "most of the time" because some apps have such complex business logic and so much code that supporting multiple CPU architectures would make the IPA package very large, so they do not add support for newer CPU architectures. QQ and WeChat are examples:

PS: In QQ v5.5.1, the single Mach-O file is 51 MB.

1.2 Mach-O

Although iOS and OS X use Darwin, a Unix-like operating system core that is compatible with UNIX standards, their executable files do not use ELF, the format common on UNIX systems. Instead they keep a format of their own: Mach-O (short for Mach Object). Mach-O is a legacy of NeXTSTEP, and its file format is as follows:

From the figure above, we can see that the Mach-O file mainly contains the following three data areas:

(1). Header: the data structure of the Mach-O header is defined in the header file <mach-o/loader.h>:

/*
 * The 32-bit mach header appears at the very beginning of the object file for
 * 32-bit architectures.
 */
struct mach_header {
    uint32_t    magic;        /* mach magic number identifier */
    cpu_type_t  cputype;  /* cpu specifier */
    cpu_subtype_t   cpusubtype;   /* machine specifier */
    uint32_t    filetype; /* type of file */
    uint32_t    ncmds;        /* number of load commands */
    uint32_t    sizeofcmds;   /* the size of all the load commands */
    uint32_t    flags;        /* flags */
};

/* Constant for the magic field of the mach_header (32-bit architectures) */
#define    MH_MAGIC    0xfeedface  /* the mach magic number */
#define MH_CIGAM   0xcefaedfe  /* NXSwapInt(MH_MAGIC) */

The code above is the 32-bit header. The same header file also defines the 64-bit header, mach_header_64. The two are almost identical: mach_header_64 only adds a reserved field, uint32_t reserved;, which is not currently used. Note that 64-bit Mach-O files use a different magic value: #define MH_MAGIC_64 0xfeedfacf.
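As an illustration of how the magic number drives everything that follows, here is a small sketch that classifies a 32-bit magic value. It assumes the value was read from the first four bytes of the file in host byte order. The image_kind enum and the two functions are hypothetical names for this example; the magic constants are the ones from the Mach-O headers.

```c
#include <stdint.h>

#define FAT_MAGIC    0xcafebabeu
#define FAT_CIGAM    0xbebafecau
#define MH_MAGIC     0xfeedfaceu
#define MH_CIGAM     0xcefaedfeu
#define MH_MAGIC_64  0xfeedfacfu
#define MH_CIGAM_64  0xcffaedfeu

/* Hypothetical classification used only in this sketch. */
enum image_kind { IMG_UNKNOWN, IMG_FAT, IMG_MACHO_32, IMG_MACHO_64 };

/*
 * Classify a magic value. *swapped is set to 1 when the file's byte
 * order is the opposite of the host's (the "CIGAM" variants), else 0.
 */
enum image_kind classify_magic(uint32_t magic, int *swapped)
{
    *swapped = 0;
    switch (magic) {
    case FAT_CIGAM:   *swapped = 1; /* fall through */
    case FAT_MAGIC:   return IMG_FAT;
    case MH_CIGAM:    *swapped = 1; /* fall through */
    case MH_MAGIC:    return IMG_MACHO_32;
    case MH_CIGAM_64: *swapped = 1; /* fall through */
    case MH_MAGIC_64: return IMG_MACHO_64;
    default:          return IMG_UNKNOWN;
    }
}

/*
 * Size of the header that precedes the load commands:
 * seven 32-bit fields for mach_header, plus "reserved" for mach_header_64.
 */
unsigned mach_header_size(enum image_kind kind)
{
    switch (kind) {
    case IMG_MACHO_32: return 7 * 4;
    case IMG_MACHO_64: return 8 * 4;
    default:           return 0;
    }
}
```

The header size matters because the load commands begin immediately after the header, so a parser has to know which of the two structures it is looking at before it can find them.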

(2). Load commands:

The mach_header is followed by the load commands. While a Mach-O file is being loaded and parsed, the kernel loader or the dynamic linker reads these commands, which tell it how to set up and load the corresponding segments of binary data. Every load command starts with the following structure:

struct load_command {
    uint32_t cmd;     /* type of load command */
    uint32_t cmdsize; /* total size of command in bytes */
};

Today OS X/iOS has more than 40 load commands. Some are used directly by the kernel loader, while others are handled by the dynamic linker. The main load commands include LC_SEGMENT, LC_LOAD_DYLINKER, LC_UNIXTHREAD, and LC_MAIN. They are not detailed here; the header file carries brief annotations, and they come up again in the kernel discussion later.

PS:

  • otool is a command-line tool for inspecting Mach-O files, similar to ldd or readelf on Linux.

  • MachOView is a visual tool for viewing Mach-O files.
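Because every load command begins with the same cmd/cmdsize pair, the whole list can be walked without understanding any individual command. The following is a minimal sketch of that walk, assuming the load commands have already been copied into a buffer and are in host byte order. list_load_commands is a hypothetical helper, and the struct is redeclared locally so the example stands alone.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same layout as the load_command structure shown above. */
struct load_command {
    uint32_t cmd;      /* type of load command */
    uint32_t cmdsize;  /* total size of command in bytes */
};

/*
 * Walk ncmds load commands stored in cmds[0..sizeofcmds).
 * The type of each command is written to out_cmds (at most max_out entries).
 * Returns the number of commands walked, or -1 if a command is malformed
 * or runs past the end of the buffer.
 */
int list_load_commands(const uint8_t *cmds, size_t sizeofcmds, uint32_t ncmds,
                       uint32_t *out_cmds, int max_out)
{
    size_t offset = 0;

    for (uint32_t i = 0; i < ncmds; i++) {
        struct load_command lc;

        /* There must be room for at least the 8-byte command header. */
        if (sizeofcmds - offset < sizeof lc)
            return -1;

        /* memcpy avoids unaligned reads from the raw buffer. */
        memcpy(&lc, cmds + offset, sizeof lc);

        /* cmdsize covers the header itself and must stay inside the buffer. */
        if (lc.cmdsize < sizeof lc || lc.cmdsize > sizeofcmds - offset)
            return -1;

        if ((int)i < max_out)
            out_cmds[i] = lc.cmd;

        offset += lc.cmdsize;
    }
    return (int)ncmds;
}
```

The explicit bounds checks are a choice made for this user-space sketch; they keep a corrupt ncmds or cmdsize from walking off the end of the buffer.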

(3). Raw segment data

The raw segment data is the largest part of a Mach-O file. It holds the data that the load commands describe, each region located by its file offset, size, and virtual address. A Mach-O file generally has multiple segments, each with a different purpose. They usually include:

1).__PAGEZERO: the NULL pointer trap segment, mapped to the first page of the virtual address space and used to catch NULL pointer dereferences;

2).__TEXT: contains executable code and other read-only data. Its memory protection is VM_PROT_READ and VM_PROT_EXECUTE, which prevents it from being modified in memory;

3).__DATA: contains program data and is writable;

4).__OBJC: data used by the Objective-C runtime support library;

5).__LINKEDIT: symbol tables and other tables used by the linker.

Segments and sections: a segment name is written in all uppercase letters prefixed with two underscores (for example __TEXT), while a section name is written in all lowercase letters prefixed with two underscores (for example __text). For more on the common sections, see developer.apple.com/library/mac…

2. The kernel

Now that we understand the app's executable file, let's follow the source code to see what chain of kernel calls an app goes through before it reaches the program entry point, main().

2.1 XNU open source code

Although the XNU kernel is open source, the published source is limited to OS X; the XNU kernel of iOS has always been closed. Historically, however, iOS is a branch of OS X. The main differences between the two are the target architecture (iOS targets ARM rather than the Intel i386 and x86_64 of OS X), memory management, and system security restrictions. The executable files on both are Mach-O. This article therefore assumes that the two do not differ much in how an app is started and executed.

The XNU version referred to in this article is 2782.1.97.

2.2 Kernel invocation process

The kernel's flow for launching an executable is shown below:

The process for starting a process

Quoted from Mac OS X and iOS Internals: To the Apple’s Core, p. 555

The call tree corresponding to the source code is:

PS: Because there is a lot of source code and space is limited, only the key code is quoted, with brief comments. My own comments are prefixed with oncenote.

// oncenote: /bsd/kern/kern_exec.c line: 2615
execve(proc_t p, struct execve_args *uap, int32_t *retval)
{
    __mac_execve(proc_t p, struct __mac_execve_args *uap, int32_t *retval)
    {
        // oncenote: /bsd/kern/kern_exec.c line: 2654
        // oncenote: /bsd/kern/kern_exec.c line:
        exec_activate_image(struct image_params *imgp)
        {
            // oncenote: /bsd/kern/kern_exec.c line: 1328
            // Traverse the execsw table of execution formats and call the matching ex_imgact function
            for (i = 0; error == -1 && execsw[i].ex_imgact != NULL; i++) {
                // 1. For a Mach-O binary, call exec_mach_imgact
                // 2. For a fat binary, call exec_fat_imgact
                // 3. For an interpreter script, call exec_shell_imgact
                // Since only the Mach-O format is actually executed, exec_fat_imgact and exec_shell_imgact eventually lead to exec_mach_imgact
                // A return value of 0 means the Mach-O file has been loaded and processed correctly
                error = (*execsw[i].ex_imgact)(imgp);

                // oncenote: for Mach-O, execsw[i].ex_imgact(imgp) is exec_mach_imgact(imgp)
                exec_mach_imgact(struct image_params *imgp)
                {
                    // oncenote: /bsd/kern/kern_exec.c line: 893
                    load_machfile(struct image_params *imgp, ...)
                    {
                        // oncenote: /bsd/kern/mach_loader.c line: 287

                        // oncenote: /bsd/kern/mach_loader.c line: 336
                        // Set up the memory map
                        if (create_map) {
                            vm_map_create();
                        }

                        // oncenote: /bsd/kern/mach_loader.c line: 373
                        // Set the random offset for address space layout randomization
                        if (!(imgp->ip_flags & IMGPF_DISABLE_ASLR)) {
                            aslr_offset = random();
                        }

                        // oncenote: /bsd/kern/mach_loader.c line: 392
                        parse_machfile(struct vnode *vp, ...)
                        {
                        }
                    }

                    // oncenote: /bsd/kern/kern_exec.c line: 973
                    if (load_result.unixproc) {
                        /* Set the stack */
                        thread_setuserstack(thread, ap);
                    }

                    // oncenote: /bsd/kern/kern_exec.c line: 1014
                    /* Set the entrypoint */
                    thread_setentrypoint(thread, load_result.entry_point);

                    /* Stop profiling */
                    stopprofclock(p);

                    /*
                     * Reset signal state.
                     */
                    execsigs(p, thread);
                    ...
                }
            }
        }
    }
}

Due to space constraints, this article does not expand the source code any further. The call tree above already gives a clear picture of how the kernel handles app startup. To dig deeper, download the source code and read it alongside the reference materials listed at the end of the article.
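The format dispatch at the top of the call tree (try each entry of execsw until one stops returning -1) can be imitated in user space. The sketch below is only an analogy under stated assumptions: the activators try_macho, try_fat, and try_script are hypothetical stand-ins that merely sniff the first bytes of a buffer, whereas the real exec_*_imgact functions do far more work. The table names mirror the three formats listed in the comments above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct image_params: just the file's bytes. */
struct image {
    const uint8_t *data;
    size_t         len;
};

/* First four bytes of the file, read as a big-endian word. */
static uint32_t first_word_be(const struct image *img)
{
    return ((uint32_t)img->data[0] << 24) | ((uint32_t)img->data[1] << 16) |
           ((uint32_t)img->data[2] << 8)  |  (uint32_t)img->data[3];
}

/* Each activator returns 0 if it accepts the image, -1 if it is not its format. */
static int try_macho(const struct image *img)
{
    if (img->len < 4)
        return -1;
    uint32_t m = first_word_be(img);
    /* 32-bit and 64-bit magics, in either byte order */
    if (m == 0xfeedfaceu || m == 0xcefaedfeu ||
        m == 0xfeedfacfu || m == 0xcffaedfeu)
        return 0;
    return -1;
}

static int try_fat(const struct image *img)
{
    /* The fat magic is stored big-endian on disk. */
    return (img->len >= 4 && first_word_be(img) == 0xcafebabeu) ? 0 : -1;
}

static int try_script(const struct image *img)
{
    return (img->len >= 2 && img->data[0] == '#' && img->data[1] == '!') ? 0 : -1;
}

/* A NULL-terminated table, scanned in order like execsw. */
static const struct {
    int (*ex_imgact)(const struct image *img);
    const char *ex_name;
} format_table[] = {
    { try_macho,  "Mach-o Binary" },
    { try_fat,    "Fat Binary" },
    { try_script, "Interpreter Script" },
    { NULL, NULL }
};

/* Returns the name of the first format that accepts the image, or NULL. */
const char *activate_image(const struct image *img)
{
    int error = -1;
    const char *matched = NULL;

    for (int i = 0; error == -1 && format_table[i].ex_imgact != NULL; i++) {
        error = format_table[i].ex_imgact(img);
        if (error == 0)
            matched = format_table[i].ex_name;
    }
    return matched;
}
```

The loop condition is the part taken from the quoted source: iteration stops as soon as an activator returns anything other than -1, so at most one format handles a given image.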

2.3 Loading and parsing the Mach-O file

The previous section described how the kernel launches an executable. This section explores how the kernel loads and parses the Mach-O file.

The function load_machfile() loads the Mach-O file and then calls parse_machfile() to parse it. load_machfile() has little logic of its own, so parse_machfile() holds the core logic for loading and parsing Mach-O files. Before reading the code, note three points about how parse_machfile() works:

First, parse_machfile() parses recursively, with an initial recursion depth of 0 and a maximum depth of 6 to prevent infinite recursion. Recursion is needed because Mach-O files of different types depend on each other. For example, parsing a Mach-O file of the executable type (MH_EXECUTE) requires load_dylinker() to handle the load command LC_LOAD_DYLINKER, and the dynamic linker is itself a Mach-O file, so it is parsed by a recursive call at a different depth.

Second, when parse_machfile() parses the load commands, it divides the commands the kernel cares about into three groups, in loading order. In the code this is three passes over the commands, and each pass handles only the commands that belong to it: (1) thread state, UUID, and code signature, with the commands LC_UNIXTHREAD, LC_MAIN, LC_UUID, and LC_CODE_SIGNATURE; (2) segments, with the commands LC_SEGMENT and LC_SEGMENT_64; (3) the dynamic linker and encryption information, with the commands LC_ENCRYPTION_INFO, LC_ENCRYPTION_INFO_64, and LC_LOAD_DYLINKER.

Finally, the entry point. After parsing a Mach-O file of the executable type (call it A), we have the entry point of A, but the thread does not jump to it right away, because the dynamic linker (dyld) still has to be loaded. In load_dylinker(), after the recursive call to parse_machfile(), the thread's entry point is set to the entry point of dyld, and dyld keeps track of the entry point of A. Once dyld has finished loading the libraries, control is transferred to the entry point of A and the program starts.
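The first two points can be condensed into two small lookup functions. This is a sketch under stated assumptions, not the kernel's code: filetype_allowed_at_depth and pass_for_command are hypothetical names, depth is the value after parse_machfile() has incremented it (so the main executable is at depth 1), and the constants are copied from <mach-o/loader.h> so that the example stands alone.

```c
#include <stdint.h>

/* File types, as defined in <mach-o/loader.h>. */
#define MH_OBJECT    0x1u
#define MH_EXECUTE   0x2u
#define MH_FVMLIB    0x3u
#define MH_PRELOAD   0x5u
#define MH_DYLIB     0x6u
#define MH_DYLINKER  0x7u

/* Load commands, as defined in <mach-o/loader.h>. */
#define LC_REQ_DYLD            0x80000000u
#define LC_SEGMENT             0x1u
#define LC_UNIXTHREAD          0x5u
#define LC_LOAD_DYLINKER       0xeu
#define LC_SEGMENT_64          0x19u
#define LC_UUID                0x1bu
#define LC_CODE_SIGNATURE      0x1du
#define LC_ENCRYPTION_INFO     0x21u
#define LC_MAIN                (0x28u | LC_REQ_DYLD)
#define LC_ENCRYPTION_INFO_64  0x2cu

/* Returns 1 if a file of this type may be parsed at this depth, else 0. */
int filetype_allowed_at_depth(uint32_t filetype, int depth)
{
    switch (filetype) {
    case MH_OBJECT:
    case MH_EXECUTE:
    case MH_PRELOAD:
        return depth == 1;   /* only as the top-level image */
    case MH_FVMLIB:
    case MH_DYLIB:
        return depth != 1;   /* never as the top-level image */
    case MH_DYLINKER:
        return depth == 2;   /* only directly below the executable */
    default:
        return 0;            /* other file types are rejected */
    }
}

/* Returns the pass (1, 2, or 3) that handles cmd, or 0 if the kernel ignores it. */
int pass_for_command(uint32_t cmd)
{
    switch (cmd) {
    case LC_UNIXTHREAD:
    case LC_MAIN:
    case LC_UUID:
    case LC_CODE_SIGNATURE:
        return 1;            /* thread state, UUID, code signature */
    case LC_SEGMENT:
    case LC_SEGMENT_64:
        return 2;            /* segments */
    case LC_LOAD_DYLINKER:
    case LC_ENCRYPTION_INFO:
    case LC_ENCRYPTION_INFO_64:
        return 3;            /* dynamic linker, encryption information */
    default:
        return 0;
    }
}
```

For example, an MH_EXECUTE file is accepted only at depth 1 and an MH_DYLINKER file only at depth 2, which is exactly the pairing produced when the executable's LC_LOAD_DYLINKER command triggers the recursive parse of dyld.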

With this logic in mind, the most direct way to explore the parsing process is to read the source code:

// oncenote: /bsd/kern/mach_loader.c line: 483
static load_return_t
parse_machfile(
    struct vnode        *vp,
    vm_map_t            map,
    thread_t            thread,
    struct mach_header  *header,
    off_t               file_offset,
    off_t               macho_size,
    int                 depth,
    int64_t             aslr_offset,
    int64_t             dyld_aslr_offset,
    load_result_t       *result
)
{
    /*
     * Break infinite recursion
     */
    //oncenote: prevent infinite recursion
    if (depth > 6) {
        return(LOAD_FAILURE);
    }

    depth++;

    //oncenote: different depths parse different Mach-O file types; for example, the executable type MH_EXECUTE is only parsed at the first depth
    switch (header->filetype) {

    case MH_OBJECT:
    case MH_EXECUTE:
    case MH_PRELOAD:
        if (depth != 1) {
            return (LOAD_FAILURE);
        }
        break;

    case MH_FVMLIB:
    case MH_DYLIB:
        if (depth == 1) {
            return (LOAD_FAILURE);
        }
        break;

    case MH_DYLINKER:
        if (depth != 2) {
            return (LOAD_FAILURE);
        }
        break;

    default:
        return (LOAD_FAILURE);
    }

    // ...

    /*
     * Map the load commands into kernel memory.
     */
    addr = 0;
    kl_size = size;
    kl_addr = kalloc(size);
    addr = (caddr_t)kl_addr;
    if (addr == NULL)
        return(LOAD_NOSPACE);

    error = vn_rdwr(UIO_READ, vp, addr, size, file_offset,
        UIO_SYSSPACE, 0, kauth_cred_get(), &resid, p);

    // ...

    //oncenote: start parsing the load commands
    /*
     * Scan through the commands, processing each one as necessary.
     * We parse in three passes through the headers:
     * 1: thread state, uuid, code signature
     * 2: segments
     * 3: dyld, encryption, check entry point
     */
    for (pass = 1; pass <= 3; pass++) {

        if ((pass == 3) && (result->validentry == 0)) {
            thread_state_initialize(thread);
            ret = LOAD_FAILURE;
            break;
        }

        /*
         * Loop through each of the load_commands indicated by the
         * Mach-O header; if an absurd value is provided, we just
         * run off the end of the reserved section by incrementing
         * the offset too far, so we are implicitly fail-safe.
         */
        offset = mach_header_sz;
        ncmds = header->ncmds;

        while (ncmds--) {
            /*
             * Get a pointer to the command.
             */
            lcp = (struct load_command *)(addr + offset);
            oldoffset = offset;
            offset += lcp->cmdsize;

            switch(lcp->cmd) {
            case LC_SEGMENT:
                if (pass != 2)
                    break;
                ret = load_segment(lcp, header->filetype, control,
                    file_offset, macho_size, vp, map, slide, result);
                break;
            case LC_SEGMENT_64:
                //oncenote: same as command LC_SEGMENT
                break;
            case LC_UNIXTHREAD:
                if (pass != 1)
                    break;
                //oncenote: load_unixthread() calls load_threadstack(), load_threadentry(), and load_threadstate()
                ret = load_unixthread((struct thread_command *) lcp,
                    thread, slide, result);
                break;
            case LC_MAIN:
                if (pass != 1)
                    break;
                if (depth != 1)
                    break;
                ret = load_main((struct entry_point_command *) lcp,
                    thread, slide, result);
                break;
            case LC_LOAD_DYLINKER:
                if (pass != 3)
                    break;
                //oncenote: on the first-depth call, when LC_LOAD_DYLINKER is parsed, set dlp
                if ((depth == 1) && (dlp == 0)) {
                    dlp = (struct dylinker_command *)lcp;
                    dlarchbits = (header->cputype & CPU_ARCH_MASK);
                } else {
                    ret = LOAD_FAILURE;
                }
                break;
            case LC_UUID:
                //oncenote: omitted
                break;
            case LC_CODE_SIGNATURE:
                //oncenote: omitted
                break;
#if CONFIG_CODE_DECRYPTION
            case LC_ENCRYPTION_INFO:
            case LC_ENCRYPTION_INFO_64:
                //oncenote: omitted
                break;
#endif
            default:
                /* Other commands are ignored by the kernel */
                ret = LOAD_SUCCESS;
                break;
            }
            if (ret != LOAD_SUCCESS)
                break;
        }
        if (ret != LOAD_SUCCESS)
            break;
    }

    if (ret == LOAD_SUCCESS) {
        if ((ret == LOAD_SUCCESS) && (dlp != 0)) {
            /*
             * load the dylinker, and slide it by the independent DYLD ASLR
             * offset regardless of the PIE-ness of the main binary.
             */
            ret = load_dylinker(dlp, dlarchbits, map, thread, depth,
                dyld_aslr_offset, result);
        }
    }

    // ...

    return(ret);
}

load_dylinker():

static load_return_t
load_dylinker(
    struct dylinker_command *lcp,
    integer_t               archbits,
    vm_map_t                map,
    thread_t                thread,
    int                     depth,
    int64_t                 slide,
    load_result_t           *result
)
{
    //oncenote: get the dyld vnode
    ret = get_macho_vnode(name, archbits, header,
        &file_offset, &macho_size, macho_data, &vp);
    if (ret)
        goto novp_out;

    *myresult = load_result_null;

    /*
     * First try to map dyld in directly. This should work most of
     * the time since there shouldn't normally be something already
     * mapped to its address.
     */
    //oncenote: parse dyld
    ret = parse_machfile(vp, map, thread, header, file_offset,
        macho_size, depth, slide, 0, myresult);

    // ...

    if (ret == LOAD_SUCCESS) {
        result->dynlinker = TRUE;
        result->entry_point = myresult->entry_point;
        result->validentry = myresult->validentry;
        result->all_image_info_addr = myresult->all_image_info_addr;
        result->all_image_info_size = myresult->all_image_info_size;
        if (myresult->platform_binary) {
            result->csflags |= CS_DYLD_PLATFORM;
        }
    }

    // ...

    return (ret);
}

3. Summary

I had a general idea of the app startup process before, but the details were unclear to me. It took me more than a month, and I finished this article just before a trip. I originally planned to cover how dyld loads shared libraries in a third section, but this article is already too long, so that will be better as a separate article.

Many details of the app startup process are not covered here, such as code signature verification, virtual memory mapping, and how SpringBoard, the iOS home screen app launcher, switches between apps. If you are interested, they are worth further study.