Preface:

So far we have looked at the class structure, and in doing so we jumped straight in at objc_init. This post opens a new chapter: what does the system do for us when our app starts, and how does the system actually "run" our code? This topic did not come up in my earlier study, and it is rarely needed in day-to-day development. Still, as programmers we ought to understand it. Otherwise, when an amateur or an expert asks about it, having developed for years without knowing the answer would be a little ridiculous. This part is also fairly obscure. I watched the material twice without fully understanding it, and even the main flow did not feel smooth, so I still need to read more, explore more, and experiment more. So, in the spirit of having fun and of helping you understand, I have combed through the content more carefully, and this post tries to include more detail than my earlier ones.

A brief review of the basics

About libraries

Most of the code we write depends on underlying libraries, so the library files have to be loaded into memory before our code can run. Libraries are divided into dynamic libraries and static libraries, and the difference between the two lies in how they are linked. When we create an app that has not been compiled yet, the .app file under the Products folder is shown in red, because the file does not exist until the build completes. After building, open the file location and choose to show the package contents. Inside is the executable file that our code has been compiled into.

Static libraries are linked into the executable one by one, while dynamic libraries are not copied in directly. They are loaded at run time and shared, which saves memory and reduces the size of packages.

This executable file can be run directly by dragging it into the terminal. Unfortunately, if it was built for the simulator or a real device, running it in the terminal reports an error. The source code does contain a check for this, possibly related to path handling, and I have not found a workaround yet (mainly because I did not have time to dig into it).

So for now we just try it with a macOS build, to take a look and see that the executable really does print its KC output.

About dyld

Introduction

So how do these dynamic libraries get loaded into memory? This is where a very important part of Apple's platform comes in: dyld! Both static and dynamic libraries depend on this linker.

From a macro point of view, dyld loads libSystem, registers the notify callbacks, loads all the image files, executes map_images and load_images, and finally calls main. A note on terminology: an image here is an image file, that is, a library file. All the library files are stored in the corresponding directories on macOS. During development, however, the system loads only the libraries a program needs into memory rather than all of them, which is reasonable. So how do we see how many libraries have been loaded and where they are located? Simply print the image list after entering the main function.

I originally expected these paths to be inside the program itself. In fact, CoreFoundation, for example, sits under the library directory of the corresponding platform inside Xcode. My guess is that this is a copied path.
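In lldb, the image list command prints this information. As a rough programmatic sketch of the same idea, the loaded images can also be enumerated from code with the functions declared in <mach-o/dyld.h>. The helper name loadedImagePaths is made up for this example, and the non-Apple branch exists only so the snippet compiles elsewhere.

```cpp
#include <cstdint>
#include <string>
#include <vector>

#if defined(__APPLE__)
#include <mach-o/dyld.h>  // _dyld_image_count, _dyld_get_image_name
#endif

// Hypothetical helper: returns the file paths of all images that dyld has
// loaded into the current process. On non-Apple platforms the dyld API does
// not exist, so the list is simply empty.
std::vector<std::string> loadedImagePaths() {
    std::vector<std::string> paths;
#if defined(__APPLE__)
    const uint32_t count = _dyld_image_count();
    for (uint32_t i = 0; i < count; ++i) {
        const char* name = _dyld_get_image_name(i);
        if (name != nullptr) {
            paths.emplace_back(name);
        }
    }
#endif
    return paths;
}
```

Calling loadedImagePaths() at the top of main and printing each entry should show the executable itself along with the libraries it pulled in.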

Finding the entry point

The search starts at _dyld_start, the first dyld symbol at the bottom of the call stack.

The dyld source code

The version downloaded here is 852, the latest at the time of writing. The source cannot be run directly because it relies on other low-level libraries. _dyld_start calls dyldbootstrap::start, which is C++ syntax for the start function in the dyldbootstrap scope. So search for dyldbootstrap, find start, and follow it straight down: at the end it returns the result of _main. All of this matches the flow outlined above, so there is no need to say more.

About the _main function

First of all, this _main function is not the main function in our own project. It is the main function inside dyld. Roughly the first thousand lines are preparation: architecture checks, environment setup, library file paths, and so on. Here I only take the macro view. There are too many details to follow them all, and for each specific block it is enough to read the official comments.

By the way, there are system-level methods to register with the shared cache.

A change of approach

When there is too much code, and most of the early code is preparation that is unrelated to the main flow, we can read from the end instead. That is, start from the result and work backwards to find the logic.

Everything that feeds the returned result is related to sMainExecutable. The code around sMainExecutable does things such as binding system libraries, so we can keep narrowing down from there.

Find the place where sMainExecutable is defined

Next, find where it is assigned. The first parameter there, the MH, is the Mach-O header, which describes the file format. You can drag the executable directly into MachOView (the tool with the rotten apple icon), and it shows the file to you table by table. The code here simply reads those tables, and there is no recursion.

After sMainExecutable is set up, the inserted dynamic libraries are loaded and the main program is linked.

Once all the preparation is complete, execution begins.
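To make the idea of "reading the header" a little more concrete, here is a small sketch that classifies a file by the magic number at the start of its Mach-O header. The constants are the values defined in <mach-o/loader.h> and <mach-o/fat.h>. The function names describeMagic and describeFile are made up for this example, and this is not how dyld itself is written.

```cpp
#include <cstdint>
#include <fstream>
#include <string>

// Magic numbers as defined in <mach-o/loader.h> and <mach-o/fat.h>.
constexpr uint32_t kMachO32        = 0xfeedface;  // MH_MAGIC
constexpr uint32_t kMachO64        = 0xfeedfacf;  // MH_MAGIC_64
constexpr uint32_t kMachO32Swapped = 0xcefaedfe;  // MH_CIGAM
constexpr uint32_t kMachO64Swapped = 0xcffaedfe;  // MH_CIGAM_64
constexpr uint32_t kFat            = 0xcafebabe;  // FAT_MAGIC
constexpr uint32_t kFatSwapped     = 0xbebafeca;  // FAT_CIGAM

// Hypothetical helper: maps a magic number to a short description.
std::string describeMagic(uint32_t magic) {
    switch (magic) {
        case kMachO32:
        case kMachO32Swapped:
            return "32-bit Mach-O";
        case kMachO64:
        case kMachO64Swapped:
            return "64-bit Mach-O";
        case kFat:
        case kFatSwapped:
            return "universal (fat) binary";
        default:
            return "not Mach-O";
    }
}

// Hypothetical helper: reads the first four bytes of a file in host byte
// order and classifies them.
std::string describeFile(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    uint32_t magic = 0;
    if (!in.read(reinterpret_cast<char*>(&magic), sizeof(magic))) {
        return "unreadable";
    }
    return describeMagic(magic);
}
```

Passing the path of the executable inside the .app package to describeFile should report a Mach-O or universal binary, which is the same first field MachOView displays.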

initializeMainExecutable

Now that all the required data is in place, how is it run, and how is it processed? Once all the images have been collected, dyld iterates through them and initializes them, and you can see that every iteration calls runInitializers. So what does this method do?

runInitializers calls processInitializers each time. Judging from the parameters being passed, this is the core method.

processInitializers in turn calls recursiveInitialization each time, with basically the same parameters.

recursiveInitialization

The chain so far: initializeMainExecutable -> runInitializers -> processInitializers -> recursiveInitialization.

Inside recursiveInitialization there are two calls to notifySingle. As the name suggests, it sends a notification for a single image. As the comments show, the first call is the notification about the image's dependencies, because the necessary dependencies must be loaded and initialized first.
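The ordering described here can be sketched with a toy model. ToyImage and initLog are invented for this illustration and are not dyld types. The sketch assumes the sequence inside recursiveInitialization is: initialize the dependencies, send the first notification, run doInitialization, send the second notification.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for dyld's image loader; only what the sketch needs.
struct ToyImage {
    std::string name;
    std::vector<ToyImage*> dependencies;
    bool initialized = false;

    explicit ToyImage(std::string n, std::vector<ToyImage*> deps = {})
        : name(std::move(n)), dependencies(std::move(deps)) {}
};

// Event log so the order of the steps can be inspected afterwards.
std::vector<std::string>& initLog() {
    static std::vector<std::string> log;
    return log;
}

void recursiveInitialization(ToyImage& image) {
    if (image.initialized) return;  // every image is initialized only once
    image.initialized = true;       // mark first so dependency cycles terminate

    // 1. Initialize the lower-level libraries first.
    for (ToyImage* dep : image.dependencies) {
        recursiveInitialization(*dep);
    }
    // 2. First notifySingle: this image's dependencies are initialized.
    initLog().push_back("notify dependents_initialized: " + image.name);
    // 3. doInitialization: run the image's own initializers.
    initLog().push_back("doInitialization: " + image.name);
    // 4. Second notifySingle: this image is fully initialized.
    initLog().push_back("notify initialized: " + image.name);
}
```

With an app that depends on libSystem and libobjc, where libobjc itself depends on libSystem, the log shows libSystem going through all three steps first, then libobjc, and the app last.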

Question supplement 1

Why is it written this way?

First of all, if the image coming in is a system library with no dependencies, it goes straight to the doInitialization call below. If the image coming in does have dependencies, those dependencies are initialized first as they come in, and then notify is called.

Question supplement 2

In what order are C++ functions, load, and main called? If we write a C++ function in the main file of the main program, why does load run before that C++ function? Set a breakpoint and print the backtrace with bt, and it is clear that the initiator is doInitialization. The order is load -> C++ -> main. Note that the C++ function here must be written in the main program, not in a system library. After entering recursiveInitialization, the callback is received and then the load method is called. Since load is called from notify, the full order must be: the C++ initializers of the system libraries first, then load, then the main program's C++ functions, and finally main.
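The "C++ before main" part of this ordering can be observed with a small sketch. It assumes a GCC or Clang toolchain, since __attribute__((constructor)) is a compiler extension. On Apple platforms such functions are typically recorded in the image as initializers, which is the kind of entry doModInitFunctions walks. The names startupLog and cxxInitializer are made up, and load itself is not modelled because it needs the Objective-C runtime.

```cpp
#include <string>
#include <vector>

// Log stored in a function-local static so that it is safe to use from code
// that runs before main.
std::vector<std::string>& startupLog() {
    static std::vector<std::string> log;
    return log;
}

// GCC/Clang extension: this function runs while the image is being
// initialized, that is, before main is entered.
__attribute__((constructor))
static void cxxInitializer() {
    startupLog().push_back("C++ initializer");
}
```

By the time main starts, startupLog() already contains the entry written by cxxInitializer, which matches the C++ -> main part of the order above.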

Let's look at the implementation of the notify method. Inside it there is a call through a function pointer. There is also code below it that deals with unloading images, as its comment says, so we will not look at that.

Search globally for this function pointer.

It is assigned in the registerObjCNotifiers method, where it arrives as the second parameter, of type _dyld_objc_notify_init.

Next, find where registerObjCNotifiers is called.

The caller is _dyld_objc_notify_register, the same function that objc_init calls in libobjc.

Program load!

One step away from the light

Although the callbacks on both sides have now been connected, the process in between is still unclear. How can we see the relationship in the middle more clearly? At this point we can take our cue from the objc_init method we already know in libobjc.dylib.

Method

The method here is very important. Since we already know that execution reaches objc_init in libobjc.dylib, why not simply put a breakpoint there and see what the preceding call chain looks like?

Concretely, step into each frame, check which open-source library the function belongs to, and verify step by step:

libobjc.dylib objc_init
<- libdispatch.dylib _os_object_init
<- libdispatch.dylib libdispatch_init
<- libSystem.B.dylib libSystem_initializer
<- dyld ImageLoaderMachO::doModInitFunctions

Searching globally for _dyld_objc_notify_register does not lead any further here.

Search globally for doModInitFunctions

libSystem is loaded first, because every library depends on it. The code checks whether libSystem has been loaded before performing any other operation.

Searching globally for where doModInitFunctions is called leads to the doInitialization method.

And doInitialization is in turn called from recursiveInitialization. Everything seems to connect.

Connecting the pieces

That leaves the question, in the recursiveInitialization method, what is the relationship between the notify and doInitialization methods? How do they connect?

Sorting it out

It is worth sorting out what we have covered so far. When doInitialization is entered from recursiveInitialization, execution goes through the chain described above and ends up in objc_init. That method calls _dyld_objc_notify_register with three arguments (&map_images, load_images, unmap_image). map_images deals with categories, protocols, class methods, and so on.

_dyld_objc_notify_register hands map_images and the other callbacks over to dyld, so that the images that have been initialized can be processed. Because every image loads differently, the internal processing is implemented by the image's own library, and the main flow does not have to care about it.

dyld – > objc ??

What do you mean? The twins? Yes, they mean the same thing, they just have different names.

Coming back from libobjc: _dyld_objc_notify_register is called and the callbacks are assigned. Search globally for registerObjCNotifiers.

Open the registerObjCNotifiers method.

There, the sNotifyObjCMapped function is called.
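The handshake between the two libraries can be sketched as a toy model. Everything in the toy_dyld and toy_objc namespaces is a simplified stand-in written for this post: the real callbacks take more parameters, and the real notifySingle checks the image state before calling out. Only the names sNotifyObjCMapped, sNotifyObjCInit, sNotifyObjCUnmapped, registerObjCNotifiers, map_images, load_images and unmap_image come from the sources discussed above.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified callback signatures.
using MappedCallback   = void (*)(unsigned imageCount);
using InitCallback     = void (*)(const char* path);
using UnmappedCallback = void (*)(const char* path);

namespace toy_dyld {
MappedCallback   sNotifyObjCMapped   = nullptr;
InitCallback     sNotifyObjCInit     = nullptr;
UnmappedCallback sNotifyObjCUnmapped = nullptr;

// Stores the three callbacks, then reports already-mapped images through the
// first one (here simplified to a single image).
void registerObjCNotifiers(MappedCallback mapped, InitCallback init,
                           UnmappedCallback unmapped) {
    sNotifyObjCMapped   = mapped;
    sNotifyObjCInit     = init;
    sNotifyObjCUnmapped = unmapped;
    if (sNotifyObjCMapped != nullptr) sNotifyObjCMapped(1);
}

// Called during recursiveInitialization; forwards to the registered init
// callback if one exists.
void notifySingle(const char* path) {
    if (sNotifyObjCInit != nullptr) sNotifyObjCInit(path);
}
}  // namespace toy_dyld

namespace toy_objc {
std::vector<std::string>& events() {
    static std::vector<std::string> log;
    return log;
}
void map_images(unsigned) { events().push_back("map_images"); }
void load_images(const char* path) {
    events().push_back(std::string("load_images: ") + path);
}
void unmap_image(const char*) { events().push_back("unmap_image"); }

// Mirrors the registration made in objc_init.
void objc_init() {
    toy_dyld::registerObjCNotifiers(&map_images, load_images, unmap_image);
}
}  // namespace toy_objc
```

After toy_objc::objc_init() runs, a later toy_dyld::notifySingle("app") ends up in load_images. That is the sense in which sNotifyObjCInit and load_images are the same thing under two names.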

From dyld – > main

There are two ways to follow the jump into main:

1. Read the dyld assembly code.

2. Step through the assembly before main and print the registers.

WWDC 2017: a brief overview of dyld

1. Apple is committed to making apps start faster: reducing launch time, reducing the amount of linking dyld has to do, reducing the number of libraries referenced, reducing initialization code, and so on. Swift is also recommended, because it has no initializers, no unaligned initialization data structures, and so on, so Swift may well become the mainstream of iOS development in the future.

2. dyld history. The first generation started in 1996 and was based on Unix. It was not very mature: boundary checking was not thorough, and it cost more time and performance. It was written before C++ dynamic libraries came along. The first generation also added prebinding, which basically means working out addresses in advance for the next load, but its security was poor.

3. The second generation is a complete rewrite. It supports C++ libraries more efficiently, extends the Mach-O format, and reduces prebinding work, which further improves speed. dyld 2 also supports more architectures. The security improvement is that library load addresses are randomized on every launch. The performance improvement comes from dropping prebinding in favor of the shared cache (pre-generating some data structures for dyld to use).

4. dyld 3 began to be promoted in 2017, and as it came into large-scale use it gradually replaced most of dyld 2. The goals of the dyld 3 dynamic linker are still better performance and better security.

5. Comparison and improved design. Launch in dyld 2: first parse the Mach-O file to find all the files needed, then recursively find the libraries they depend on until dyld has everything it needs. Then map all the files into the address space. Then look up symbols, finding each system library function and copying its address into the application's function pointers. Then bind and adjust pointers, since the addresses are randomized. Finally, run the initializers and go on to main. dyld 3: there is no need to parse Mach-O files or perform symbol lookups at launch, and more of the security analysis is done in connection with the shared cache.

There is also material on structure alignment and so on, blah blah, which is too specialized for me to follow. My feeling is that in app development we rarely worry about performance, while the people working on the lower layers keep improving efficiency. Perhaps that is exactly why we can afford to be wasteful in our own code without noticing any efficiency problems. The talk is really impressive, although half of it felt like casting pearls before swine in my case. I hope WWDC will run a training class someday. If it turns out to be even better than LG, I will be the first to sign up!

Postscript:

1. After going through the material a second time, I finally worked out the general flow bit by bit. The main line runs from _dyld_start all the way to the final notify callback, with several libraries calling back and forth in between. The most important thing is to understand the recursive notify principle, which in essence distinguishes system libraries from other libraries and handles dependencies. The first time I listened, I was lost in the fog. First, I did not know the material to begin with and had not explored the code myself, so jumping back and forth made me dizzy. Second, my low-level computer fundamentals are relatively weak. Fortunately, on the second pass I could follow the video explanation while reading through the source myself bit by bit. Thank you, teacher KC, for your patience!

2. While copying the library call chain, I finally figured out how to make text change color on Juejin. How many times did I have to try the Markdown syntax?!

3.KC DXR