LLVM interception optimization

In the last article we walked through the alloc process and assumed that the first step is a call to alloc. But is that actually the case? Let's verify by putting a breakpoint on alloc and running:

Switch to the disassembly view:

It turns out that the call actually goes to objc_alloc. Why is that? The objc source shows the reason in fixupMessageRef: when the SEL of the message is alloc, the implementation of the message is changed to objc_alloc.

Searching for the callers of fixupMessageRef shows that it is called inside the _read_images method.

fixupMessageRef only runs when a message reference needs to be fixed up, so why is objc_alloc called when there is nothing wrong with alloc? If the substitution happens before _read_images, it was probably done at compile time, so let's take a look at the LLVM source code. Searching for objc_alloc in the LLVM source turns up the following comment:

So when this method returns true, alloc is changed to objc_alloc and allocWithZone:nil is changed to objc_allocWithZone. Looking at the method to see when it returns true, we find that the result depends on the runtime version.

Looking further down, I found the following piece of code:

Next, look for EmitObjCAlloc.

The call EmitObjCAlloc(Receiver, CGF.ConvertType(ResultType)) is what replaces the message send with a call to objc_alloc. To see when the enclosing method is called, search for tryGenerateSpecializedMessageSend:

It is called from GeneratePossiblySpecializedMessageSend. So we now know that LLVM handles some of Apple's methods, such as alloc, at compile time. The reason is that Apple wants to monitor memory allocation and similar operations, so these methods are hooked: a call to alloc first runs objc_alloc, which sets a flag at the lower level. Once the flag is set, the alloc method itself executes and the normal alloc process proceeds.

Three ways to obtain the size of memory

  1. sizeof
  2. class_getInstanceSize
  3. malloc_size

1. sizeof

  • sizeof is an operator, not a function
  • The operand passed to sizeof is usually a data type, and the result is determined by the compiler at compile time rather than at run time
  • The result of sizeof is the size of the space occupied by that data type

2. class_getInstanceSize

class_getInstanceSize is an API provided by the runtime. It returns the number of bytes occupied by an instance of a class, aligned to 8 bytes.

3. malloc_size

This function returns the amount of memory the system actually allocated. The allocated memory is not necessarily equal to the memory the object occupies: memory occupied by the object <= memory allocated by the system. The memory an object occupies is 8-byte aligned, which can be verified in the objc source code, while the system allocates memory with 16-byte alignment, as can be seen in segregated_size_to_fit() in the malloc source code. For an object, 8-byte alignment is enough to meet its needs. Apple's system nevertheless aligns every allocation to 16 bytes for fault tolerance: with 8-byte alignment two adjacent objects sit right next to each other in a compact layout, whereas 16-byte alignment leaves looser spacing and room for Apple's future expansion.

II. Memory alignment principles

  1. The alignment of data members can be described with the notation min(m, n), where m is the candidate starting offset of the current member and n is the number of bytes the member occupies (min here is only a label for the pair, not the minimum function). If m is divisible by n (that is, m % n == 0), the member is stored starting at offset m; otherwise check whether m+1 is divisible by n, and so on, until a divisible offset is found. That offset is the starting position of the current member.

  2. A data member that is a struct: when a struct is nested inside another struct, the nested struct must start at an offset that is an integer multiple of the size of its own largest member. For example, if struct a contains struct b and the largest member of b is 8 bytes, then b starts at an integer multiple of 8.

  3. Finally, the total size of a struct must be an integer multiple of the size of its largest member; if it is not, it is padded up to the next multiple.

III. Analysis of struct memory layout

1

struct LGStruct1 {
    double a;
    char b;
    int c;
    short d;
} struct1;

Structure LGStruct1 memory size calculation

  • Variable a: occupies 8 bytes, starting from 0; min(0, 8), so a is stored at 0-7
  • Variable b: occupies 1 byte, starting from 8; min(8, 1), so b is stored at 8
  • Variable c: occupies 4 bytes, starting from 9; min(9, 4), and 9 is not divisible by 4, so move forward to min(12, 4); c is stored at 12-15
  • Variable d: occupies 2 bytes, starting from 16; min(16, 2), so d is stored at 16-17

According to the memory alignment rules, the members of LGStruct1 occupy 18 bytes, but 18 is not an integer multiple of 8, the size of the largest member. 18 is therefore rounded up to 24, the next integer multiple of 8, so the result of sizeof(struct1) is 24.

2

struct LGStruct2 {
    double a;
    int b;
    char c;
    short d;
} struct2;

Structure LGStruct2 memory size calculation

  • Variable a: occupies 8 bytes, starting from 0; min(0, 8), so a is stored at 0-7
  • Variable b: occupies 4 bytes, starting from 8; min(8, 4), so b is stored at 8-11
  • Variable c: occupies 1 byte, starting from 12; min(12, 1), so c is stored at 12
  • Variable d: occupies 2 bytes, starting from 13; min(13, 2), and 13 is not divisible by 2, so move forward to min(14, 2); d is stored at 14-15

According to the memory alignment rules, the members of LGStruct2 occupy 16 bytes, and 16 is already an integer multiple of 8, so the result of sizeof(struct2) is 16.

3

struct LGStruct3 {
    double a;
    int b;
    char c;
    short d;
    int e;
    struct LGStruct1 str;
} struct3;

  • Variable a: occupies 8 bytes, starting from 0; min(0, 8), so a is stored at 0-7
  • Variable b: occupies 4 bytes, starting from 8; min(8, 4), so b is stored at 8-11
  • Variable c: occupies 1 byte, starting from 12; min(12, 1), so c is stored at 12
  • Variable d: occupies 2 bytes, starting from 13; min(13, 2), and 13 is not divisible by 2, so move forward to min(14, 2); d is stored at 14-15
  • Variable e: occupies 4 bytes, starting from 16; min(16, 4), so e is stored at 16-19
  • Struct member str: str is a struct, so it must start at an integer multiple of the size of its largest internal member. The largest member of LGStruct1 is 8 bytes, so str must start at an integer multiple of 8. The next free offset, 20, is not a multiple of 8, but 24 is, so str (24 bytes, as calculated for LGStruct1) is stored at 24-47

So LGStruct3 needs 48 bytes of memory. The largest member of LGStruct3 is str, whose own largest member is 8 bytes, so the actual size of LGStruct3 must be an integer multiple of 8. 48 is exactly an integer multiple of 8, so sizeof(struct3) is 48.