Memory alignment inside structures

First, a picture borrowed from Cooci:

Why look at alignment inside structures at all? Reading the objc source shows that every object is ultimately built on a struct. When we declare properties on an object we do not need to care about their order, because the system arranges them for us. When we define a struct ourselves, however, we have to work out the layout, because the system does not optimize it automatically. Start with the following two structures:

struct LGStruct1 {
    double a;
    char b;
    int c;
    short d;
}struct1;

struct LGStruct2 {
    double a;
    int b;
    char c;
    short d;
}struct2;

Both structs contain members of exactly the same types; only the order differs. Now look at the following code:

printf("%lu=========%lu\n",sizeof(struct1),sizeof(struct2));

As we normally say, double takes 8 bytes, char takes 1 byte, int takes 4 bytes, and short takes 2 bytes. The members add up to 15 bytes, so after alignment we might expect each struct to occupy 16 bytes and the program to print 16 for both. Running the program shows otherwise. Here is the output:

24=========16

So why do two structs with the same member types and the same number of members occupy different amounts of space? That is today's focus: the rules of alignment inside structures, which are as follows:

1. Data member alignment: for the data members of a struct or union, the first member is placed at offset 0. Each subsequent member starts at an offset that is an integer multiple of that member's size or, if the member is an aggregate such as an array or a struct, of the size of its sub-members (for example, an int is 4 bytes, so it starts at a multiple of 4).

2. Structs as members: if a struct contains struct members, each struct member starts at an offset that is an integer multiple of the size of its largest internal element.

3. Overall size: the total size of a structure, that is, the result of sizeof, must be an integer multiple of the size of the largest member in the structure.

Next look at the following structure:

struct LGStruct3 {
    double a;
    int b;
    char c;
    short d;
    int e;
    struct LGStruct1 str;
}struct3;

Printing sizeof(struct3) gives 48. Where does this 48 come from? Applying the rules above, we calculate as follows:

a: 8 bytes in [0 to 7]
b: 4 bytes in [8 to 11]
c: 1 byte in [12]
d: 2 bytes in [14 to 15]
e: 4 bytes in [16 to 19]
str: 24 bytes in [24 to 47] (the size of LGStruct1, which we know from the output above)

The size of struct3 must be an integer multiple of the size of its largest member. The largest member in this structure is a double of 8 bytes. The members end at byte 47, which gives 48 bytes, and 48 is already a multiple of 8, so no further padding is needed and the result is 48. Then we move on to the following structure:

struct LGStruct4 {
    double a;
    int b;
    char c;
    short d;
    int e;
    struct LGStruct2 str;
}struct4;

Printing sizeof(struct4) gives 40. Where does that 40 come from? Applying the rules above again, we calculate as follows:

a: 8 bytes in [0 to 7]
b: 4 bytes in [8 to 11]
c: 1 byte in [12]
d: 2 bytes in [14 to 15]
e: 4 bytes in [16 to 19]
str: 16 bytes in [24 to 39] (the size of LGStruct2, which we know from the output above)

The size of struct4 must be an integer multiple of the size of its largest member. The largest member in this structure is a double of 8 bytes. The members end at byte 39, which gives 40 bytes, and 40 is already a multiple of 8, so the result is 40.

What happens before the alloc method is called

Let's start by looking at what the system does before alloc is called, and address a point left over from the previous article: why the flow in the source code and the flow at run time are inconsistent. In the last article we traced the process from alloc to the allocation of the object's memory and described it as [LGPerson alloc] -> _objc_rootAlloc -> callAlloc. That is how the source reads, but when we debug with a breakpoint and inspect the stack, execution does not follow this path, as shown in the figure below:

At the breakpoint, the code runs objc_alloc first, yet after reading the source several times, nothing in the alloc flow actually calls this function. So how did we get here? Could it be that somewhere the system rewrites the call to alloc? Since execution enters objc_alloc first, let's look for where this function is referenced. There is only one place, as shown in the figure below:

Since there is only one such place, the answer is probably there. Out of curiosity, we open the source, which reads as follows:

static void 
fixupMessageRef(message_ref_t *msg)
{    
    msg->sel = sel_registerName((const char *)msg->sel);

    if (msg->imp == &objc_msgSend_fixup) { 
        if (msg->sel == @selector(alloc)) {
            msg->imp = (IMP)&objc_alloc;
        } else if (msg->sel == @selector(allocWithZone:)) {
            msg->imp = (IMP)&objc_allocWithZone;
        } else if (msg->sel == @selector(retain)) {
            msg->imp = (IMP)&objc_retain;
        } else if (msg->sel == @selector(release)) {
            msg->imp = (IMP)&objc_release;
        } else if (msg->sel == @selector(autorelease)) {
            msg->imp = (IMP)&objc_autorelease;
        } else {
            msg->imp = &objc_msgSend_fixedup;
        }
    } 
    else if (msg->imp == &objc_msgSendSuper2_fixup) { 
        msg->imp = &objc_msgSendSuper2_fixedup;
    } 
    else if (msg->imp == &objc_msgSend_stret_fixup) { 
        msg->imp = &objc_msgSend_stret_fixedup;
    } 
    else if (msg->imp == &objc_msgSendSuper2_stret_fixup) { 
        msg->imp = &objc_msgSendSuper2_stret_fixedup;
    } 
#if defined(__i386__)  ||  defined(__x86_64__)
    else if (msg->imp == &objc_msgSend_fpret_fixup) { 
        msg->imp = &objc_msgSend_fpret_fixedup;
    } 
#endif
#if defined(__x86_64__)
    else if (msg->imp == &objc_msgSend_fp2ret_fixup) { 
        msg->imp = &objc_msgSend_fp2ret_fixedup;
    } 
#endif
}

If sel is @selector(alloc), imp is changed to (IMP)&objc_alloc, so the culprit has been found. But is that enough to satisfy a curious person? Of course not. Knowing where the change happens does not explain everything: the code tests msg->imp == &objc_msgSend_fixup, and only then replaces imp. So what is objc_msgSend_fixup? Its declaration is OBJC_EXTERN void objc_msgSend_fixup(void); which is only a declaration with a void return type. The comparison is against the address of this placeholder, which marks a call site that has not been fixed up yet, so the first time through we enter the if branch and imp becomes (IMP)&objc_alloc. Where, then, is static void fixupMessageRef(message_ref_t *msg) called? A global search of the objc source reveals a single call site, as shown below:

Opening that call site takes us to void _read_images(header_info **hList, uint32_t hCount, int totalClasses, int unoptimizedTotalClass). The function is long, so I keep only the part around the fixupMessageRef call and leave out the other operations. The code looks like this:

As the source shows, the system first fixes up the @selector references and then fixes the old objc_msgSend_fixup call sites. Next we look for where _read_images itself is called. A global search finds two matches, which lead to map_images_nolock():

This function processes the given images that dyld has mapped in. All class registration and fix-ups are performed (or deferred pending discovery of missing superclasses, etc.), and the +load methods are called. Put simply, it reads all the classes and calls +load. map_images_nolock is in turn called by map_images, and map_images is registered through _dyld_objc_notify_register(&map_images, load_images, unmap_image);. That registration happens in _objc_init, and at this point everything becomes clear: _objc_init is the entry point through which the runtime is loaded for our program. We have finally worked out what happens before alloc is called, and when the system rewrites a call site that sends alloc.

How calloc allocates memory

When we looked at the alloc process in the last article, we had the following code:

id obj;
if (zone) {
    obj = (id)malloc_zone_calloc((malloc_zone_t *)zone, 1, size);
} else {
    obj = (id)calloc(1, size);
}

This is where the object's memory is allocated, but when we try to step into calloc we can only see its declaration, not its implementation. Should we stop there? Of course not. Clicking through calloc takes us into the system's include/malloc folder, so has Apple open-sourced this code as well? It has: the source is libmalloc-317.100.9. Download it and open it; the project is already set up, so you only need to run it.

Write a piece of code in main to debug the calloc function. The code is as follows:

#import <Foundation/Foundation.h>
#import <malloc/malloc.h>

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        // Heap objects: 16-byte aligned
        // Member variables inside a structure: 8-byte aligned
        // Objects: 16 bytes
        // Wild pointer memory access
        void *p = calloc(1, 40);
        NSLog(@"%lu",malloc_size(p));
        NSLog(@"Hello, World!");
    }
    return 0;
}

Clicking into the calloc implementation, we see that it simply returns another function, _malloc_zone_calloc, so we go into _malloc_zone_calloc and focus on the return value. We go straight to line 1560, because that is where ptr, the value that gets returned, is assigned. But the assignment is again a call named calloc, made through zone->calloc, and we cannot jump to its implementation. Printing zone->calloc in the console outputs nano_calloc, so we continue with that function's implementation, as follows:

static void *
nano_calloc(nanozone_t *nanozone, size_t num_items, size_t size)
{
	size_t total_bytes;

	if (calloc_get_size(num_items, size, 0, &total_bytes)) {
		return NULL;
	}

	if (total_bytes <= NANO_MAX_SIZE) {
		void *p = _nano_malloc_check_clear(nanozone, total_bytes, 1);
		if (p) {
			return p;
		} else {
			/* FALLTHROUGH to helper zone */
		}
	}
	malloc_zone_t *zone = (malloc_zone_t *)(nanozone->helper_zone);
	return zone->calloc(zone, 1, total_bytes);
}

In this function the key variable is total_bytes, the total allocation size requested from calloc(). The function then calls _nano_malloc_check_clear, whose implementation checks whether ptr exists. On the first pass it does not, so we look at the else branch, which calls segregated_next_block. Going into that function's implementation, we find a while loop with the following code:

static MALLOC_INLINE void *
segregated_next_block(nanozone_t *nanozone, nano_meta_admin_t pMeta, size_t slot_bytes, unsigned int mag_index)
{
	while (1) {
		uintptr_t theLimit = pMeta->slot_limit_addr; // Capture the current slot limit
		uintptr_t b = OSAtomicAdd64Barrier(slot_bytes, (volatile int64_t *)&(pMeta->slot_bump_addr));
		b -= slot_bytes; // The atomic op returned the address of the *next* free block. Subtract to get the address for this allocation.

		if (b < theLimit) { // Are we within the current slot allocation?
			return (void *)b;
		} else {
			if (pMeta->slot_exhausted) { // Exhausted all the space available for this slot?
				pMeta->slot_bump_addr = theLimit;
				return 0;
			} else {
				// One thread will grow the heap; other threads will see it has grown and retry the allocation
				_malloc_lock_lock(&nanozone->band_resupply_lock[mag_index]);
				if (pMeta->slot_exhausted) {
					_malloc_lock_unlock(&nanozone->band_resupply_lock[mag_index]);
					return 0; // Toast
				} else if (b < pMeta->slot_limit_addr) {
					_malloc_lock_unlock(&nanozone->band_resupply_lock[mag_index]);
					continue; // ... the slot was successfully grown by the first taker (not us).
				} else if (segregated_band_grow(nanozone, pMeta, slot_bytes, mag_index)) {
					_malloc_lock_unlock(&nanozone->band_resupply_lock[mag_index]);
					continue; // ... the slot has been successfully grown by us. Now try again.
				} else {
					pMeta->slot_exhausted = TRUE;
					pMeta->slot_bump_addr = theLimit;
					_malloc_lock_unlock(&nanozone->band_resupply_lock[mag_index]);
					return 0;
				}
			}
		}
	}
}

In short, the loop keeps trying until it finds space that can hold the requested size. If it finds the space, it returns the address directly; if it cannot, it returns 0. That is more or less the end of calloc's allocation process.

Conclusion

The earlier examples make the takeaway easy to see. In development, to reduce wasted memory and use space sensibly when designing a structure, a simple approach is usually enough: order the members by the number of bytes they occupy, from largest to smallest, so that less padding is needed.