This article, part of the iOS & OpenGL & OpenGL ES & Metal series, continues with the memory alignment principle and `calloc`.

# First, memory alignment

### 1. Memory alignment

1. Data member alignment: the first data member of a struct or union is stored at offset 0. Each subsequent member starts at an offset that is an integer multiple of that member’s size or, if the member has sub-members (an array, a struct, and so on), of the size of its sub-members. For example, an `int` is 4 bytes, so it is stored starting at an offset that is a multiple of 4.

2. Structs as members: if a struct contains a struct member, that member starts at an offset that is an integer multiple of the size of its own largest element. For example, if a struct `b` holding a `char` (1), an `int` (4), and a `double` (8) is a member of another struct, `b` must start at an offset that is a multiple of 8.

3. Total size: the size of the struct, that is, the result of `sizeof`, must be an integer multiple of the size of its largest internal member; any shortfall is padded.
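Rules 1 and 3 can be sketched as a small layout calculator. This is only an illustration: `round_up` and `layout_struct` are hypothetical helpers, and the sketch assumes every member is a scalar whose alignment equals its size, so nested structs (rule 2) are not modeled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: round `offset` up to the next multiple of `align`. */
size_t round_up(size_t offset, size_t align)
{
    assert(align != 0);
    return (offset + align - 1) / align * align;
}

/*
 * Hypothetical helper: lays out scalar members described only by their sizes,
 * assuming each member's alignment equals its size.
 * Writes each member's offset into `offsets` and returns the struct size.
 */
size_t layout_struct(const size_t *sizes, size_t count, size_t *offsets)
{
    size_t offset = 0;
    size_t largest = 1;

    for (size_t i = 0; i < count; i++) {
        offset = round_up(offset, sizes[i]);  /* rule 1: start at a multiple of the member size */
        offsets[i] = offset;
        offset += sizes[i];
        if (sizes[i] > largest) {
            largest = sizes[i];
        }
    }
    return round_up(offset, largest);         /* rule 3: pad to a multiple of the largest member */
}
```

For the member sizes `{1, 8, 4, 2}` (`char`, `double`, `int`, `short`) the helper places the members at offsets 0, 8, 16, and 20 and returns a total size of 24.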

### 2. Code interpretation

Let’s start with a reference chart of the sizes of the basic types.

The code:

```objc
struct YStruct1 {
    char   a;   // 1 byte:  offset 0
    double b;   // 8 bytes: 1 is not a multiple of 8, so start at 8 (8~15)
    int    c;   // 4 bytes: 16 is a multiple of 4 (16~19)
    short  d;   // 2 bytes: 20 is a multiple of 2 (20~21)
} MyStruct1;    // 22 bytes used, rounded up to a multiple of 8: 24

struct YStruct2 {
    double b;   // 0~7
    char   a;   // 8
    int    c;   // 9 is not a multiple of 4, so start at 12 (12~15)
    short  d;   // 16~17
} MyStruct2;    // 18 bytes used, rounded up to a multiple of 8: 24

struct YStruct3 {
    double b;   // 0~7
    int    c;   // 8~11
    char   a;   // 12
    short  d;   // 13 is not a multiple of 2, so start at 14 (14~15)
} MyStruct3;    // 16 bytes used, already a multiple of 8: 16

// A struct nested inside a struct.
struct YStruct4 {
    double b;           // 0~7
    int    c;           // 8~11
    char   a;           // 12
    short  d;           // 14~15
    struct YStruct2 e;  // the largest member of YStruct2 is 8 bytes, so start at
                        // a multiple of 8: 16; YStruct2 is 24 bytes (16~39)
} MyStruct4;            // 40 bytes used, already a multiple of 8: 40

NSLog(@"%lu - %lu - %lu - %lu", sizeof(MyStruct1), sizeof(MyStruct2), sizeof(MyStruct3), sizeof(MyStruct4));
// Result: 24 - 24 - 16 - 40
```

Alignment also leaves room for optimization. If every member were padded out to 8 bytes, YStruct3 would take 4 × 8 = 32 bytes, but under the rules above it takes only 16 bytes, which saves memory.
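The 16 versus 32 comparison can be checked with the compiler. The sketch below assumes a typical 64-bit ABI in which `double` is 8-byte aligned; `Packed3` repeats the member order of YStruct3, while `Padded3` is a hypothetical variant that uses C11 `_Alignas` to force every member onto its own 8-byte boundary.

```c
#include <stddef.h>

/* Same member order as YStruct3: each member uses its natural alignment. */
struct Packed3 {
    double b;
    int    c;
    char   a;
    short  d;
};

/* Hypothetical variant: every member forced onto an 8-byte boundary. */
struct Padded3 {
    double b;
    _Alignas(8) int   c;
    _Alignas(8) char  a;
    _Alignas(8) short d;
};

/* Bytes saved by natural alignment compared with uniform 8-byte slots. */
size_t bytes_saved_by_natural_alignment(void)
{
    return sizeof(struct Padded3) - sizeof(struct Packed3);
}
```

Under that assumption `sizeof(struct Packed3)` is 16 and `sizeof(struct Padded3)` is 32, and `offsetof(struct Packed3, d)` shows where the `short` actually lands.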

### 3. Memory requested by the object vs. memory allocated by the system

The previous article ended with the size calculation `size_t size = cls->instanceSize(extraBytes);`. Let’s create an object, declare four properties, and print its size with `class_getInstanceSize`:

```objc
size_t class_getInstanceSize(Class cls)
{
    if (!cls) return 0;
    return cls->alignedInstanceSize();
}
```

`Person.h` (declarations only):

```objc
@interface Person : NSObject

@property (nonatomic, copy) NSString *name;
@property (nonatomic, assign) int age;
@property (nonatomic, assign) long height;
@property (nonatomic, strong) NSString *hobby;

@end
```

ViewController.m

```objc
Person *per = [Person alloc];  // isa: 8 bytes
per.name = @"superMan";        // 8 bytes
per.age = 18;                  // 4 bytes, plus 4 bytes of padding for alignment
per.height = 185;              // 8 bytes
per.hobby = @"female";         // 8 bytes

NSLog(@"%lu", class_getInstanceSize([per class]));
// Result: 40
```

There are only four properties, and 4 × 8 = 32, so why 40? The answer is that the first member of every object is its `isa` pointer, which takes another 8 bytes: 5 × 8 = 40.
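The arithmetic can be reproduced with a plain C struct. `PersonLayout` is a hypothetical mirror of the instance, not the runtime’s real definition, and the sketch assumes a 64-bit platform where pointers and `long` are 8 bytes.

```c
#include <stddef.h>

/* Hypothetical C mirror of a Person instance (64-bit platform assumed). */
struct PersonLayout {
    void *isa;     /* 8 bytes */
    void *name;    /* 8 bytes */
    int   age;     /* 4 bytes, followed by 4 bytes of padding */
    long  height;  /* 8 bytes */
    void *hobby;   /* 8 bytes */
};

/* Padding bytes: total size minus the sum of the member sizes. */
size_t person_layout_padding(void)
{
    size_t members = 3 * sizeof(void *) + sizeof(int) + sizeof(long);
    return sizeof(struct PersonLayout) - members;
}
```

Under these assumptions the members add up to 36 bytes and the struct occupies 40, matching the value printed by `class_getInstanceSize`.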

Next, let’s add two more `char` properties to see whether the memory optimization mentioned above applies here too.

```objc
// Person.h: add
@property (nonatomic) char ch1;
@property (nonatomic) char ch2;

// ViewController.m
per.ch1 = 'a';
per.ch2 = 'b';

NSLog(@"%lu", class_getInstanceSize([per class]));
// Result: 40
```

Use LLDB to inspect the values stored in the object’s memory.

The memory dump shows that the system optimized the layout for us: the positions of some properties changed, which is why the size is still 40.
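Why reordering keeps the size at 40 can be illustrated with two plain C structs. Both are hypothetical stand-ins on an assumed 64-bit platform, and `ReorderedLayout` is only one possible ordering that groups the small members together; it is not necessarily the order the runtime picks.

```c
#include <stddef.h>

/* Members in declaration order: the two chars end up after `hobby`. */
struct DeclaredOrder {
    void *isa;
    void *name;
    int   age;     /* followed by 4 bytes of padding */
    long  height;
    void *hobby;
    char  ch1;
    char  ch2;     /* followed by 6 bytes of tail padding */
};

/* One possible reordering: the chars fill the gap next to `age`. */
struct ReorderedLayout {
    void *isa;
    int   age;
    char  ch1;
    char  ch2;     /* followed by 2 bytes of padding */
    void *name;
    void *hobby;
    long  height;
};

/* Bytes saved by grouping the small members together. */
size_t bytes_saved_by_reordering(void)
{
    return sizeof(struct DeclaredOrder) - sizeof(struct ReorderedLayout);
}
```

With 8-byte pointers and `long`, the declaration-order struct needs 48 bytes while the reordered one fits in 40.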

With these properties in place, let’s check whether the memory the object asks for matches the memory the system actually allocates, by calling `malloc_size` directly:

```objc
#import <objc/runtime.h>   // class_getInstanceSize
#import <malloc/malloc.h>  // malloc_size

NSLog(@"%lu - %lu", class_getInstanceSize([per class]), malloc_size((__bridge const void *)(per)));
// Result: 40 - 48
```

Strange!

Why are they different? To find out, let’s explore `calloc` in the libmalloc source.

# Second, calloc

After configuring the libmalloc source code, we’ll explore the following code directly:

```objc
// Why pass 40? It is the instance size the object reported above.
void *p = calloc(1, 40);
NSLog(@"%lu", malloc_size(p));
```

Let’s step through the source code and follow the call flow.

### 1. The `calloc` function

```c
void *
calloc(size_t num_items, size_t size)
{
    void *retval;
    retval = malloc_zone_calloc(default_zone, num_items, size);
    if (retval == NULL) {
        errno = ENOMEM;
    }
    return retval;
}
```

### 2. The `malloc_zone_calloc` function

```c
// The full function is long, so let's get straight to the point.
void *
malloc_zone_calloc(malloc_zone_t *zone, size_t num_items, size_t size)
{
    MALLOC_TRACE(TRACE_calloc | DBG_FUNC_START, (uintptr_t)zone, num_items, size, 0);

    void *ptr;
    if (malloc_check_start && (malloc_check_counter++ >= malloc_check_start)) {
        internal_check();
    }

    // A breakpoint shows that execution reaches this line.
    // The puzzle: it calls calloc again, which at first glance looks like recursion.
    ptr = zone->calloc(zone, num_items, size);

    if (malloc_logger) {
        malloc_logger(MALLOC_LOG_TYPE_ALLOCATE | MALLOC_LOG_TYPE_HAS_ZONE | MALLOC_LOG_TYPE_CLEARED,
                      (uintptr_t)zone, (uintptr_t)(num_items * size), 0, (uintptr_t)ptr, 0);
    }

    MALLOC_TRACE(TRACE_calloc | DBG_FUNC_END, (uintptr_t)zone, num_items, size, (uintptr_t)ptr);
    return ptr;
}
```

Search globally for the function that `zone->calloc` resolves to, set a breakpoint, and continue: sure enough, execution lands there.
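The apparent recursion goes away once `zone->calloc` is read as a function pointer stored in the zone, not as the C library `calloc`. A minimal sketch of that dispatch pattern follows; `toy_zone_t`, `toy_default_zone_calloc`, and `toy_zone_calloc` are hypothetical names, and the struct is a drastic simplification of the real `malloc_zone_t`.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, heavily simplified stand-in for malloc_zone_t. */
typedef struct toy_zone {
    void *(*calloc)(struct toy_zone *zone, size_t num_items, size_t size);
    const char *name;
} toy_zone_t;

/* A concrete implementation that the zone's `calloc` member can point to. */
void *toy_default_zone_calloc(toy_zone_t *zone, size_t num_items, size_t size)
{
    (void)zone;                      /* unused in this sketch */
    return calloc(num_items, size);  /* the C library calloc, not the struct member */
}

/* Wrapper in the style of malloc_zone_calloc: dispatches through the pointer. */
void *toy_zone_calloc(toy_zone_t *zone, size_t num_items, size_t size)
{
    return zone->calloc(zone, num_items, size);
}
```

A zone initialized as `toy_zone_t zone = { toy_default_zone_calloc, "toy" };` makes `toy_zone_calloc(&zone, 1, 40)` land in `toy_default_zone_calloc`, which is the kind of target that printing the pointer in the debugger reveals.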

### 3. The `default_zone_calloc` function

```c
static void *
default_zone_calloc(malloc_zone_t *zone, size_t num_items, size_t size)
{
    zone = runtime_default_zone();

    return zone->calloc(zone, num_items, size);
}
```

Continue with `p zone->calloc` in LLDB to find the next function in the chain, `nano_calloc`.

### 4. The `nano_calloc` function

```c
static void *
nano_calloc(nanozone_t *nanozone, size_t num_items, size_t size)
{
    size_t total_bytes;

    if (calloc_get_size(num_items, size, 0, &total_bytes)) {
        return NULL;
    }

    if (total_bytes <= NANO_MAX_SIZE) {
        void *p = _nano_malloc_check_clear(nanozone, total_bytes, 1);
        if (p) {
            return p;
        } else {
            /* FALLTHROUGH to helper zone */
        }
    }
    malloc_zone_t *zone = (malloc_zone_t *)(nanozone->helper_zone);
    return zone->calloc(zone, 1, total_bytes);
}
```
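The first step in `nano_calloc` turns `num_items` and `size` into a single byte count and bails out if that fails. The sketch below shows one way such a check can be written; `checked_total_bytes` is a hypothetical stand-in, not the real `calloc_get_size`, and it ignores the extra-size argument that the real call passes as 0.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the size check at the top of nano_calloc.
 * Returns 0 and stores num_items * size in *total on success,
 * or returns -1 if the multiplication would overflow size_t.
 */
int checked_total_bytes(size_t num_items, size_t size, size_t *total)
{
    if (size != 0 && num_items > SIZE_MAX / size) {
        return -1;  /* overflow: the caller should return NULL */
    }
    *total = num_items * size;
    return 0;
}
```

For the call explored in this article, `checked_total_bytes(1, 40, &total)` succeeds with `total` set to 40, which is the value then handed to the size-rounding code.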

### 5. The `_nano_malloc_check_clear` function

This is a long function, so let’s look only at the parts that matter here. The goal is to find out where 48 comes from, so look for anything that computes a size in bytes.

```c
static void *
_nano_malloc_check_clear(nanozone_t *nanozone, size_t size, boolean_t cleared_requested)
{
    MALLOC_TRACE(TRACE_nano_malloc, (uintptr_t)nanozone, size, cleared_requested, 0);

    void *ptr;
    size_t slot_key;
    // Key line: the slot size in bytes is computed here.
    size_t slot_bytes = segregated_size_to_fit(nanozone, size, &slot_key); // Note slot_key is set here
    mag_index_t mag_index = nano_mag_index(nanozone);

    nano_meta_admin_t pMeta = &(nanozone->meta_data[mag_index][slot_key]);

    ptr = OSAtomicDequeue(&(pMeta->slot_LIFO), offsetof(struct chained_block_s, next));
    if (ptr) {
        // ... (omitted in this excerpt)
    } else {
        ptr = segregated_next_block(nanozone, pMeta, slot_bytes, mag_index);
    }

    if (cleared_requested && ptr) {
        memset(ptr, 0, slot_bytes); // TODO: Needs a memory barrier after memset to ensure zeroes land first?
    }
    return ptr;
}
```

### 6. The `segregated_size_to_fit` function

```c
// Defined elsewhere in libmalloc:
#define SHIFT_NANO_QUANTUM 4
#define NANO_REGIME_QUANTA_SIZE (1 << SHIFT_NANO_QUANTUM) // 1 shifted left by 4 bits = 16

static MALLOC_INLINE size_t
segregated_size_to_fit(nanozone_t *nanozone, size_t size, size_t *pKey)
{
    size_t k, slot_bytes;

    if (0 == size) {
        size = NANO_REGIME_QUANTA_SIZE; // Historical behavior
    }
    k = (size + NANO_REGIME_QUANTA_SIZE - 1) >> SHIFT_NANO_QUANTUM; // round up and shift for number of quanta
    slot_bytes = k << SHIFT_NANO_QUANTUM;                           // multiply by power of two quanta size
    *pKey = k - 1;                                                  // Zero-based!

    return slot_bytes;
}
```
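The rounding logic can be tried outside libmalloc with a standalone copy. `toy_size_to_fit` and the `TOY_` macros are hypothetical names; the sketch drops the `nanozone` parameter, which the rounding does not use, and keeps the same arithmetic.

```c
#include <stddef.h>

#define TOY_SHIFT_QUANTUM 4                       /* mirrors SHIFT_NANO_QUANTUM */
#define TOY_QUANTA_SIZE (1 << TOY_SHIFT_QUANTUM)  /* 16 */

/* Standalone copy of the rounding logic, without the nanozone parameter. */
size_t toy_size_to_fit(size_t size, size_t *key)
{
    if (size == 0) {
        size = TOY_QUANTA_SIZE;
    }
    size_t k = (size + TOY_QUANTA_SIZE - 1) >> TOY_SHIFT_QUANTUM;  /* number of 16-byte quanta */
    *key = k - 1;                                                  /* zero-based slot index */
    return k << TOY_SHIFT_QUANTUM;                                 /* bytes actually reserved */
}
```

A request for 40 bytes needs three 16-byte quanta, so the helper returns 48 and sets the key to 2; a request for 0 bytes is treated as 16.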

Let’s look at the alignment algorithm here:

```c
// SHIFT_NANO_QUANTUM = 4, so NANO_REGIME_QUANTA_SIZE = 1 << 4 = 16.
// If size is 0, allocate 16 bytes.
if (0 == size) {
    size = NANO_REGIME_QUANTA_SIZE; // Historical behavior
}
// k = (size + 16 - 1) >> 4. For size = 40: 40 + 15 = 55.
k = (size + NANO_REGIME_QUANTA_SIZE - 1) >> SHIFT_NANO_QUANTUM;
slot_bytes = k << SHIFT_NANO_QUANTUM;

// 55       = 0011 0111
// 55 >> 4  = 0000 0011
// 3  << 4  = 0011 0000 = 2^5 + 2^4 = 32 + 16 = 48
//
// Shifting right by 4 and then left by 4 clears the low 4 bits: 16-byte alignment.
// The object side uses (x + WORD_MASK) & ~WORD_MASK, e.g. (8 + 7) & ~7 = 8,
// which is equivalent to shifting right by 3 and then left by 3: 8-byte alignment,
// because 2^4 = 16 and 2^3 = 8.
```

# Third, summary

• An object’s properties are 8-byte aligned (with a minimum instance size of 16 bytes)
• The memory the system allocates for the object itself is 16-byte aligned
• Because memory is contiguous, 16-byte alignment leaves a safety margin between objects, which reduces the risk of problems such as out-of-bounds access and wild pointers
• It also makes addressing more efficient; in other words, it trades space for time
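The first two points can be put together as a two-step sketch: the object side rounds to 8 bytes with a 16-byte floor, then the allocator rounds the request to 16 bytes. `toy_instance_size` and `toy_allocated_size` are hypothetical helpers that only mirror the arithmetic discussed in this article, not the actual runtime or libmalloc functions.

```c
#include <stddef.h>

/* Step 1 (object side): 8-byte align the raw member size, with a 16-byte floor. */
size_t toy_instance_size(size_t raw_member_bytes)
{
    size_t size = (raw_member_bytes + 7) & ~(size_t)7;
    return size < 16 ? 16 : size;
}

/* Step 2 (allocator side): round the request up to a 16-byte slot. */
size_t toy_allocated_size(size_t requested)
{
    return (requested + 15) & ~(size_t)15;
}
```

For the `Person` object, members totaling 36 to 40 bytes give an instance size of 40, and `toy_allocated_size(40)` gives the 48 that `malloc_size` reported.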

This settles the questions about the memory alignment principle and `calloc` left over from the previous article. The next article explores `isa`.