The essence of isa

It helps to understand what isa really is before studying the Runtime; the Runtime itself will then be much easier to follow.

Before __arm64__, isa was just a pointer that held the memory address of a class object or a metaclass object. Starting with __arm64__, Apple optimized isa into a union and used bit fields to store more information in it.

We know that the isa pointer of an OC object does not point directly to the class object or metaclass object; the address has to be obtained with a bitwise operation, & ISA_MASK. Today we'll find out why & ISA_MASK is needed to get the address of a class object or metaclass object, and why it works.

First, find the isa pointer in the source code and look at what it really is.

struct objc_object {
private:
    isa_t isa;
};

The isa pointer is actually an isa_t union, so let's look inside isa_t to see its structure.

// isa_t (truncated)
union isa_t {
    isa_t() { }
    isa_t(uintptr_t value) : bits(value) { }

    Class cls;
    uintptr_t bits;

#if SUPPORT_PACKED_ISA
# if __arm64__      
# define ISA_MASK 0x0000000ffffffff8ULL
# define ISA_MAGIC_MASK 0x000003f000000001ULL
# define ISA_MAGIC_VALUE 0x000001a000000001ULL
    struct {
        uintptr_t nonpointer        : 1;
        uintptr_t has_assoc         : 1;
        uintptr_t has_cxx_dtor      : 1;
        uintptr_t shiftcls          : 33; // MACH_VM_MAX_ADDRESS 0x1000000000
        uintptr_t magic             : 6;
        uintptr_t weakly_referenced : 1;
        uintptr_t deallocating      : 1;
        uintptr_t has_sidetable_rc  : 1;
        uintptr_t extra_rc          : 19;
    # define RC_ONE (1ULL<<45)
    # define RC_HALF (1ULL<<18)
    };

# elif __x86_64__     
# define ISA_MASK 0x00007ffffffffff8ULL
# define ISA_MAGIC_MASK 0x001f800000000001ULL
# define ISA_MAGIC_VALUE 0x001d800000000001ULL
    struct {
        uintptr_t nonpointer        : 1;
        uintptr_t has_assoc         : 1;
        uintptr_t has_cxx_dtor      : 1;
        uintptr_t shiftcls          : 44; // MACH_VM_MAX_ADDRESS 0x7fffffe00000
        uintptr_t magic             : 6;
        uintptr_t weakly_referenced : 1;
        uintptr_t deallocating      : 1;
        uintptr_t has_sidetable_rc  : 1;
        uintptr_t extra_rc          : 8;
# define RC_ONE (1ULL<<56)
# define RC_HALF (1ULL<<7)
    };

# else
# error unknown architecture for packed isa
# endif
#endif
};

In the source code above, isa_t is a union, and the union contains a struct. The fields defined inside the struct are bit fields: the number after each field is the number of bits (not bytes) that the field occupies.

Union: in C, some algorithms need to store several variables of different types in the same memory unit. That is, the variables overlay each other in memory. A type whose members all occupy the same piece of memory is called a union.
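To make the overlay concrete, here is a minimal C sketch (plain C also compiles as Objective-C). The union `Overlay` and the helper `overlay_low_byte` are hypothetical names used only for illustration, and the byte order described assumes a little-endian machine such as x86_64 or arm64.

```c
#include <stdint.h>

// Hypothetical union: both members start at the same address,
// so the union is only as large as its largest member.
union Overlay {
    uint32_t word;      // 4 bytes
    uint8_t  bytes[4];  // the same 4 bytes, viewed one at a time
};

// Store a 32-bit value through `word`, then read the first byte in memory.
// On a little-endian machine this is the least significant byte.
static uint8_t overlay_low_byte(uint32_t value) {
    union Overlay o;
    o.word = value;
    return o.bytes[0];
}
```

Writing through one member and reading through the other is exactly the "overlay" the definition describes: no copy is made, the two members are two views of the same storage.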

Next, we will build our own union step by step to understand why Apple uses one here and what the benefits are.

Exploration process

Let's mimic the underlying approach in code. Start with a Person class that has three BOOL properties.

@interface Person : NSObject
@property (nonatomic, assign, getter = isTall) BOOL tall;
@property (nonatomic, assign, getter = isRich) BOOL rich;
@property (nonatomic, assign, getter = isHandsome) BOOL handsome;
@end

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        NSLog(@"%zd", class_getInstanceSize([Person class]));
    }
    return 0;
}
// Printed output:
// Runtime-union exploration[52235:3160607] 16

In the code above, Person has three BOOL properties, and the printed instance size is 16 bytes. That is (isa pointer = 8) + (BOOL tall = 1) + (BOOL rich = 1) + (BOOL handsome = 1) = 11 bytes, which memory alignment rounds up to 16.
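The rounding can be sketched in C. The helper `align_up` and the struct `PersonLayout` are hypothetical stand-ins for what the runtime and compiler do; the sketch assumes the instance size is rounded up to a multiple of the pointer size, which is consistent with the 16 printed above.

```c
#include <stddef.h>

// Round `size` up to the next multiple of `alignment` (a power of two).
static size_t align_up(size_t size, size_t alignment) {
    return (size + alignment - 1) & ~(alignment - 1);
}

// Hypothetical C layout of a Person instance:
// the isa pointer followed by three one-byte flags.
struct PersonLayout {
    void *isa;
    signed char tall;
    signed char rich;
    signed char handsome;
};
```

On a 64-bit machine the members need 8 + 3 = 11 bytes, and `align_up(11, 8)` gives 16, the same value `sizeof(struct PersonLayout)` reports.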

As mentioned above, the members of a union overlay each other: several different variables are stored in the same memory unit, which can save a great deal of memory.

A BOOL has only two possible values, 0 or 1, yet it occupies a whole byte, and a byte has eight bits, each of which is also either 0 or 1. So could one bit represent one BOOL, so that three BOOL values need only three bits, all inside a single byte? How would this be implemented?

First of all, with this approach we have to write the method declarations and implementations ourselves. We can't declare properties, because the compiler automatically adds a member variable for every property.

To store three BOOL values in one byte, we can add a member variable of type char, which occupies one byte, that is, eight bits, and use its last three bits to store the three BOOL values.

@interface Person()
{
    char _tallRichHandsome;
}
@end

For example, if the value of _tallRichHandsome is 0b 0000 0010, only the last three of its eight bits are used. From left to right, these three bits hold the values of tall, rich, and handsome, so here tall is 0, rich is 1, and handsome is 0.

So the question is: how do we read one of the eight bits, or assign a value to one of them?

Get the value

To extract rich's value from a char whose binary value is 0b 0000 0010, use & (bitwise AND).

& : bitwise AND. A result bit is 1 only if both operand bits are 1; otherwise it is 0.

// Extract the third-to-last bit (tall)
  0000 0010
& 0000 0100
------------
  0000 0000   // tall is 0

// Extract the second-to-last bit (rich)
  0000 0010
& 0000 0010
------------
  0000 0010   // the second-to-last bit is 1; all other bits are set to 0

Bitwise AND can therefore extract specific bits: build a value with 1 in the positions you want and 0 everywhere else, then AND it with the original data.
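The two worked examples above can be checked with a few lines of C. The macro names and the helper `extract_bits` are hypothetical; hexadecimal literals are used because binary literals are not available in every C dialect.

```c
#include <stdint.h>

#define TALL_MASK     0x04u  // 0b00000100
#define RICH_MASK     0x02u  // 0b00000010
#define HANDSOME_MASK 0x01u  // 0b00000001

// Keep only the bits selected by `mask`; every other bit becomes 0.
static uint8_t extract_bits(uint8_t flags, uint8_t mask) {
    return (uint8_t)(flags & mask);
}
```

With `flags` equal to 0x02, `extract_bits(flags, TALL_MASK)` is 0x00 and `extract_bits(flags, RICH_MASK)` is 0x02, matching the columns above.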

Then you can write the get method as follows

#define TallMask 0b00000100 // 4
#define RichMask 0b00000010 // 2
#define HandsomeMask 0b00000001 // 1

- (BOOL)tall
{
    return !!(_tallRichHandsome & TallMask);
}
- (BOOL)rich
{
    return !!(_tallRichHandsome & RichMask);
}
- (BOOL)handsome
{
    return !!(_tallRichHandsome & HandsomeMask);
}

The code above uses !! (two logical NOT operators) to convert the result to a BOOL. Take the same example as before:

// Extract the second-to-last bit (rich)
  0000 0010   // _tallRichHandsome
& 0000 0010   // RichMask
------------
  0000 0010   // rich's bit is 1; all other bits are set to 0

The value of (_tallRichHandsome & RichMask) is 0000 0010 (2), but we need a BOOL value of 0 or 1. !!2 first turns 2 into 0 and then turns 0 into 1. Conversely, if the bitwise AND yields 0, !!0 first turns 0 into 1 and then into 0. Two NOT operations therefore normalize any result to 0 or 1.
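As a small illustration of the normalization step, the C function below (the name `flag_is_set` is hypothetical) returns exactly 0 or 1 no matter which bit the mask selects.

```c
#include <stdint.h>

// The bitwise AND yields either 0 or the mask value itself (2, 4, 128, ...).
// The first ! maps that to 1 or 0, the second ! flips it back to 0 or 1.
static int flag_is_set(uint8_t flags, uint8_t mask) {
    return !!(flags & mask);
}
```

Without the double NOT, a caller comparing the result against 1 (or YES) would get the wrong answer for every mask other than the lowest bit.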

Mask: the three macros defined in the code above are used in bitwise AND operations to extract the corresponding bits. A value used for a bitwise AND (&) like this is generally called a mask.

To make it clear which bit each mask targets, the three macros can be rewritten with << (left shift).

<< : left shift. 1<<n moves the 1 bit n positions to the left; for example, 1<<2 is 0b00000100.

The macro definitions then become:

#define TallMask (1<<2) // 0b00000100 4
#define RichMask (1<<1) // 0b00000010 2
#define HandsomeMask (1<<0) // 0b00000001 1

Set the value

Setting a value means setting one bit to 0 or 1, which can be done with the | operator. | : bitwise OR. A result bit is 1 if either operand bit is 1; otherwise it is 0.

To set a bit to 1, OR the original value with the mask. For example, to set tall to 1:

// Set the third-to-last bit (tall) to 1
  0000 0010   // _tallRichHandsome
| 0000 0100   // TallMask
------------
  0000 0110   // tall is now 1; the other bits are unchanged

To set a bit to 0, first invert the mask bit by bit (~ : bitwise NOT), then AND the result with the original value.

// Set the second-to-last bit (rich) to 0
  0000 0010   // _tallRichHandsome
& 1111 1101   // ~RichMask
------------
  0000 0000   // rich is now 0; all other bits are unchanged
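Both operations fit in a short C sketch. The helper names `set_bits`, `clear_bits`, and `assign_bits` are hypothetical and exist only to mirror the two worked examples above.

```c
#include <stdint.h>

// Turn on every bit selected by `mask`, leaving the others unchanged.
static uint8_t set_bits(uint8_t flags, uint8_t mask) {
    return (uint8_t)(flags | mask);
}

// Turn off every bit selected by `mask`, leaving the others unchanged.
static uint8_t clear_bits(uint8_t flags, uint8_t mask) {
    return (uint8_t)(flags & (uint8_t)~mask);
}

// Set or clear depending on `on`, which is what a BOOL setter needs.
static uint8_t assign_bits(uint8_t flags, uint8_t mask, int on) {
    return on ? set_bits(flags, mask) : clear_bits(flags, mask);
}
```

`set_bits(0x02, 0x04)` gives 0x06 and `clear_bits(0x02, 0x02)` gives 0x00, the same results as the two columns above.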

The internal implementation of the set method is as follows

- (void)setTall:(BOOL)tall
{
    if (tall) {
        // To set the bit to 1: bitwise OR with the mask
        _tallRichHandsome |= TallMask;
    } else {
        // To set the bit to 0: bitwise AND with the inverted mask
        _tallRichHandsome &= ~TallMask;
    }
}
- (void)setRich:(BOOL)rich
{
    if (rich) {
        _tallRichHandsome |= RichMask;
    } else {
        _tallRichHandsome &= ~RichMask;
    }
}
- (void)setHandsome:(BOOL)handsome
{
    if (handsome) {
        _tallRichHandsome |= HandsomeMask;
    } else {
        _tallRichHandsome &= ~HandsomeMask;
    }
}

With the setters and getters written, check that values can be set and read back correctly.

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        Person *person  = [[Person alloc] init];
        person.tall = YES;
        person.rich = NO;
        person.handsome = YES;
        NSLog(@"tall : %d, rich : %d, handsome : %d", person.tall,person.rich,person.handsome);
    }
    return 0;
}

Printed output:

Runtime-union exploration[58212:3857728] tall : 1, rich : 0, handsome : 1

The code above assigns and reads values correctly, but it still has limitations: adding a new flag means repeating all of this work, and the code is not very readable. Next, we optimize it with struct bit fields.

Bit fields

Rewriting the code with struct bit fields makes it more readable. A bit field is declared as: type name : width;

Note the following three points when using bit fields:
1. If the current storage unit does not have enough room left for the next bit field, that field is stored starting from the next unit. You can also deliberately make a field start at the next unit.
2. The width of a bit field cannot exceed the width of its declared type. For example, an int bit field cannot be wider than 32 bits.
3. A bit field may be unnamed, in which case it only serves as padding or to adjust position. Unnamed bit fields cannot be accessed.
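The three rules can be seen in a small C sketch. The struct names `Flags` and `Padded` are hypothetical, and the sizes mentioned below are what GCC and Clang produce on common targets; the C standard leaves the exact layout of bit fields to the implementation.

```c
// Three 1-bit fields share a single byte.
struct Flags {
    unsigned char handsome : 1;
    unsigned char rich     : 1;
    unsigned char tall     : 1;
};

// Unnamed fields: a nonzero width is pure padding, and a width of 0
// forces the next field to start in a new storage unit.
struct Padded {
    unsigned char a : 2;
    unsigned char   : 3;  // padding only, cannot be accessed
    unsigned char b : 2;
    unsigned char   : 0;  // the next field starts in a new byte
    unsigned char c : 1;
};
```

With those compilers `sizeof(struct Flags)` is 1 and `sizeof(struct Padded)` is 2, because `c` is pushed into a second byte.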

The code above, rewritten with struct bit fields:

@interface Person()
{
    struct {
        char handsome : 1;
        char rich : 1;
        char tall : 1;
    }_tallRichHandsome;
}
@end

The setters and getters can then assign and read values directly through the struct.

- (void)setTall:(BOOL)tall
{
    _tallRichHandsome.tall = tall;
}
- (void)setRich:(BOOL)rich
{
    _tallRichHandsome.rich = rich;
}
- (void)setHandsome:(BOOL)handsome
{
    _tallRichHandsome.handsome = handsome;
}
- (BOOL)tall
{
    return _tallRichHandsome.tall;
}
- (BOOL)rich
{
    return _tallRichHandsome.rich;
}
- (BOOL)handsome
{
    return _tallRichHandsome.handsome;
}

Verify in code that values can be assigned and read correctly.

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        Person *person  = [[Person alloc] init];
        person.tall = YES;
        person.rich = NO;
        person.handsome = YES;
        NSLog(@"tall : %d, rich : %d, handsome : %d", person.tall,person.rich,person.handsome);
    }
    return 0;
}

First, set a breakpoint at the NSLog line and inspect the value stored in _tallRichHandsome.

_tallRichHandsome occupies one byte, that is, eight bits. The debugger shows the stored byte as 0x05, which is 0b 0000 0101 in binary.

The third-to-last bit (tall) is 1, the second-to-last bit (rich) is 0, and the last bit (handsome) is 1, which matches the values set in the code above. The assignment succeeded.

Now look at the printed output:

Runtime-union exploration[59366:4053478] tall : -1, rich : 0, handsome : -1

We set tall and handsome to YES, so the output should be 1. Why is it -1?

The stored byte already confirmed that tall and handsome are 1. To be sure, print the fields of the _tallRichHandsome struct individually as well.

The debugger shows handsome as 0x01, which is 1 in binary.

The stored value really is 1, so why is it printed as -1? The problem must be inside the getter. Set a breakpoint in the getter and print the value it reads.

- (BOOL)handsome
{
    BOOL ret = _tallRichHandsome.handsome;
    return ret;
}

Print the value of ret.

ret is 255, that is, 1111 1111, which explains the printed -1. The handsome value read from the struct is 0b1 and occupies only 1 bit, but a BOOL occupies one byte, that is, 8 bits. When the 1-bit value is widened to 8 bits, the new high bits are filled with copies of the field's top bit. Because the field is declared as a char, which is signed here, its single bit is also its sign bit, so every added bit becomes 1 and ret ends up as 0b 1111 1111.

As one byte, 1111 1111 is -1 when read as a signed number and 255 when read as an unsigned number, so the value printed is -1.
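The two readings of the same byte can be reproduced in C. The helper names `as_signed` and `as_unsigned` are hypothetical, and the sketch assumes a two's-complement machine, which covers every platform the article discusses.

```c
#include <stdint.h>

// Interpret the byte as a signed 8-bit value, then widen it to int.
static int as_signed(uint8_t byte) {
    return (int8_t)byte;
}

// Interpret the byte as an unsigned 8-bit value, then widen it to int.
static int as_unsigned(uint8_t byte) {
    return byte;
}
```

For the byte 0xFF, `as_signed` returns -1 and `as_unsigned` returns 255. The bits are identical; only the interpretation differs.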

To verify that widening fills the new bits from the field's top bit, give tall, rich, and handsome two bits each.

@interface Person()
{
    struct {
        char tall : 2;
        char rich : 2;
        char handsome : 2;
    }_tallRichHandsome;
}
@end

Now the values print normally:

Runtime-union exploration[60827:4259630] tall : 1, rich : 0, handsome : 1

This is because _tallRichHandsome.handsome, as read inside the getter, is now the two-bit value 0b 01. Its top bit is 0, so when it is assigned to an 8-bit BOOL the added high bits are filled with 0. The return value is 0b 0000 0001, and the printed value is 1.
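The difference between the one-bit and two-bit cases can be reproduced in plain C. The struct and function names below are hypothetical, and the fields are declared `signed int` so the behavior does not depend on whether plain `char` is signed. Storing 1 into a 1-bit signed field is implementation-defined in the C standard; GCC and Clang keep the low bit, which is the behavior the article observed.

```c
struct OneBit  { signed int flag : 1; };  // the only bit is the sign bit: holds 0 or -1
struct TwoBits { signed int flag : 2; };  // sign bit plus one value bit: holds -2..1

// Store `v` in a 1-bit signed field and read it back.
static int read_one_bit(int v) {
    struct OneBit s;
    s.flag = v;
    return s.flag;
}

// Store `v` in a 2-bit signed field and read it back.
static int read_two_bits(int v) {
    struct TwoBits s;
    s.flag = v;
    return s.flag;
}

// The !! fix: normalize whatever the 1-bit field returns to 0 or 1.
static int read_normalized(int v) {
    struct OneBit s;
    s.flag = v;
    return !!s.flag;
}
```

With those compilers `read_one_bit(1)` is -1, `read_two_bits(1)` is 1, and `read_normalized(1)` is 1. Declaring the field as unsigned would be another way to avoid the sign extension.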

The problem above can therefore also be solved with !! (double logical NOT). How !! works was explained earlier and is not repeated here.

The optimized code using struct bit fields:

@interface Person()
{
    struct {
        char tall : 1;
        char rich : 1;
        char handsome : 1;
    }_tallRichHandsome;
}
@end

@implementation Person

- (void)setTall:(BOOL)tall
{
    _tallRichHandsome.tall = tall;
}
- (void)setRich:(BOOL)rich
{
    _tallRichHandsome.rich = rich;
}
- (void)setHandsome:(BOOL)handsome
{
    _tallRichHandsome.handsome = handsome;
}
- (BOOL)tall
{
    return !!_tallRichHandsome.tall;
}
- (BOOL)rich
{
    return !!_tallRichHandsome.rich;
}
- (BOOL)handsome
{
    return !!_tallRichHandsome.handsome;
}
@end

In the code above, the struct bit fields remove the need for masks, which makes the code much more readable, but access is less efficient than using bitwise operations directly. To read and store data efficiently and still keep the code readable, we need a union.

Union

To store data efficiently and keep the code readable, we can use a union: the union improves readability, and bitwise operations keep data access fast.

The code optimized with a union:

#define TallMask (1<<2) // 0b00000100 4
#define RichMask (1<<1) // 0b00000010 2
#define HandsomeMask (1<<0) // 0b00000001 1

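@interface Person()
{
    union {
        char bits;
        // The struct only improves readability; it is never accessed.
        // Fields are listed lowest bit first so that they line up with the masks.
        struct {
            char handsome : 1;
            char rich : 1;
            char tall : 1;
        };
    }_tallRichHandsome;
}
@end

@implementation Person

- (void)setTall:(BOOL)tall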
{
    if (tall) {
        _tallRichHandsome.bits |= TallMask;
    }else{
        _tallRichHandsome.bits &= ~TallMask;
    }
}
- (void)setRich:(BOOL)rich
{
    if (rich) {
        _tallRichHandsome.bits |= RichMask;
    }else{
        _tallRichHandsome.bits &= ~RichMask;
    }
}
- (void)setHandsome:(BOOL)handsome
{
    if (handsome) {
        _tallRichHandsome.bits |= HandsomeMask;
    }else{
        _tallRichHandsome.bits &= ~HandsomeMask;
    }
}
- (BOOL)tall
{
    return !!(_tallRichHandsome.bits & TallMask);
}
- (BOOL)rich
{
    return !!(_tallRichHandsome.bits & RichMask);
}
- (BOOL)handsome
{
    return !!(_tallRichHandsome.bits & HandsomeMask);
}
@end

The code above reads and writes values with bitwise operations, which is the more efficient approach, and stores the data in a union, which makes the code more readable.

The _tallRichHandsome union occupies only one byte. tall, rich, and handsome each take one bit, so the struct fits in one byte, and bits, a char, is also one byte. Since both are members of the same union, they share that single byte.

The struct is not used in the getters or setters. It only improves readability by documenting which values the union stores and how many bits each one occupies. The values themselves are read and written with bitwise operations for efficiency, and the mask determines which bit is accessed.
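The same design can be tried out in plain C. In this sketch the union `PersonFlags`, its member `fields`, and the helpers `flags_set` and `flags_get` are hypothetical names; unsigned fields are used to avoid the sign-extension issue discussed earlier. That the struct fields line up with the masks is an assumption about the compiler: GCC and Clang on little-endian targets allocate bit fields starting from the lowest bit.

```c
#include <stdint.h>

#define TALL_MASK     (1u << 2)
#define RICH_MASK     (1u << 1)
#define HANDSOME_MASK (1u << 0)

union PersonFlags {
    uint8_t bits;              // the value that is actually read and written
    struct {                   // documentation of the layout, lowest bit first
        uint8_t handsome : 1;
        uint8_t rich     : 1;
        uint8_t tall     : 1;
    } fields;
};

// Set or clear the bit selected by `mask`.
static void flags_set(union PersonFlags *f, unsigned mask, int on) {
    if (on) {
        f->bits |= (uint8_t)mask;
    } else {
        f->bits &= (uint8_t)~mask;
    }
}

// Return 1 if the bit selected by `mask` is set, otherwise 0.
static int flags_get(const union PersonFlags *f, unsigned mask) {
    return !!(f->bits & mask);
}
```

After setting tall and handsome, `bits` holds 0x05, the same byte the debugger showed earlier, and the whole union still occupies a single byte.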

The code is now both efficient and readable, so it is time to look back at the source of the isa_t union.

isa_t source code

Here is the isa_t source again:

// isa_t (truncated)
union isa_t {
    isa_t() { }
    isa_t(uintptr_t value) : bits(value) { }

    Class cls;
    uintptr_t bits;

# if __arm64__
# define ISA_MASK 0x0000000ffffffff8ULL
# define ISA_MAGIC_MASK 0x000003f000000001ULL
# define ISA_MAGIC_VALUE 0x000001a000000001ULL
    struct {
        uintptr_t nonpointer        : 1;
        uintptr_t has_assoc         : 1;
        uintptr_t has_cxx_dtor      : 1;
        uintptr_t shiftcls          : 33; // MACH_VM_MAX_ADDRESS 0x1000000000
        uintptr_t magic             : 6;
        uintptr_t weakly_referenced : 1;
        uintptr_t deallocating      : 1;
        uintptr_t has_sidetable_rc  : 1;
        uintptr_t extra_rc          : 19;
# define RC_ONE (1ULL<<45)
# define RC_HALF (1ULL<<18)
    };
#endif
};

After the analysis of bitwise operations, bit fields, and unions above, the source code is now easy to understand. isa_t stores a 64-bit value in a union. The struct documents which bits hold which information, and each piece is extracted by applying a bitwise operation to bits.

shiftcls stores the memory address of the class or metaclass object. As mentioned earlier when discussing the nature of OC objects, the object's isa must be ANDed with ISA_MASK to get the real address of the class object.

So let's revisit the ISA_MASK value 0x0000000ffffffff8ULL and convert it to binary:

0000 0000 0000 0000 0000 0000 0000 1111 1111 1111 1111 1111 1111 1111 1111 1000

In binary, ISA_MASK has exactly 33 bits set to 1. As shown earlier, a bitwise AND extracts the bits selected by the mask, so ANDing with ISA_MASK extracts these 33 bits, which are the address of the class or metaclass object.

The last three bits of ISA_MASK are 0, so the last three bits of any value ANDed with ISA_MASK are also 0. Therefore the last three bits of the address of any class or metaclass object must be 0, and the last hexadecimal digit of the address must be 8 or 0.
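Both observations about ISA_MASK can be checked in C. The names `ISA_MASK_ARM64`, `isa_class_address`, and `count_one_bits` are hypothetical, and the isa value mentioned below is made up for illustration: it combines an invented class address 0x100001128 with the nonpointer bit and the arm64 magic value.

```c
#include <stdint.h>

// The arm64 ISA_MASK from the source above.
#define ISA_MASK_ARM64 0x0000000ffffffff8ULL

// Recover the class (or metaclass) address from the raw isa bits.
static uint64_t isa_class_address(uint64_t isa_bits) {
    return isa_bits & ISA_MASK_ARM64;
}

// Count how many bits of `value` are 1.
static int count_one_bits(uint64_t value) {
    int count = 0;
    while (value) {
        count += (int)(value & 1);
        value >>= 1;
    }
    return count;
}
```

`count_one_bits(ISA_MASK_ARM64)` is 33, and `isa_class_address(0x000001a100001129ULL)` is 0x100001128: the flag bits below and above the address are stripped, and the low three bits of the result are 0.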

Information stored in ISA and what it does

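Take the struct out and annotate what each piece of information does.

struct {
    // 0: a plain pointer that stores the address of the class or metaclass object.
    // 1: an optimized isa that uses bit fields to store more information.
    uintptr_t nonpointer        : 1;
    // Whether an associated object has ever been set. If not, the object is released faster.
    uintptr_t has_assoc         : 1;
    // Whether there is a C++ destructor. If not, the object is released faster.
    uintptr_t has_cxx_dtor      : 1;
    // The memory address of the class or metaclass object.
    uintptr_t shiftcls          : 33;
    // Used during debugging to tell whether the object has finished initializing.
    uintptr_t magic             : 6;
    // Whether the object has ever been pointed to by a weak reference.
    uintptr_t weakly_referenced : 1;
    // Whether the object is being deallocated.
    uintptr_t deallocating      : 1;
    // Whether the reference count is too large to fit in isa.
    // If 1, the overflow is stored in a side table.
    uintptr_t has_sidetable_rc  : 1;
    // The reference count minus one.
    uintptr_t extra_rc          : 19;
};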

Verification

Use the following code to verify the position and purpose of each piece of information.

- (void)viewDidLoad {
    [super viewDidLoad];
    Person *person = [[Person alloc] init];
    NSLog(@"%p", [person class]);
    NSLog(@"%@", person);
}

First print the address of the Person class object, then print the isa of the person instance at a breakpoint.

Take a look at the printed values first.

Convert the class object address to binary.

Convert the isa of person to binary.

shiftcls : comparing the two binary values shows that the 33 bits holding the class object address are exactly the same in both.

extra_rc : these 19 bits store the reference count minus one. The reference count of person is 1 at this point, so the 19 bits of extra_rc hold 0.

magic : these 6 bits are used during debugging to tell whether the object has finished initializing. In the code above person has been initialized, and the 6 bits hold 011010, which is exactly the magic value in the macro # define ISA_MAGIC_VALUE 0x000001a000000001ULL defined in the union.

nonpointer : this is certainly an optimized isa, so nonpointer must be 1.
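The manual binary comparison can also be done in code. The sketch below decodes a raw 64-bit isa value using the arm64 field widths from the struct above (1, 1, 1, 33, 6, 1, 1, 1, 19 bits, lowest bit first). The names `IsaFields`, `take_bits`, and `decode_isa_arm64` are hypothetical, and the isa value used in the note after the code is invented, not taken from a real process.

```c
#include <stdint.h>

// Decoded view of an arm64 packed isa.
struct IsaFields {
    unsigned nonpointer;
    unsigned has_assoc;
    unsigned has_cxx_dtor;
    uint64_t shiftcls;
    unsigned magic;
    unsigned weakly_referenced;
    unsigned deallocating;
    unsigned has_sidetable_rc;
    unsigned extra_rc;
};

// Extract `width` bits of `value`, starting at bit `offset`.
static uint64_t take_bits(uint64_t value, unsigned offset, unsigned width) {
    return (value >> offset) & ((1ULL << width) - 1);
}

static struct IsaFields decode_isa_arm64(uint64_t bits) {
    struct IsaFields f;
    f.nonpointer        = (unsigned)take_bits(bits, 0, 1);
    f.has_assoc         = (unsigned)take_bits(bits, 1, 1);
    f.has_cxx_dtor      = (unsigned)take_bits(bits, 2, 1);
    f.shiftcls          = take_bits(bits, 3, 33);
    f.magic             = (unsigned)take_bits(bits, 36, 6);
    f.weakly_referenced = (unsigned)take_bits(bits, 42, 1);
    f.deallocating      = (unsigned)take_bits(bits, 43, 1);
    f.has_sidetable_rc  = (unsigned)take_bits(bits, 44, 1);
    f.extra_rc          = (unsigned)take_bits(bits, 45, 19);
    return f;
}
```

Decoding the invented value 0x000001a100001129 gives nonpointer 1, magic 0x1a (011010 in binary), extra_rc 0, and a shiftcls that equals 0x100001128 shifted right by 3. The offset of extra_rc, 45, agrees with RC_ONE (1ULL<<45) in the source.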

The person object has no associated objects and has never been weakly referenced, so has_assoc and weakly_referenced are both 0. Next, add a weak reference and an associated object to the person object and watch how has_assoc and weakly_referenced change.

- (void)viewDidLoad {
    [super viewDidLoad];
    Person *person = [[Person alloc] init];
    NSLog(@"%p", [person class]);
    // Add a weak reference to person
    __weak Person *weakPerson = person;
    // Add an associated object to person
    objc_setAssociatedObject(person, @"name", @"xx_cc", OBJC_ASSOCIATION_RETAIN_NONATOMIC);
    NSLog(@"%@", person);
}

Printing the isa of person again and converting it to binary shows that has_assoc and weakly_referenced have both become 1.

Note: once an associated object has been set or a weak reference has pointed to the object, has_assoc and weakly_referenced become 1 and stay 1, regardless of whether the associated object is later set to nil or the weak reference is broken.

An object that has never had an associated object set is released faster, because destruction checks whether associated objects exist and, if so, has to remove them. Take a look at the source code for object destruction:

void *objc_destructInstance(id obj)
{
    if (obj) {
        Class isa = obj->getIsa();

        // Is there a C++ destructor?
        if (isa->hasCxxDtor()) {
            object_cxxDestruct(obj);
        }

        // If there are associated objects, remove them
        if (isa->instancesHaveAssociatedObjects()) {
            _object_remove_assocations(obj);
        }

        objc_clear_deallocating(obj);
    }
    return obj;
}

On the __arm64__ architecture, isa has been optimized to store more than just the address of the class or metaclass object: a union lets it hold additional information as well. shiftcls stores the address of the class or metaclass object, and a bitwise & with ISA_MASK is required to extract it.

Underlying principles article column



Corrections to any mistakes in the article are welcome. I am XX_CC, someone who grew up long ago, but not enough.