I. Union

1. The concept

A union is a special data type whose purpose is to save memory. A union can declare members of several data types, but only one member holds a valid value at any given time, because all members share the same block of memory. The size of a union is at least the size of its largest member.

2. Mutually exclusive assignment/shared memory

Any member of the union can be assigned, but only one member holds a meaningful value at a time: each assignment overwrites the bytes stored by the previous one.

For example:

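#include <stdio.h>
#include <string.h>

union Test {
    char name;
    int age;
    long height;
};

// Print every member; only the most recently assigned one is meaningful.
void printUnion(union Test t) {
    printf("%c\n", t.name);
    printf("%d\n", t.age);
    printf("%ld\n", t.height);
    printf("---------\n");
}

int main(int argc, const char * argv[]) {
    union Test t;
    // Zero all bytes first so that bytes not covered by a smaller member are defined.
    memset(&t, 0, sizeof(t));

    t.name = 'a';
    printUnion(t);

    t.age = 200;
    printUnion(t);

    t.height = 10000;
    printUnion(t);

    return 0;
}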

Output result:

a
97
97
---------
\310
200
200
---------

10000
10000
---------

3. Memory length occupied by the union

Here are the rules:

  • The length of memory occupied by the union variable is equal to that of the longest member;

Of course, the size of a union is also subject to memory alignment rules, which differ between architectures, so I won’t go into details here.
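The size rule can be checked with a small sketch. The union `Sample` and both helper functions are hypothetical names introduced only for this illustration; the sketch assumes nothing beyond standard C.

```c
#include <stddef.h>

/* Hypothetical union used only for this illustration. */
union Sample {
    char name;
    int age;
    long height;
};

/* Size of the largest member of union Sample. */
size_t largest_member_size(void) {
    size_t max = sizeof(char);
    if (sizeof(int) > max)  max = sizeof(int);
    if (sizeof(long) > max) max = sizeof(long);
    return max;
}

/* Returns 1 when every member starts at the same address as the union itself. */
int members_share_address(void) {
    union Sample s;
    return (void *)&s.name == (void *)&s
        && (void *)&s.age == (void *)&s
        && (void *)&s.height == (void *)&s;
}
```

Comparing `sizeof(union Sample)` with `largest_member_size()` shows that the union is never smaller than its largest member; alignment padding can only make it larger.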

In addition, unions are often used together with bit-fields. In short, a bit-field declares that a member occupies a fixed number of bits instead of the full size of its type. It has the following characteristics:

  1. The width of a bit-field cannot exceed the width of its declared type;
  2. Bit-field widths are measured in bits rather than bytes;

Here’s an example:

struct {
    int aa;
    long bb;
    long cc;
} s1;

struct {
    int aa : 32;
    long bb : 32;
    long cc : 1;
} s2;

printf("%zu\n", sizeof(s1)); // 24
printf("%zu\n", sizeof(s2)); // 16

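On a 64-bit platform where long is 8 bytes, the members of s1 need 4 + 8 + 8 = 20 bytes. The int is followed by 4 bytes of padding so that the long after it is 8-byte aligned, which gives 24 bytes.

With bit-fields, s2 needs 32 + 32 + 1 = 65 bits, which rounds up to 4 + 4 + 1 = 9 bytes, and is then padded to the 8-byte alignment of long, giving 16 bytes.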
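To show why unions and bit-fields are often combined, here is a minimal sketch in which the same 64 bits can be read either as one integer or as named fields. The union `Flags`, its field names, and the helper functions are all hypothetical. The sketch assumes GCC or Clang on a little-endian target, where the first bit-field occupies the lowest bit; the C standard leaves this layout implementation-defined.

```c
#include <stdint.h>

/* Hypothetical 64-bit flag word, viewed either whole or as named bit-fields. */
union Flags {
    uint64_t bits;
    struct {
        uint64_t ready : 1;   /* bit 0 (assumed LSB-first layout) */
        uint64_t kind  : 3;   /* bits 1-3 */
        uint64_t count : 60;  /* bits 4-63 */
    } fields;
};

/* Fill in the named fields and return the raw 64-bit word. */
uint64_t pack_flags(unsigned ready, unsigned kind, uint64_t count) {
    union Flags f;
    f.bits = 0;
    f.fields.ready = ready;
    f.fields.kind = kind;
    f.fields.count = count;
    return f.bits;
}

/* Read the count field back out of a raw 64-bit word. */
uint64_t unpack_count(uint64_t bits) {
    union Flags f;
    f.bits = bits;
    return f.fields.count;
}
```

Writing through `fields` and reading through `bits` touches the same memory, which is the same idea the isa union relies on.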

II. Walking through the isa source code

First of all, we naturally want to look at isa directly. If we try to access it, we do find an isa member, but the compiler warns against using it.

That is, in earlier versions the isa pointer of an object could be accessed directly, but now it can only be obtained through object_getClass.

The implementation of object_getClass is as follows:

Class object_getClass(id obj)
{
    if (obj) return obj->getIsa();
    else return Nil;
}

Following the call into getIsa(), we find two implementations of getIsa().

There must be a macro that distinguishes the two environments, and the key one is SUPPORT_TAGGED_POINTERS, which brings us to the first topic.

1. Tagged Pointer

Most readers are probably already familiar with Tagged Pointer. In brief:

  1. Tagged Pointer is used for small wrapper types such as NSNumber, NSDate, and NSString.
  2. In the move from 32-bit to 64-bit architectures, pointers to wrapper objects grew from 4 bytes to 8 bytes, which is wasteful for small values. A Tagged Pointer therefore splits the pointer into a flag part and a data part; it does not point to an actual memory address.
  3. The flag part marks the pointer as a Tagged Pointer; the data part holds the actual value.
  4. Because it does not point to an actual memory address, no real wrapper object exists on the heap, so there is no malloc, dealloc, and so on, which makes these operations faster.
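The flag/data split described above can be sketched as follows. The bit layout here (flag in bit 0, a 3-bit class slot, a 60-bit payload) and all names are assumptions made for illustration; the real runtime uses platform-specific layouts that are not reproduced here.

```c
#include <stdint.h>

/* Hypothetical layout for illustration only (not the runtime's real one):
   bit 0      flag, 1 means "this is a tagged pointer"
   bits 1-3   slot index that selects the class
   bits 4-63  payload, the value itself */
#define TAG_FLAG          1ULL
#define TAG_SLOT_SHIFT    1
#define TAG_SLOT_MASK     0x7ULL
#define TAG_PAYLOAD_SHIFT 4

/* Build a tagged value from a class slot and a payload. */
uint64_t make_tagged(unsigned slot, uint64_t payload) {
    return (payload << TAG_PAYLOAD_SHIFT)
         | (((uint64_t)slot & TAG_SLOT_MASK) << TAG_SLOT_SHIFT)
         | TAG_FLAG;
}

/* A real, aligned heap address has bit 0 clear, so the flag tells them apart. */
int is_tagged(uint64_t ptr) {
    return (ptr & TAG_FLAG) != 0;
}

/* Shift-and-mask lookup of the class slot. */
unsigned tagged_slot(uint64_t ptr) {
    return (unsigned)((ptr >> TAG_SLOT_SHIFT) & TAG_SLOT_MASK);
}

/* The value is recovered from the pointer bits themselves, with no memory access. */
uint64_t tagged_payload(uint64_t ptr) {
    return ptr >> TAG_PAYLOAD_SHIFT;
}
```

Under this assumed layout, `make_tagged(3, 42)` carries both the class slot and the number 42 inside the pointer-sized value, so reading the value back never dereferences anything.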


2. SUPPORT_TAGGED_POINTERS

Because a Tagged Pointer does not point to an actual object, the logic of simply returning the object's isa pointer breaks down in objc’s isa system.

For example, for a tagged NSNumber there is no object in memory whose isa could be read, and the Tagged Pointer itself does not point to a class object, so the runtime has to handle Tagged Pointers specially. The SUPPORT_TAGGED_POINTERS macro exists for this purpose: there are two getIsa() implementations, chosen by whether Tagged Pointer is supported or not:

If Tagged Pointer is not supported:

// not SUPPORT_TAGGED_POINTERS

inline Class
objc_object::getIsa() 
{
    return ISA();
}

This case is simple: it just calls objc_object’s ISA() method and returns the class pointer.

If Tagged Pointer is supported:

// SUPPORT_TAGGED_POINTERS

inline Class
objc_object::getIsa()
{
    if (fastpath(!isTaggedPointer())) return ISA();

    extern objc_class OBJC_CLASS_$___NSUnrecognizedTaggedPointer;
    uintptr_t slot, ptr = (uintptr_t)this;
    Class cls;

    slot = (ptr >> _OBJC_TAG_SLOT_SHIFT) & _OBJC_TAG_SLOT_MASK;
    cls = objc_tag_classes[slot];
    if (slowpath(cls == (Class)&OBJC_CLASS_$___NSUnrecognizedTaggedPointer)) {
        slot = (ptr >> _OBJC_TAG_EXT_SLOT_SHIFT) & _OBJC_TAG_EXT_SLOT_MASK;
        cls = objc_tag_ext_classes[slot];
    }
    return cls;
}

In the code above, the first statement checks whether the pointer is a Tagged Pointer. If it is not, ISA() is still called to find the class; if it is, the Tagged Pointer handling logic runs.

The fastpath macro relies on a compiler feature: it tells the compiler that the expression in parentheses is very likely to be true, i.e. that the pointer is usually not a Tagged Pointer. Its opposite is slowpath. With this hint, the compiler can place the most likely code immediately after the branch, which reduces the performance cost of instruction jumps.
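As a rough sketch of how such hints can be written, the macros below wrap the GCC/Clang builtin `__builtin_expect`. This assumes a GCC-compatible compiler; the function `classify` is a hypothetical example and not part of the runtime.

```c
/* Branch-prediction hints built on the GCC/Clang builtin __builtin_expect.
   !!(x) normalizes any non-zero value to 1 before comparing with the expectation. */
#define fastpath(x) (__builtin_expect(!!(x), 1))
#define slowpath(x) (__builtin_expect(!!(x), 0))

/* Hypothetical function: the NULL case is marked unlikely,
   the non-negative case is marked likely. */
int classify(const int *value) {
    if (slowpath(value == 0)) return -1;  /* rare: no value at all */
    if (fastpath(*value >= 0)) return 1;  /* common case */
    return 0;
}
```

The hints never change the result of the condition; they only influence how the compiler lays out the branches.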

The logic leading up to the ISA() method is now clear. Next, let’s look at the ISA() method itself.

3. Distinguishing architectures

Searching for objc_object::ISA likewise turns up two implementations, selected by the SUPPORT_NONPOINTER_ISA macro, which is defined as follows:

#if !SUPPORT_INDEXED_ISA  &&  !SUPPORT_PACKED_ISA
#   define SUPPORT_NONPOINTER_ISA 0
#else
#   define SUPPORT_NONPOINTER_ISA 1
#endif

Leaving SUPPORT_PACKED_ISA aside for now, take a look at the definition of SUPPORT_INDEXED_ISA:

#if __ARM_ARCH_7K__ >= 2  ||  (__arm64__ && !__LP64__)
#   define SUPPORT_INDEXED_ISA 1
#else
#   define SUPPORT_INDEXED_ISA 0
#endif

A global search for __ARM_ARCH_7K__ in the objc source shows that this condition checks whether the architecture is armv7k or arm64_32.

In other words:

  1. arm64_32 is the variant of arm64 with 4-byte pointers, used on watchOS;
  2. armv7k is a 32-bit variant of armv7, also used on watchOS;

Both arm64_32 and armv7k are designed for watchOS, so we can conclude that for iOS devices:

  1. SUPPORT_INDEXED_ISA = 0;
  2. SUPPORT_NONPOINTER_ISA = 1;

With that settled, the code that follows is easier to read.

4. Source code analysis

The ISA() function is coded as follows:

inline Class
objc_object::ISA(bool authenticated)
{
    ASSERT(!isTaggedPointer());
    return isa.getDecodedClass(authenticated);
}

Next, look at isa.getDecodedClass:

inline Class
isa_t::getDecodedClass(bool authenticated) {
#if SUPPORT_INDEXED_ISA
    if (nonpointer) {
        return classForIndex(indexcls);
    }
    return (Class)cls;
#else
    return getClass(authenticated);
#endif
}

Since SUPPORT_INDEXED_ISA = 0, look directly at the getClass() method:

inline Class
isa_t::getClass(MAYBE_UNUSED_AUTHENTICATED_PARAM bool authenticated) {
#if SUPPORT_INDEXED_ISA
    return cls;
#else

    uintptr_t clsbits = bits;

#   if __has_feature(ptrauth_calls)
#       if ISA_SIGNING_AUTH_MODE == ISA_SIGNING_AUTH
    // Most callers aren't security critical, so skip the
    // authentication unless they ask for it. Message sending and
    // cache filling are protected by the auth code in msgSend.
    if (authenticated) {
        // Mask off all bits besides the class pointer and signature.
        clsbits &= ISA_MASK;
        if (clsbits == 0)
            return Nil;
        clsbits = (uintptr_t)ptrauth_auth_data((void *)clsbits, ISA_SIGNING_KEY, ptrauth_blend_discriminator(this, ISA_SIGNING_DISCRIMINATOR));
    } else {
        // If not authenticating, strip using the precomputed class mask.
        clsbits &= objc_debug_isa_class_mask;
    }
#       else
    // If not authenticating, strip using the precomputed class mask.
    clsbits &= objc_debug_isa_class_mask;
#       endif

#   else
    clsbits &= ISA_MASK;
#   endif

    return (Class)clsbits;
#endif
}

This code looks complicated, but it really has only three parts:

  1. SUPPORT_INDEXED_ISA, which is always 0 on iOS;
  2. the __has_feature(ptrauth_calls) logic;
  3. clsbits &= ISA_MASK;

__has_feature is a compiler feature test, and ptrauth_calls refers to Pointer Authentication Codes (PAC), introduced with the A12 chip, i.e. starting with iPhone XS/XS Max/XR.

Put simply, on a 64-bit architecture a pointer does not use all of its bits, so a signature of the pointer is stored in the unused high bits and verified when the pointer is used, which prevents tampering. For more, see the article "Why does the arm64e pointer address have spare bits to support PAC?"; I will not go into details here.

In summary, PAC also affects how the class address is derived from isa. Setting the PAC-related code aside, isa’s addressing logic is clear:

inline Class
isa_t::getClass(MAYBE_UNUSED_AUTHENTICATED_PARAM bool authenticated) {
    uintptr_t clsbits = bits;
    // Keep only the bits that hold the class pointer.
    clsbits &= ISA_MASK;
    return (Class)clsbits;
}

That is, take bits and mask it with ISA_MASK to obtain the real address of the class.
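The masking step can be tried on its own with plain integers. The sketch below uses the mask value this article quotes for arm64 devices before the A12; the function name and the sample isa value are made up for illustration.

```c
#include <stdint.h>

/* Mask value quoted in this article for arm64 devices before the A12. */
#define ISA_MASK 0x0000000ffffffff8ULL

/* Hypothetical helper: strip the flag bits and keep only the class address. */
uint64_t class_address_from_isa(uint64_t isa_bits) {
    return isa_bits & ISA_MASK;
}

/* Build a made-up isa value: a class address plus some flag bits around it. */
uint64_t sample_isa_bits(void) {
    uint64_t class_address = 0x1e4c8f0a0ULL;      /* 8-byte aligned, invented */
    uint64_t magic         = (uint64_t)0x1a << 36; /* bits above the address */
    uint64_t nonpointer    = 1;                    /* bit 0 */
    return magic | class_address | nonpointer;
}
```

Passing `sample_isa_bits()` to `class_address_from_isa()` gives back the invented class address, because the mask clears both the low flag bits and everything above the address.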

5. Result verification

Having reached a conclusion from the source code, let’s verify it.

The test runs on an iPhone 7 with iOS 12, a device that predates the A12 chip, so ISA_MASK is 0x0000000ffffffff8ULL. The test code is as follows:

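#import <Foundation/Foundation.h>
#import <objc/runtime.h>

int main(int argc, char * argv[]) {
    NSObject *obj1 = [NSObject new];
    NSObject *obj2 = [NSObject new];

    // Both objects are instances of NSObject, so both calls return the same class.
    Class cls1 = object_getClass(obj1);
    Class cls2 = object_getClass(obj2);

    NSLog(@"cls1:%@",cls1);
    NSLog(@"cls2:%@",cls2);
    return 0;
}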

The experimental results are as follows:

Let’s look at Obj2 again:

obj2 stores the same isa value as obj1, so there is no need to mask it again.

The x/4xw command used above raises the question of byte order (endianness), which is not repeated here.

isa structure in detail

The isa structure in objc4-818.2:

union isa_t {
    isa_t() { }
    isa_t(uintptr_t value) : bits(value) { }

    uintptr_t bits;

private:
    // Accessing the class requires custom ptrauth operations, so
    // force clients to go through setClass/getClass by making this
    // private.
    Class cls;

public:
#if defined(ISA_BITFIELD)
    struct {
        ISA_BITFIELD;  // defined in isa.h
    };

    bool isDeallocating() {
        return extra_rc == 0 && has_sidetable_rc == 0;
    }
    void setDeallocating() {
        extra_rc = 0;
        has_sidetable_rc = 0;
    }
#endif

    void setClass(Class cls, objc_object *obj);
    Class getClass(bool authenticated);
    Class getDecodedClass(bool authenticated);
};

Note that cls is private in objc4-818.2. Because of ptrauth, cls may not be accessed directly; to operate on the class pointer you must go through setClass/getClass.

Earlier versions did not involve ptrauth, so the isa union was much simpler. Since we are not studying the PAC implementation here, we look directly at the older source, which applies to arm64 devices before the A12 (iPhone XS/XS Max/XR):

union isa_t {
    isa_t() { }
    isa_t(uintptr_t value) : bits(value) { }

    Class cls;
    uintptr_t bits;
#if defined(ISA_BITFIELD)
    struct {
        ISA_BITFIELD;  // defined in isa.h
    };
#endif
};

Expanding ISA_BITFIELD, and using the older version of the code, we can now analyze the isa structure properly:

struct {
    uintptr_t nonpointer        : 1;
    uintptr_t has_assoc         : 1;
    uintptr_t has_cxx_dtor      : 1;
    uintptr_t shiftcls          : 33; /*MACH_VM_MAX_ADDRESS 0x1000000000*/
    uintptr_t magic             : 6;
    uintptr_t weakly_referenced : 1;
    uintptr_t unused            : 1;
    uintptr_t has_sidetable_rc  : 1;
    uintptr_t extra_rc          : 19;
};

1. nonpointer

If nonpointer is 0, the isa is a raw isa with no internal structure: reading the object's isa directly returns a pointer to cls. This is the kind of isa used before the iPhone moved to 64-bit systems.

If it is 1, the isa is not a plain pointer but an optimized isa, and the class information is stored in shiftcls.

Here it is presumably always 1; if it were 0, this struct would not be in use at all.

In this example we can see that nonpointer is 1:

2. has_assoc

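Whether the object has associated objects. For details, see iOS: Associated Object.

3. has_cxx_dtor

Whether the object has a C++ destructor. For details, see iOS: Destructor process.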

4. shiftcls

This is where the address of the class object is actually stored.

ISA_MASK = 0x0000000ffffffff8ULL for arm64 architecture below A12:

As shown in the figure above, the 1 bits of the mask sit exactly where shiftcls is: bits 3 through 35 (counting from 0), 33 bits in total.
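The relationship between the mask and the bit-field layout can be checked with a sketch. The union below copies the field widths of the older arm64 layout shown earlier; the name `isa_sketch` and the helper functions are hypothetical, and the code assumes GCC or Clang on a little-endian 64-bit target, where bit-fields are allocated starting from the lowest bit.

```c
#include <stdint.h>

/* Hypothetical C model of the older arm64 isa layout (64 bits in total). */
union isa_sketch {
    uint64_t bits;
    struct {
        uint64_t nonpointer        : 1;
        uint64_t has_assoc         : 1;
        uint64_t has_cxx_dtor      : 1;
        uint64_t shiftcls          : 33;
        uint64_t magic             : 6;
        uint64_t weakly_referenced : 1;
        uint64_t unused            : 1;
        uint64_t has_sidetable_rc  : 1;
        uint64_t extra_rc          : 19;
    } f;
};

/* 33 one-bits, shifted past the 3 flag bits below shiftcls. */
uint64_t mask_from_layout(void) {
    return ((1ULL << 33) - 1) << 3;
}

/* Recover the class address from the shiftcls field.
   The cast widens the 33-bit field before shifting so no bits are lost. */
uint64_t class_from_fields(uint64_t bits) {
    union isa_sketch isa;
    isa.bits = bits;
    return (uint64_t)isa.f.shiftcls << 3;
}

/* Read the magic field from a raw isa value. */
unsigned magic_from_fields(uint64_t bits) {
    union isa_sketch isa;
    isa.bits = bits;
    return (unsigned)isa.f.magic;
}
```

Under these assumptions `mask_from_layout()` equals 0x0000000ffffffff8, and reading shiftcls through the bit-field gives the same result as `bits & ISA_MASK`.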

So why does objc use 33 bits to represent the class pointer here, and 44 bits on x86_64?

5. magic

Used by the debugger to tell whether the current object is a real object or uninitialized memory; its value is fixed at 0x1a.

6. weakly_referenced

Whether the object is weakly referenced, that is, whether any __weak variable points to it.

7. unused

Purpose unknown; presumably a reserved bit.

8. has_sidetable_rc

Whether part of the reference count is stored in a SideTable.

When extra_rc overflows, half of the reference count is moved into the SideTable. If this flag is set, the value in the SideTable must also be taken into account when computing the reference count.

9. extra_rc

It holds the object's reference count.

Note that the meaning of extra_rc is version-specific:

  1. objc4-818.2: extra_rc = reference count;
  2. Before objc4-818.2: extra_rc = true reference count - 1 (true reference count = extra_rc + 1);

ObjC4-818.2

ObjC4-756.2

Correspondingly, the objc_object::initIsa() method in objc4-818.2 sets extra_rc to 1.

Here’s an example running on iOS 12:

With code prior to 818.2, obj1’s reference count is 3, so extra_rc = 3 - 1 = 2.

For obj2, the reference count is 1:

The extra_rc of obj2 is 1 - 1 = 0, and obj2 is not referenced by any __weak variable, so weakly_referenced is 0.
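The two conventions for extra_rc can be put side by side in a small sketch. Both function names and the explicit `sidetable_rc` parameter are assumptions made for illustration; they only model the arithmetic described above, not the runtime's actual code.

```c
#include <stdint.h>

/* Before objc4-818.2 (as described above): extra_rc stores "count - 1".
   sidetable_rc is a hypothetical parameter for the part kept in the SideTable. */
uint64_t retain_count_before_818(uint64_t extra_rc, int has_sidetable_rc,
                                 uint64_t sidetable_rc) {
    uint64_t count = 1 + extra_rc;
    if (has_sidetable_rc) count += sidetable_rc;
    return count;
}

/* objc4-818.2 (as described above): extra_rc stores the count itself. */
uint64_t retain_count_818(uint64_t extra_rc, int has_sidetable_rc,
                          uint64_t sidetable_rc) {
    uint64_t count = extra_rc;
    if (has_sidetable_rc) count += sidetable_rc;
    return count;
}
```

With the older convention, the obj1 example (extra_rc = 2) yields a count of 3 and the obj2 example (extra_rc = 0) yields 1; with the newer one, a freshly initialized object with extra_rc = 1 also yields 1.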

At this point, the basics of isa have all been covered. Next up is a detailed study of the SideTable implementation.

(Note: image from style_Month blog)

III. Supplement

1. Using SideTable

See (iOS: SideTable) [www.jianshu.com/p/e0b03d2be]… ;

2. Where isa and superclass point

This topic has been discussed at length elsewhere, and there are many related interview questions, so I won’t repeat it here. The relationships are shown as follows:

The key points to focus on here are:

  1. The isa pointers of all metaclass objects point to the root metaclass;
  2. The superclass pointer of the root metaclass (Meta_NSObject) points to the root class object (NSObject);
  3. The superclass pointer of NSObject is nil;
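These three points can be modeled with a toy graph of plain C structs. This is only a sketch of the pointer relationships: `struct cls`, the `Person` class, and `build_graph` are hypothetical and have nothing to do with the runtime's real data structures.

```c
#include <stddef.h>

/* Toy model of a class object: a name plus isa and superclass pointers. */
struct cls {
    const char *name;
    struct cls *isa;
    struct cls *superclass;
};

struct cls root_class, root_meta, person_class, person_meta;

/* Wire up NSObject, its metaclass, and a hypothetical subclass Person. */
void build_graph(void) {
    root_class   = (struct cls){ "NSObject",      &root_meta,   NULL };
    root_meta    = (struct cls){ "Meta_NSObject", &root_meta,   &root_class };
    person_class = (struct cls){ "Person",        &person_meta, &root_class };
    person_meta  = (struct cls){ "Meta_Person",   &root_meta,   &root_meta };
}
```

After `build_graph()`, every metaclass's `isa` is `&root_meta` (including the root metaclass itself), `root_meta.superclass` is `&root_class`, and `root_class.superclass` is `NULL`.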

The interview questions that grow out of these three points are not repeated here. If you really forget, reading the source or writing a small test is the most direct way; in my opinion, memorizing interview answers by rote has little value.

An instance object does not store a superclass pointer; that information lives in the class object and metaclass. A method call made through the super keyword is essentially a call to objc_msgSendSuper().

IV. Open questions

1. Why is shiftcls 33 bits or 44 bits wide?

According to the comment next to shiftcls in isa, each architecture has its own MACH_VM_MAX_ADDRESS, the maximum virtual memory address, and the 33-bit and 44-bit widths are designed around this maximum address.

At first glance, 33 or 44 bits do not seem to cover MACH_VM_MAX_ADDRESS. However, class objects are 8-byte aligned, so the low 3 bits of a class address are always 0 and do not need to be stored: ISA_MASK clears exactly those 3 bits. With the arm64 value 0x1000000000 (2^36) from the comment, 33 stored bits plus 3 implied zero bits give 36 bits, which is exactly enough. The same reasoning applies to the 44-bit layout on x86_64, so the width of shiftcls follows from the maximum virtual address of each architecture.
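The arithmetic can be checked with a short sketch. The function names are hypothetical. The arm64 limit comes from the comment in the struct shown earlier; the x86_64 limit 0x7fffffe00000 is not quoted in this article and is taken here as an assumption.

```c
#include <stdint.h>

/* Number of bits needed to represent every address below max_address. */
unsigned address_bits(uint64_t max_address) {
    unsigned count = 0;
    uint64_t highest = max_address - 1;  /* largest valid address */
    while (highest != 0) {
        count++;
        highest >>= 1;
    }
    return count;
}

/* Bits that must actually be stored once the always-zero alignment bits are dropped. */
unsigned stored_bits(uint64_t max_address, unsigned alignment_bits) {
    return address_bits(max_address) - alignment_bits;
}
```

With 8-byte alignment (3 alignment bits), `stored_bits(0x1000000000ULL, 3)` gives 33, matching the arm64 shiftcls width, and the assumed x86_64 limit gives 44.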

2. Why did objc4-818.2 change how extra_rc is computed?

Still an open question.