2021-09-06

Preface

Recently I read Lippman's Inside the C++ Object Model. The book goes quite deep and is easy to forget, so I am writing down some notes and mnemonics. It reads more like a collection of technical blog posts than a book, with some leaps in reasoning and none of the carefully ordered, textbook-like progression of C++ Primer. Personally I also found the Chinese typesetting and translation rather mediocre, so some parts are better read against the English edition. Inside the C++ Object Model is not a book that only needs to be read once; this article is my notes from the first pass.

NOTE: Since my C++ is still only at a beginner level and I have not accumulated a large amount of code experience, corrections to any mistakes in this article are very welcome.

Chapter 1: About objects

This chapter focuses on the rough memory layout of C++ objects.

1.1 Basics

The first is the simplest layout of class objects, regardless of inheritance, debuggable with the following simple code:

/// Class definition
class Object {
public: //section1
    uint64_t longNo2;
public: //section2
    uint64_t longNo1;
    uint8_t no1;
    uint8_t no2;
    uint8_t no3;
public: //section3
    uint8_t no4;
};

/// Debug code
void main() {
    Object obj = Object();
    size_t sizeObj = sizeof(Object);
}

NOTE: My development debugging environment: operating system macOS Big Sur; Integrated Development Environment Xcode 12.5; Compiler clang 12.0.5; Simulator: iPhone 11 (iOS 14.5).

sizeObj outputs 24. A few observations can be made by simply modifying the above code:

  • Moving section2 above section1 (reordering the access sections): sizeObj outputs 32;
  • Moving no2 above longNo1 (reordering the member variables): sizeObj outputs 32;

With breakpoint debugging, LLDB commands along the lines of po (int)((uint8_t *)&(obj.no2) - (uint8_t *)&(obj.longNo1)) or po (int)((uint8_t *)&(obj.no2) - (uint8_t *)&obj) show the relative offsets of the member variables' memory addresses. We find that the member variable layout of a C++ object is very similar to the layout of a struct object in C: it has the same concern for memory byte alignment and arranges members in the order they are declared (more on this in Chapter 3).
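
If you would rather not compute the offsets by hand in LLDB, a minimal sketch like the one below can print the same information directly. This is my own addition: dumpObjectOffsets is a hypothetical helper, and it assumes the Object definition above is in scope. The printed offsets should reflect the same declaration-order, byte-aligned layout.

#include <cstddef>
#include <cstdio>

// Hypothetical helper: print member offsets in code instead of computing them in LLDB.
void dumpObjectOffsets() {
    printf("sizeof(Object) = %zu\n", sizeof(Object));
    printf("longNo2 offset = %zu\n", offsetof(Object, longNo2));
    printf("longNo1 offset = %zu\n", offsetof(Object, longNo1));
    printf("no1 offset     = %zu\n", offsetof(Object, no1));
    printf("no2 offset     = %zu\n", offsetof(Object, no2));
    printf("no3 offset     = %zu\n", offsetof(Object, no3));
    printf("no4 offset     = %zu\n", offsetof(Object, no4));
}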

To further verify the memory layout, use LLDB's x command (see the Examining Memory documentation) to print the memory of obj. To make the individual memory regions easier to tell apart, give every member a distinct constant value and debug as follows:

// Class definition (after modification)
class Object {
public:
    uint64_t longNo2 = 1;
public:
    uint8_t no2 = 12;
    uint64_t longNo1 = 2;
    uint8_t no1 = 11;
    uint8_t no3 = 13;
public:
    uint8_t no4 = 14;
};

x/32b &obj prints the memory of the obj object, where b means the memory is displayed byte by byte and 32 means 32 bytes are shown. The output is below. First, the memory address starts with 0x7ff, which is basically stack space (allocated from high addresses toward low ones). Second, there is plenty of Padding with value 0x00, inserted for memory alignment. Also note that within the first 8 bytes the low-order byte sits at the low address, so the current debugging platform is little-endian.

0x7ffee5b02060: 0x01 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x7ffee5b02068: 0x0c 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x7ffee5b02070: 0x02 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x7ffee5b02078: 0x0b 0x0d 0x0e 0x00 0x00 0x00 0x00 0x00

x/8b &sizeObj prints the memory of sizeObj. The value is 0x20, i.e. 32, which matches nicely. Also notice that sizeObj's memory address is 0x7ffee5b02058, which is 8 bytes lower than obj's address; 8 bytes is exactly the size of sizeObj (sizeof(size_t)).

0x7ffee5b02058: 0x20 0x00 0x00 0x00 0x00 0x00 0x00 0x00

As the debugging above shows, the Object object built there lives in stack space. Modify the debug code slightly to construct the Object with new. After compiling and running, po objPtr prints the value of the objPtr pointer (the memory address it points to); the output is 0x0000600003ead360, so this Object lives in heap space.

// Debug code (after modification)
void main() {
    Object obj = Object();
    size_t sizeObj = sizeof(Object);
    
    Object *objPtr = new Object;
    size_t sizeObjPtr = sizeof(objPtr);
}

x/32b objPtr prints the memory of the object objPtr points to (note that it is objPtr rather than &objPtr this time, since objPtr already holds the object's address). The Padding is no longer all 0x00, but that does not matter, because the data in all the valid bytes is correct.

0x600003ead360: 0x01 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x600003ead368: 0x0c 0x00 0x60 0x00 0x00 0x00 0x00 0x00
0x600003ead370: 0x01 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x600003ead378: 0x0b 0x0d 0x0e 0x63 0x65 0x00 0x00 0x00

NOTE: If the Object is built with parentheses, for example Object *objPtr = new Object(), the Padding will be all 0x00. Similarly, if the earlier debug code had used Object obj; (implicit initialization), the Padding would generally not have been all 0x00.

1.2 Inheritance

The above is just the most basic case, and then consider the inheritance case:

/// Base class definition
class Object {
public:
    uint64_t longNo2 = 1;
public:
    uint8_t no2 = 12;
    uint64_t longNo1 = 2;
    uint8_t no1 = 11;
    uint8_t no3 = 13;
public:
    uint8_t no4 = 14;
};

/// Subclass definition
class SubClassObject : Object {
public:
    uint64_t subLongNo1 = 3;
    uint8_t subNo1 = 15;
};

/// Debug code
void main() {
    Object obj;
    size_t sizeObj = sizeof(Object);
    
    SubClassObject subobj;
    size_t sizeSubobj = sizeof(SubClassObject);
}

sizeSubobj prints as 48. x/48b &subobj prints the memory, and it is easy to see that the subclass's member variables are simply concatenated after the memory of the base class's member variables.

0x7ffeebedb018: 0x01 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x7ffeebedb020: 0x0c 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x7ffeebedb028: 0x01 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x7ffeebedb030: 0x0b 0x0d 0x0e 0x00 0x00 0x00 0x00 0x00
0x7ffeebedb038: 0x0f 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x7ffeebedb040: 0x03 0x00 0x00 0x00 0x00 0x00 0x00 0x00

In the subclass definition, would swapping the order of subNo1 and subLongNo1 change sizeSubobj? The answer is yes: sizeSubobj becomes 40. In other words, a subclass's member variables can start in the Padding at the tail of the base class member variables' memory. x/40b &subobj prints the following memory:

0x7ffeea098020: 0x01 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x7ffeea098028: 0x0c 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x7ffeea098030: 0x01 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x7ffeea098038: 0x0b 0x0d 0x0e 0x03 0x00 0x00 0x00 0x00
0x7ffeea098040: 0x0f 0x00 0x00 0x00 0x00 0x00 0x00 0x00

The above only debugged the effect of a class’s instance variable definition on the memory layout of an object, but what about class variables (static member variables)? Class variables do not affect the memory layout of the objects of the class, which are stored in static areas of memory (also known as data segments in the book).

Non-virtual member functions also do not affect the memory layout of the class's objects. The function's code is stored in the code area of memory (part of the text segment), and the function's address does not need to be stored in the class instance because it is not needed there: it is the compiler's responsibility to resolve the corresponding function address when an instance or class calls a method (more on this later).
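
A small sketch to confirm both claims (my own example with hypothetical class names): adding a static member variable and a non-virtual member function does not change the object size.

#include <cstdint>

// Hypothetical sketch: static members and non-virtual member functions add no per-object storage.
class Plain {
public:
    uint64_t x = 0;
};

class WithExtras {
public:
    uint64_t x = 0;
    static uint64_t s_counter;                  // lives in the static data area, not in the object
    uint64_t twice() const { return x * 2; }    // code lives in the text segment; the compiler resolves its address
};

uint64_t WithExtras::s_counter = 0;

static_assert(sizeof(Plain) == sizeof(WithExtras), "the extras add no per-object storage");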

NOTE: C++ has a wide variety of compilers, especially non-standard C++ compilers, so some of the conclusions drawn through debugging in this article are only guaranteed to be true for Clang compilers. This is probably one of the reasons why many people think C++ is a “disgusting” language.

1.3 Virtual

Whether there are virtual functions on a class's inheritance chain directly affects the size of the class's instances. The general rule: if a type declares virtual functions, then that type and every class that directly or indirectly derives from it need to maintain a virtual table, which holds pointers to the virtual functions. In that case, instances of the class that declares virtual functions, and instances of all its derived classes, need 8 bytes of memory (on a 64-bit machine) to store the virtual table's address; this is the vptr pointer.

Debug with the following code, where:

  • sizeof(Object) is 32, because there is no virtual function on its inheritance chain, so no vptr needs to be stored;
  • sizeof(MidObject) is 40, because it declares a virtual function and therefore needs to store a vptr;
  • sizeof(SubClassObject1) and sizeof(SubClassObject2) are both 40, because the parent class MidObject declares a virtual function, so a vptr must be stored (whether or not the subclass implements the virtual function).
class Object {
public:
    uint64_t longNo2 = 1;
public:
    uint8_t no2 = 12;
    uint64_t longNo1 = 1;
    uint8_t no1 = 11;
    uint8_t no3 = 13;
public:
    uint8_t no4 = 14;
};

class MidObject : Object {
public:
    virtual void vfun(void) {
        // Abstract method
    }
};

class SubClassObject1 : MidObject {
public:
    void vfun() {
        printf("implement one");
    }
};

class SubClassObject2 : MidObject {

};

/// Debug code
void main() {
    Object obj = Object();
    MidObject midObj = MidObject();
    SubClassObject1 subObj1 = SubClassObject1();
    SubClassObject2 subObj2 = SubClassObject2();
}

So what does the virtual table actually hold? This can also be explored briefly with LLDB's x command. Printing the memory of midObj, subObj1, and subObj2 shows the vptr in the first 8 bytes, meaning the Clang compiler stores the vptr in the first 8 bytes of the instance.

(lldb) x/32b &midObj
0x7ffeedfab038: 0x40 0x50 0xc5 0x01 0x01 0x00 0x00 0x00
0x7ffeedfab040: 0x01 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x7ffeedfab048: 0x0c 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x7ffeedfab050: 0x01 0x00 0x00 0x00 0x00 0x00 0x00 0x00
(lldb) x/32b &subObj1
0x7ffeedfab010: 0x90 0x50 0xc5 0x01 0x01 0x00 0x00 0x00
0x7ffeedfab018: 0x01 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x7ffeedfab020: 0x0c 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x7ffeedfab028: 0x01 0x00 0x00 0x00 0x00 0x00 0x00 0x00
(lldb) x/32b &subObj2
0x7ffeedfaafe8: 0xd0 0x50 0xc5 0x01 0x01 0x00 0x00 0x00
0x7ffeedfaaff0: 0x01 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x7ffeedfaaff8: 0x0c 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x7ffeedfab000: 0x01 0x00 0x00 0x00 0x00 0x00 0x00 0x00

The vptr values (virtual table addresses) of midObj, subObj1, and subObj2 are 0x0000000101c55040, 0x0000000101c55090, and 0x0000000101c550d0 respectively. Use x/g to print the 8 bytes of memory starting at each of these addresses:

(lldb) x/g 0x00000101c55040
0x101c55040: 0x0000000101c52e10

(lldb) x/4g 0x00000101c55090
0x101c55090: 0x0000000101c52e60

(lldb) x/4g 0x00000101c550d0
0x101c550d0: 0x0000000101c52e10

These three values should lie in the text segment or data segment. The first and third values are equal, which strongly suggests virtual function addresses; testing them one by one confirms it.

(lldb) po (void(*)(void))0x0000000101c52e10
(CPPModelDemo`MidObject::vfun() at ViewController.mm:30)

(lldb) po (void(*)(void))0x0000000101c52e60
(CPPModelDemo`SubClassObject1::vfun() at ViewController.mm:37)

(lldb) po (void(*)(void))0x0000000101c52e10
(CPPModelDemo`MidObject::vfun() at ViewController.mm:30)

The contents of each group of data basically include:

  • A function pointer to the virtual function's implementation;
  • The address of vtable for __cxxabiv1::__vmi_class_type_info + 16 inside libc++abi.dylib;
  • A pointer to the type information (the class's name); po (char *)0x0000000101c54e18 prints the name string;

This is easy to understand: the virtual table of a class that implements the virtual function vfun holds a pointer to its own implementation of vfun, while the virtual table of a class that does not implement vfun holds a pointer to the vfun implemented by some class up the inheritance chain (here, the vfun implemented by MidObject).

In this way I stumbled across a small piece of the Clang compiler's implementation of virtual.

Virtual functions are special in that they introduce virtual tables, so what about virtual inheritance? Is it even more special? It is, so much so that there is no way to introduce and explore it before reading the later content. In addition, with virtual functions under multiple inheritance and virtual inheritance, the compiler's process for resolving each virtual function address involved in a type is also more complex; that will be explored later in this article.

NOTE: The memory layout of the virtual table also differs between compilers. The virtual table layout produced by the Clang compiler may not match that of other compilers.

1.4 Summary

This chapter briefly debugged some characteristics of the C++ object model and found some similarities between its memory layout and the Objective-C runtime implementation, such as byte alignment and the object-size growth triggered by inheritance.

But developers who have looked at the Objective-C runtime will have a question: Objective-C objects carry an isa describing their type; does C++ not need anything like that? Indeed it does not. C++'s object orientation is built on heavy intervention by the C++ compiler. This is also why C++ code runs into type-checking compilation errors more easily, especially where the rules are loose: C++ establishes a great deal of concrete type information at compile time, such as a member variable's offset within the object's memory and the concrete address of a class's member function. This type information is rarely deferred to runtime resolution (nor does it need to be), which is why C++ natively has no reflection mechanism.

Take the following debugging code as an example:

void main() {
    Object obj = Object();
    MidObject midObj = MidObject();
    SubClassObject1 subObj1 = SubClassObject1();
    SubClassObject2 subObj2 = SubClassObject2();
    
    // Debug code 1: non-virtual mechanism
    MidObject *midObjPtr = (MidObject *)&midObj;
    Object *baseObjPtr = (Object *)&midObj;
    
    // Debug code 2: virtual mechanism
    SubClassObject1 *subObjPtr1 = &subObj1;
    MidObject *midObjPtr = (MidObject *)subObjPtr1;
    midObjPtr->vfun();   // Prints: implement one
}

As shown in debug code 1 above, LLDB po sizeof(*baseObjPtr) and po sizeof(*midObjPtr) yield 32 and 40 respectively. Even though baseObjPtr and midObjPtr point to the same memory address, the compile phase marks them as two different types, so at runtime the "types" of the objects that baseObjPtr and midObjPtr point to are simply the types declared at compile time. To an Objective-C developer this can seem pretty incredible.

This shows that, without the virtual mechanism, C++ is a very "static" language, while method invocation through virtual is more dynamic. For example, in debug code 2 above, once a method call goes through the virtual mechanism, the C++ type behaves dynamically at runtime.

Chapter 2: Constructor semantics

This chapter explores the compiler’s interference in object construction and its impact on code form and execution efficiency.

2.1 Default constructor

If a C++ class does not define any constructors (with/without parameters), the C++ compiler automatically synthesizes a default constructor for that class. For example, the ListNode class:

class ListNode {
public:
    int val;
    ListNode *next;
};

The book argues that the default constructor synthesized by the compiler in this case is trivial. In the following two lines, node1's member variables end up holding random, unexpected values, which is what a trivial constructor would produce; node2's member variables come out as expected (node2.val is 0 and node2.next is NULL), so the constructor used there behaves as nontrivial. So, for the Clang compiler, the two constructions below are different.

// Debug code 1
ListNode node1;
// Debug code 2
ListNode node2 = ListNode();

But would Clang really bother to build two different default constructors? Probably not. The reason for the difference is that ListNode node1 makes the compiler generate code that means "allocate sizeof(ListNode) bytes of memory on the stack"; this only allocates memory and does not initialize it. ListNode node2 = ListNode() generates two steps: 1. "allocate sizeof(ListNode) bytes of memory on the stack"; 2. "initialize that memory by calling the compiler-synthesized default constructor of ListNode".

Write a simple piece of code to debug and verify the above conclusion. First, let’s look at the implementation principle of method 1:

class A {
    int a;
};

void main() {
    int i = 1;
    A a;
    int j = 2;
}

The assembly language corresponding to the three lines of code intercepted from main during debugging is:

0x100e15e94 <+4>:  movl   $0x1, -0x4(%rbp)
0x100e15e9b <+11>: movl   $0x2, -0xc(%rbp)

Strangely, three lines of C++ code are implemented with only two assembly instructions. Parsing them line by line:

  • Write the int constant 0x1 to stack-frame offset -0x4 (occupying the 4 bytes from -0x4 to -0x1);
  • Write the int constant 0x2 to stack-frame offset -0xc (occupying the 4 bytes from -0xc to -0x9);

So the 4 free bytes in between are naturally left for object a, and 4 bytes is exactly sizeof(A). Nice and simple: it turns out allocating memory on the stack really is that easy. Next, look at how the object is constructed in method 2:

class A {
    int a;
};

void main() {
    int i = 1;
    A a = A();
    int j = 2;
}

Similarly, the assembly corresponding to these three lines, taken from the main function, is shown below; it is obviously much longer this time.

0x106648ebc <+60>:  movl   $0x1, -0x24(%rbp)
0x106648ec3 <+67>:  leaq   -0x28(%rbp), %rcx
0x106648ec7 <+71>:  movq   %rcx, %rdi
0x106648eca <+74>:  movl   $0x4, %edx
0x106648ecf <+79>:  movq   %rax, -0x38(%rbp)
0x106648ed3 <+83>:  callq  0x106649426               ; symbol stub for: memset
0x106648ed8 <+88>:  movl   $0x2, -0x2c(%rbp)

Parsing line by line:

  • movl $0x1, -0x24(%rbp): corresponds to int i = 1;
  • leaq -0x28(%rbp), %rcx: load the address at stack-frame offset -0x28 (the start address of the allocated memory block) into the rcx register;
  • movq %rcx, %rdi: copy the address held in rcx into the rdi register;
  • movl $0x4, %edx: write the constant 0x4 into the edx register;
  • movq %rax, -0x38(%rbp): write the contents of rax to the memory at stack-frame offset -0x38; from the context of the full assembly, the fill value at this point is '\0';
  • callq memset: the instructions above have in fact already set up the arguments for memset; this call fills the first 4 bytes of memory at stack-frame offset -0x28 with '\0';
  • movl $0x2, -0x2c(%rbp): corresponds to int j = 2;

The summary above of what the Clang compiler actually does is basically consistent with the earlier conjecture, although it is a little disappointing that there is no callq instruction invoking a compiler-synthesized default constructor. The reason should be that, in this scenario, the compiler-synthesized default constructor exists in inlined form: it is translated directly into a group of assembly instructions inserted into the calling code.

NOTE: If A has to consider the virtual mechanism, for example because virtual inheritance exists in its inheritance chain or a virtual function is declared by the class itself or somewhere on its inheritance chain, then a callq instruction does appear in the corresponding assembly, because the logic that sets the vptr has a certain complexity and is not suitable for inlining.

Going back to the four scenarios outlined in the book where the C++ compiler generates the nontrivial default constructor:

  • The class of a member variable explicitly defines the default constructor;
  • One of the base classes explicitly defines the default constructor;
  • Defines a virtual function;
  • Virtually inherits a base class;

After debugging one by one, the Clang compiler does not handle exactly as described in the book, with the following rules:

  • When an object is defined without explicitly calling the default constructor (like ListNode node1 above): the compiler-synthesized default constructor is called if a default constructor is explicitly declared somewhere in the inheritance chain, and not called if there is no such declaration. For member variables of class type (instances, not pointers), it is called if their class explicitly declares a default constructor and not called otherwise (when neither the class itself nor, recursing downward, its class-type members declare one, no default constructor is called at all; in that case the synthesized default constructor really would be trivial);
  • When the default constructor is explicitly called (like ListNode node2 = ListNode() above): the compiler-synthesized default constructor is called even if no default constructor is explicitly declared in the inheritance chain, and in this case it contains a memset operation that zeroes the memory, so it should be considered nontrivial. For class-type member variables whose class explicitly declares a default constructor, the compiler-synthesized default constructor calls that constructor as well.

This part of the book can be summed up in one sentence: the default constructor synthesized by the compiler can be considered nontrivial as long as it does something meaningful, such as initializing part of the memory, assigning the object's vptr, or initializing the specific data set up to implement virtual inheritance.
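
As a rough illustration of that one-sentence summary (my own sketch with hypothetical classes Inner and Outer, not code from the book): once a member's class declares a default constructor, the synthesized default constructor of the enclosing class has real work to do.

// Hypothetical sketch: when is the synthesized default constructor "meaningful"?
class Inner {
public:
    Inner() { /* explicitly declared, so it must be called */ }
};

class Outer {
public:
    Inner inner;   // forces the synthesized Outer() to call Inner()
    int   raw;     // a plain member like this is left untouched by the synthesized constructor
};
// Conceptually, the synthesized constructor behaves roughly like:
//     Outer() : inner() { /* raw stays uninitialized; a vptr would also be set here if Outer were polymorphic */ }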

2.2 Copy constructor

If a C++ class does not define any copy constructor, the C++ compiler automatically synthesizes one for that class. C++ generally copies objects in one of two ways: 1. initialization through the copy constructor (copying an argument into the target object); 2. overloading the assignment operator (copying an rvalue object into an lvalue). This section focuses on the first.
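
A quick sketch of the two copy paths (my own example with a hypothetical class P): the first copy goes through the copy constructor, the second through the copy assignment operator.

// Hypothetical sketch distinguishing the two copy paths.
class P {
public:
    int v = 0;
};

void copyDemo() {
    P src;
    P a = src;   // copy initialization: uses the (here compiler-synthesized) copy constructor
    P b;
    b = src;     // assignment to an existing object: uses the copy assignment operator
}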

In addition, C++ copy construction supports bitwise copy semantics. For the Word class below, for example, the claim is that the compiler will not synthesize a copy constructor: because the constructor Word(const char*) is defined, the compiler decides that "a copy constructor already exists and there is no need to synthesize one automatically."

class Word {
public:
    Word(const char*);
    ~Word() { delete [] str; }
private:
    int   cnt;
    char *str; 
};

There are four cases where bitwise Copy semantics are not used:

  • A member variable's class declares a copy constructor;
  • The class inherits from a base class that has a copy constructor;
  • The class defines one or more virtual functions (because mis-setting the object's vptr can have very serious consequences);
  • There are one or more virtual base classes in the inheritance chain (for similar reasons);

But is that actually true under the Clang compiler? I don't think so. For example, M below clearly does not fall into any of the four cases above and declares the constructor M(const char *str). Yet in the following debug code both lines compile, which means the Clang compiler does not treat M(const char *str) as a copy constructor; a copy constructor is still synthesized automatically.

class M {
private:
    char a;
public:
    M(const char *str) { a = str[0]; }
};

void main() {
    M m = M("abc");
    M mcpy(m);
}
Copy the code

NOTE: The above applies to the debug code as written. If an M is constructed with no arguments (M m;), it does not compile, because the M(const char *str) constructor is already defined and the compiler therefore no longer synthesizes a default constructor automatically. Further, if M(const char *str) { a = str[0]; } is commented out, then both M mcpy(m); and M m; compile; that is, the compiler synthesizes both the default constructor and the copy constructor.

Here’s how the C++ compiler handles object replication when virtual mechanisms need to be considered. Use the following code for debugging. Conclusion: Under Clang compiler, when virtual inheritance exists in the inheritance chain, the class instance size will need an additional 8 bytes (excluding the necessary Padding) to store virtual inheritance related information, and the debugging data found that virtual inheritance related information is also stored in the Virtual table. Notice the second sentence of debug code A A (c), where the copy constructor is called to build an instance of A from the instance of C, where the problem of copying data from C to A is involved. In this case, the compiler will exclude the VPTR pointer in the C instance and copy only the member variable data corresponding to the A type in C to the A instance.

class A {
public:
    int a;
};

class C : public virtual A {
public:
    int c;
};

void main() {
    C c = C();
    A a(c);
}

Further, if virtual void vfun() {} is declared in A, then both object a and object c contain a vptr. In that case the compiler writes the address of the virtual table bound to type A into instance a, and copies c's member variable data into instance a.

NOTE: Does the above contradict C++ polymorphism? No, because this is a conversion between object types, which shows that C++ is a statically typed language. If instead A *aPtr = &c is declared, then the aPtr pointer is polymorphic, because it can point to an object of type A or of a type derived from A.
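
A small self-contained sketch of that distinction (my own example with hypothetical classes Base and Derived, separate from the A/C code above): copying into a base-class value slices the object, while a base-class pointer stays polymorphic.

#include <cstdio>

// Hypothetical sketch: value conversion slices, pointer conversion stays polymorphic.
class Base {
public:
    virtual void vfun() { printf("Base::vfun\n"); }
};

class Derived : public Base {
public:
    void vfun() override { printf("Derived::vfun\n"); }
};

void polyDemo() {
    Derived d;
    Base b = d;        // slicing copy: b is bound to Base's virtual table
    b.vfun();          // prints "Base::vfun"
    Base *bPtr = &d;   // pointer conversion: no copy, the object keeps Derived's vptr
    bPtr->vfun();      // prints "Derived::vfun"
}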

2.3 Semantics of program transformation

This section covers some of the implicit transformations performed by the C++ compiler.

2.3.1 Argument passing

First of all, let’s look at the processing details in the process of function parameter passing. The debugging code is as follows, following the definition of A and C above. Func1, func2, and main are used as breakpoints to view the memory addresses of c variables in the func1, func2, and main functions. 0x00007FFEE4186060, 0x00007FFEE4186070, and 0x00007FFEE4186070 respectively. Obviously the C argument to func2 and the C variable of main have the same memory address, which means that the compiler made a temporary copy of the C object to Func1 before main called func1, whereas func2 passed the argument by reference directly without the copy operation. So the efficiency of Func1 is significantly lower than that of Func2.

class A {
public: int a;
};

class C : public virtual A {
public: int c;
};

void func1(C c) {
    printf("implement it\n");
}

void func2(const C &c) {
    printf("implement it\n");
}

void main() {
    C c = C();
    func1(c);
    func2(c);
}

2.3.2 Return value

Next, look at what the compiler does with return values, using the same approach. LLDB po &c shows the memory address of the c variable in func and in main to be 0x00007ffee9d72070 in both cases; that is, even though the function returns an object by value, the process introduces no redundant object copy.

class C {
public:
    uint64_t a = 15;
    virtual void vfunc() {};
};

C func() {
    C c;
    return c;
}

void main() {
    C c = func();
}

The reason is that the compiler further transforms the code above. According to the book, the transformation goes roughly like this:

  • Transform func into the form void __func(C &c);
  • When calling __func(), pass in a C object by reference;
  • Inside __func(), construct a temporary C object and use the copy constructor to copy the temporary object's data into the C object passed in by reference;

This transformation still involves generating a temporary object, in other words there are still redundant operations. Let's debug whether Clang does the same thing. Look at the core code in main that triggers the call to func: the first instruction loads the memory address at stack-frame offset -0x10 into the rdi register; the second calls the func function.

0x10ac7fe78 <+8>:  leaq   -0x10(%rbp), %rdi
0x10ac7fe7c <+12>: callq  0x10ac7fe20               ; func at ViewController.mm:24

Continue with the core code of func. The memory address that main wrote into the rdi register plays a crucial role here. Ignore the first four instructions and print the contents of the memory address held in rdi before and after the callq (p/x $rdi shows the register's contents). Clearly, the callq initializes a C object at the memory address held in rdi, and it calls the no-argument constructor rather than the copy-from-temporary process described in the book. The Clang compiler is clearly more concise and efficient.

0x10ac7fe28 <+8>:  movq   %rdi, %rax
0x10ac7fe2b <+11>: movq   %rdi, %rcx
0x10ac7fe2e <+14>: movq   %rcx, -0x8(%rbp)
0x10ac7fe32 <+18>: movq   %rax, -0x10(%rbp)
0x10ac7fe36 <+22>: callq  0x10ac7fe50               ; C::C at ViewController.mm:18

The operation logs of the debugging process are as follows:

(lldb) p/x $rdi
(unsigned long) $1 = 0x00007ffeee3e3060
(lldb) x/2g 0x00007ffeee3e3060
0x7ffeee3e3060: 0x0000000000000000 0x0000000000000000
(lldb) x/2g 0x00007ffeee3e3060
0x7ffeee3e3060: 0x000000010181d030 0x000000000000000f

Later in the book it turns out that this is a compiler-level performance optimization for functions that return objects by value, known as NRV (Named Return Value) optimization. NRV optimization transforms a function that returns an object by value so that the return-value object is declared (only declared, not initialized) in the calling function and passed to the called function by reference as a parameter, thereby avoiding redundant object construction and copy operations.
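
To make that description concrete, here is a rough sketch of the idea (my own illustration with hypothetical names R, __func, and caller; the real transformation happens inside the compiler and does not look like user-level code): the caller reserves the storage for the return value and the callee constructs the result directly into it, so no temporary object and no extra copy are needed.

#include <new>
#include <cstdint>

// Hypothetical sketch of the idea behind NRV optimization (illustration only, not real compiler output).
class R {
public:
    uint64_t a = 15;
};

void __func(R &result) {
    new (&result) R();      // construct the "return value" in place, in the caller's storage
}

void caller() {
    alignas(R) unsigned char buf[sizeof(R)];   // the caller only reserves raw, uninitialized storage
    R &result = *reinterpret_cast<R *>(buf);
    __func(result);                            // the callee fills it in; nothing is copied back
}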

NRV optimization is also closely tied to the copy constructor, since the copy constructor is what would otherwise run when an object is returned by value. The NRV-optimized behavior is very similar to how the Clang compiler handled the func function above, with extra data-copying logic added where needed; the transformed copy constructor's signature can be imagined as roughly void C(C &dst, const C &src).

In addition, since the compiler-synthesized copy constructor is generally good enough, in most cases developers do not need to explicitly define a copy constructor for a class. If you must define your own, and especially if you implement it with tricks like memset or memcpy, you need to take care to set the vptr correctly when the virtual mechanism is involved. In short, it is best not to explicitly define the copy constructor.

2.4 Initialization lists

This section describes some pitfalls of constructor initialization lists, which means it is important for developers to understand how the compiler handles them. There are three scenarios where it is best (or required) to use initialization-list syntax when defining a constructor:

  • Initializing a reference-type member variable;
  • Initializing a const member variable;
  • Calling a parameterized constructor of a base class or of a member variable's class;

For case one: a reference must normally be initialized at the point of declaration; reference-type member variables of a class are the exception, in that you can declare one without specifying a default value (for example, int &a;), but the build then fails with an error like "Undefined symbols for architecture x86_64: ..., referenced from: ...". In this case there are only two options: either specify an initial value where the reference member variable is declared in the class, or initialize the reference member variable in the constructor's initialization list.

For case two: assigning to a const member variable in any function body raises a compilation error along the lines of "Cannot assign to ...". There are again only two options: either specify an initial value where the const member variable is declared in the class, or initialize the const member variable in the constructor's initialization list.
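
A minimal sketch covering cases one and two together (my own example, with a hypothetical class name Holder): both the reference member and the const member are initialized in the constructor's initialization list.

// Hypothetical sketch: reference and const members must be initialized, not assigned.
class Holder {
public:
    int       &ref;     // case one: reference-type member
    const int  limit;   // case two: const member
    
    Holder(int &target, int maxValue) : ref(target), limit(maxValue) {
        // Leaving ref or limit out of the initialization list (and out of in-class initializers)
        // would be a compile error: neither can be assigned in the constructor body instead.
    }
};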

For case three: doing the equivalent work in the constructor body instead of using the initialization list causes no compiler warnings or errors, but the approach looks clumsy and, more importantly, is inefficient. Note that the A() {} in A below must be present, otherwise the error "constructor for 'C' must explicitly initialize the member 'a' which does not have a default constructor" is reported.

class A {
public:
    int a;
    
    A() {}
    A(int x) { a = x; }
};

class C {
public:
    int c;
    A a;
    
    // Method 1: initialize a in the body of the constructor
    C() {
        a = A(255);
        c = 1;
    }
    
//    // Method 2: initialize a using the initialization list
//    C() : a(255) {
//        c = 1;
//    }
};

void main() {
    C c;
}

Now look at how the C() constructor defined in the two ways above is implemented; the corresponding assembly is shown below, the first block for method 1 and the second for method 2. Obviously method 1 calls two constructors while method 2 calls only one. With breakpoint debugging, the constructors called in method 1 turn out to be A() and A(int x).

Method 1's handling of c.a actually consists of three steps:

  • Call A() to initialize the memory of c.a;
  • Allocate memory on the stack for a temporary A object and call A(int x) to initialize it;
  • Copy the temporary object's data into the memory of c.a;

Method 2, by contrast, calls A(int x) to initialize the memory of c.a directly. The difference is not obvious here because type A itself is simple, but what if A were complex, or required complex destruction? The redundant c.a initialization, temporary-object construction, and temporary-object destruction introduced by method 1 would become a considerable burden.

NOTE: You might wonder how method 2 can be only two instructions shorter when the core logic drops two whole steps. That is indeed how it is: most of the instructions are memory addressing and stack-frame handling, and the core code is minimal.

0x105ccbe50 <+0>:  pushq  %rbp
0x105ccbe51 <+1>:  movq   %rsp, %rbp
0x105ccbe54 <+4>:  subq   $0x20, %rsp
0x105ccbe58 <+8>:  movq   %rdi, -0x8(%rbp)          ; $rdi == &c (the caller passes the address of c via rdi), -0x8(%rbp) = &c
0x105ccbe5c <+12>: movq   -0x8(%rbp), %rax          ; $rax == &c
0x105ccbe60 <+16>: movq   %rax, %rcx                ; $rcx = &c
0x105ccbe63 <+19>: addq   $0x4, %rcx                ; $rcx = &c.a
0x105ccbe6a <+26>: movq   %rcx, %rdi                ; $rdi = &c.a
0x105ccbe6d <+29>: movq   %rax, -0x18(%rbp)         ; -0x18(%rbp) = &c
0x105ccbe71 <+33>: callq  0x105ccbea0               ; A::A at ViewController.mm:22, initialize c.a
0x105ccbe76 <+38>: leaq   -0x10(%rbp), %rdi         ; allocate stack memory at -0x10(%rbp) for the temporary object __tempa and pass its address via rdi
0x105ccbe7a <+42>: movl   $0xff, %esi               ; pass the argument 0xff
0x105ccbe7f <+47>: callq  0x105ccbec0               ; A::A at ViewController.mm:24, initialize __tempa
0x105ccbe84 <+52>: movl   -0x10(%rbp), %edx         ; $edx = __tempa's data
0x105ccbe87 <+55>: movq   -0x18(%rbp), %rax         ; $rax = &c
0x105ccbe8b <+59>: movl   %edx, 0x4(%rax)           ; c.a = __tempa, copy the data of __tempa into &c.a
0x105ccbe8e <+62>: movl   $0x1, (%rax)              ; c.c = 1
0x105ccbe94 <+68>: addq   $0x20, %rsp
0x105ccbe98 <+72>: popq   %rbp
0x105ccbe99 <+73>: retq
0x104731e90 <+0>:  pushq  %rbp
0x104731e91 <+1>:  movq   %rsp, %rbp
0x104731e94 <+4>:  subq   $0x10, %rsp
0x104731e98 <+8>:  movq   %rdi, -0x8(%rbp)
0x104731e9c <+12>: movq   -0x8(%rbp), %rax
0x104731ea0 <+16>: movq   %rax, %rcx
0x104731ea3 <+19>: addq   $0x4, %rcx
0x104731eaa <+26>: movq   %rcx, %rdi
0x104731ead <+29>: movl   $0xff, %esi
0x104731eb2 <+34>: movq   %rax, -0x10(%rbp)
0x104731eb6 <+38>: callq  0x104731ed0               ; A::A at ViewController.mm:24
0x104731ebb <+43>: movq   -0x10(%rbp), %rax
0x104731ebf <+47>: movl   $0x1, (%rax)
0x104731ec5 <+53>: addq   $0x10, %rsp
0x104731ec9 <+57>: popq   %rbp
0x104731eca <+58>: retq 

In addition, the compiler processes the initialization list in a particular order: the order in which the member variables are declared in the class, not the order written in the list. For example, even if the constructor above were declared as C() : a(255), c(1) {}, the compiler would still initialize c first and then a, because the members are declared in the order c then a. Finally, the compiler inserts the operations specified in the initialization list before all the code in the constructor body, so the initialization-list operations are guaranteed to run first.
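
A classic pitfall that follows from this rule (my own sketch, not from the book): when one member's initializer reads another member, it is the declaration order, not the list order, that decides whether the read is safe.

// Hypothetical sketch: initialization follows declaration order, not list order.
class Pair {
public:
    int first;    // declared first  -> initialized first
    int second;   // declared second -> initialized second
    
    // The list below is written "second, first", but first is still initialized before second
    // (the compiler typically warns about the reordering). Writing first(second + 1) here
    // would therefore read an uninitialized second.
    Pair(int n) : second(n), first(n + 1) {}
};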

Finally, for the built-in member c of class C there is no performance difference between initializing it in the initialization list and assigning it in the constructor body, so there is no need to go as far as writing the constructor as C() : a(255), c(1) {}; the difference in how the compiler handles the two forms is hard to perceive for such members, and it is a good habit during development to do only what is actually necessary.

NOTE: C++ compilers always like to do things quietly.

2.5 Summary

This chapter analyzed some details of how the C++ compiler handles constructors. These details can also be applied in concrete business development; for example, based on the principle behind NRV optimization, avoid passing objects by value wherever possible, which is a very useful takeaway. C++ compilers are inherently diverse, and if developers make these optimizations at the business-code level, they do not need to depend on a particular platform's C++ compiler optimizing the same mechanisms for them.

Chapter 3: Data semantics

This chapter covers how C++ member variables and static member variables are implemented: the object data layout, where a class's static member variables are stored, how member variable access is implemented, how the virtual mechanism affects the object memory layout, and so on. The debugging in chapter one has in fact already revealed much of this ahead of time; this chapter digs into more of the details.

3.1 Binding of member variables

Look at the following routine. On sufficiently old compilers (so old they are almost impossible to find now), the x in the body of inline float getX() would be bound to the extern float x, leading the compiler astray: at the point where the inline function float getX() is defined, the float x member variable has not yet been declared. Later compilers avoid this mistake by specifying that variable binding for inline function bodies happens only after the whole class declaration is complete.

extern float x;

class Point3d {
public:
    Point3d(float, float, float);
    float getX() const { return x; }
private:
    float x, y, z;
};

To avoid this error, older C++ code introduced two defensive rules, and they persist today (perhaps for readability and consistency) even though modern C++ compilers generally get this right:

  • Put all member variable declarations at the beginning of the class, to make sure they bind correctly;
  • Put all inline function definitions outside the class declaration;

That is, the transformation is as follows:

extern float x;

class Point3d {
private:
    float x, y, z;
public:
    Point3d(float, float, float);
    float getX() const;
};

inline float Point3d::getX() const { return x; }

3.2 Object Layout

In fact, some conclusions can be drawn about C++ object layout:

  • The basic layout of a simple object's members is essentially the same as a C struct, following the byte-alignment rules;
  • When a subclass inherits a parent class, the storage needed for the member variables the subclass defines is appended directly after the parent class's object layout;
  • For objects involving the virtual mechanism, an additional 8 bytes are needed to store the address of the virtual table.

So a C++ object contains up to the five components listed below, although whether the fifth appears depends on the compiler. Some compilers keep the storage corresponding to a virtually inherited base class separate from the storage of the class that virtually inherits it and add a pointer to it inside the latter's storage; objects from such compilers contain the fifth component. Other compilers, such as Clang, merge the storage of the virtually inherited base class and the inheriting class into one contiguous block, so their objects do not contain the fifth component.

  • Memory for the member variables defined by the class itself;
  • Memory for the member variables defined by the classes on its inheritance chain;
  • The additional Padding required for byte alignment;
  • Memory for the vptr;
  • Memory introduced to implement virtual inheritance (such as a pointer to the virtually inherited base class's region);

When a member variable is declared static, its memory address (relative address) is determined at compile time. After the program is loaded into memory, it occupies a fixed memory location in the static section of the data segment.

NOTE: At compile time a static variable may be recorded in either the static area of the data segment or the BSS segment, but once the program is loaded and running, both end up in the static area of the data segment. A static or global variable that is initialized at its declaration is stored in the binary's static data area at compile time and directly increases the binary size: for example, static int s_arr[0x10000000] = {100, 101, 102}; grows the compiled product by (2^28 * 4) B = 1 GB. A static or global variable that is not initialized at its declaration is recorded in the binary's BSS segment at compile time, which only records basic information about the variable, so the binary grows very little: declaring static int s_arr[0x10000000]; leaves the product size almost unchanged. At runtime, the binary's static data area is mapped directly into virtual memory, while the BSS segment's variables are formally given virtual memory in the static area only at load/run time.
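
The two cases in the NOTE, written out side by side (hypothetical variable names, same array size as above):

// Sketch of the two storage cases described in the NOTE.
static int s_arr_data[0x10000000] = {100, 101, 102}; // initialized at declaration: placed in the data
                                                      // segment, inflating the binary by about 1 GB
static int s_arr_bss[0x10000000];                     // not initialized at declaration: recorded in the
                                                      // BSS segment, binary size barely changes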

This raises the question of how the compiler translates a line of code that accesses an object's member variable. In the following routine, for example, there are two ways to access a member variable: directly through the object, and through a pointer (the two differ in how they handle polymorphism and in the indirection of the access).

class Point3d {
public:
    int x = 100, y = 200, z = 300;
};

void main() {
    Point3d pt;
    int x1 = pt.x;
    
    Point3d *ptPtr = &pt;
    int x2 = ptPtr->x;
}

The corresponding assembly code is as follows

0x100f13e30 <+0>:  pushq  %rbp
0x100f13e31 <+1>:  movq   %rsp, %rbp
0x100f13e34 <+4>:  subq   $0x30, %rsp
0x100f13e38 <+8>:  leaq   -0x10(%rbp), %rdi
0x100f13e3c <+12>: callq  0x100f13e60               ; Point3d::Point3d at ViewController.mm:10
0x100f13e41 <+17>: movl   -0x10(%rbp), %eax         ; int x1 = pt.x starts here
0x100f13e44 <+20>: movl   %eax, -0x14(%rbp)
0x100f13e47 <+23>: leaq   -0x10(%rbp), %rcx
0x100f13e4b <+27>: movq   %rcx, -0x20(%rbp)
0x100f13e4f <+31>: movq   -0x20(%rbp), %rcx         ; int x2 = ptPtr->x starts here
0x100f13e53 <+35>: movl   (%rcx), %eax
0x100f13e55 <+37>: movl   %eax, -0x24(%rbp)
0x100f13e58 <+40>: addq   $0x30, %rsp
0x100f13e5c <+44>: popq   %rbp
0x100f13e5d <+45>: retq      

When the compiler compiles pt.x, it already knows from the earlier declaration that pt is of type Point3d and that its storage is the 12 bytes (sizeof(Point3d)) from offset -0x10 to -0x5 of the current frame base (the %rbp register in the assembly above). It also knows that within the object, offset 0 stores x, offset 4 stores y, and offset 8 stores z. So pt.x is translated into an access to the 4 bytes (sizeof(int)) at offset 0 from the object's address.

For ptPtr->x, the compiler first reads the pointer value saved on the stack (the address ptPtr holds) into a register, and then accesses the data at offset 0 from that address. In other words, ptPtr->x has one more layer of "read the pointer, then access the address" indirection than pt.x, but both end up accessing the same data.
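
Expressed as a C-style sketch (my own illustration of the translation, not actual compiler output, reusing the Point3d above):

// Hypothetical sketch of how the two accesses are resolved; the offsets are what the
// compiler computes at compile time (x is at offset 0 within Point3d).
void accessSketch(Point3d &pt, Point3d *ptPtr) {
    int x1 = *(int *)((char *)&pt + 0);     // pt.x: a fixed frame address plus the member offset
    int x2 = *(int *)((char *)ptPtr + 0);   // ptPtr->x: first read the pointer's value, then add the offset
    (void)x1; (void)x2;
}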

Think: the compiler takes a pile of program metadata, such as types, function addresses, and literals, and translates it into machine language that shuffles memory and does arithmetic back and forth, which is exactly what the high-level code you write ends up doing.

So how does C++ access a class's static member variables? Below are the debug main function and the assembly instructions for its two assignment lines. Because a class's static member variables are implemented the same way as ordinary static variables, plain static variables are debugged directly here.

static int s_x = 100;
void main() {
    int x = s_x;
    int y = s_x;
}
0x101e99ea9 <+41>: movl   0x780d(%rip), %ecx        ; int x = s_x;
0x101e99eaf <+47>: movl   %ecx, -0x14(%rbp)
0x101e99eb2 <+50>: movl   0x7804(%rip), %ecx        ; int y = s_x;
0x101e99eb8 <+56>: movl   %ecx, -0x18(%rbp)

You can see that access to a static variable (including a class's static member variables) or a global variable is addressed relative to the instruction pointer. Taking the first instruction as an example, the compiler already knows when generating it that, at run time, the static variable's address will differ from the value of the instruction pointer at that point by exactly 0x780d. Because the addressing goes through the rip register, which tracks the instruction being executed and therefore changes from instruction to instruction while the static variable's address stays fixed, the rip-relative offsets used by two different instructions accessing the same static variable are generally different.

3.3 Inheritance

Clang's implementation of the virtual mechanism is not exactly as described in the book; the details differ, but the two are basically the same. For convenience, this chapter follows the Clang implementation directly, since it is explored mainly through debugging combined with the book's content.

Inheritance without polymorphism is simply layout concatenation, as shown earlier, and needs no further elaboration. Introducing polymorphism, that is, the virtual mechanism, brings extra time and space costs to a program, mainly the following:

  • A virtual table (space) must be introduced to store the addresses of the virtual functions associated with the type;
  • Every object of a type associated with a virtual table needs an extra 8 bytes of memory to store the vptr;
  • The constructor must be extended to initialize the object's vptr correctly;
  • The destructor must be extended so that it can correctly release the memory associated with the vptr (the memory occupied by the vptr pointer itself);

So how does the vptr affect the memory layout of objects? From the earlier content we can already summarize a first conclusion:

Conclusion 1: without multiple inheritance, the vptr is always stored in the first 8 bytes of the object.

3.3.1 Multiple Inheritance

Simple debugging with the following routine:

class A {
public:
    int a = 0x1;
    virtual void vfun1() {};
};

class B {
public:
    int b = 0x2;
    virtual void vfun2() {};
};

class C : public A, public B {
public:
    int c = 0x3;
    
    void vfun1() {
        printf("c vfunc1 imp\n");
    }
    
    void vfun2() {
        printf("c vfunc2 imp\n");
    }
};

void main() {
    A a;
    C c;
}

After compiling and running: po sizeof(a) is 16 bytes, and x/16b &a prints object a's memory, in which the vptr can be seen. That is, once a class declares a virtual function, its objects gain an extra 8 bytes of memory, with the vptr held in bytes 0-7 and the member variable layout immediately after. Continuing, po sizeof(c) is 32 bytes, and x/32b &c prints object c's memory, where two vptrs can be found! The member variables defined by class C itself are laid out immediately after class A (which declares a virtual function) and class B (which also declares a virtual function). Hence the second conclusion:

Conclusion 2: under multiple inheritance, the region of each base class within an object contains that base class's vptr (if needed) and the layout of that base class's member variables.

The operation logs are as follows:

(lldb) po sizeof(a)
16

(lldb) x/16b &a
0x7ffee5e04060: 0x40 0xc0 0xdf 0x09 0x01 0x00 0x00 0x00
0x7ffee5e04068: 0x01 0x00 0x00 0x00 0x00 0x00 0x00 0x00

(lldb) po sizeof(c)
32

(lldb) x/32b &c
0x7ffee5e04040: 0x68 0xc0 0xdf 0x09 0x01 0x00 0x00 0x00
0x7ffee5e04048: 0x01 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x7ffee5e04050: 0x88 0xc0 0xdf 0x09 0x01 0x00 0x00 0x00
0x7ffee5e04058: 0x02 0x00 0x00 0x00 0x03 0x00 0x00 0x00

NOTE: In the book the vptr is usually placed after the member variable layout; the stated reasons are that this preserves the member layout rules from before polymorphism was introduced and stays compatible with C (for example, deriving a polymorphic class from a C struct). But C++ compilers can handle all of these cases: Clang, which puts the vptr first, is also compatible with struct-derived polymorphism, although it seems to introduce more indirection. I will not expand on that here; if you are interested you can explore it with the same debugging approach.

The code above declares class C : public A, public B. Does the order of the inheritance list matter? Changing it to class C : public B, public A, the layout flips accordingly: B's region is laid out first and A's second.

(lldb) po sizeof(c)
32

(lldb) x/32b &c
0x7ffee2d580a0: 0x90 0x80 0xea 0x0c 0x01 0x00 0x00 0x00
0x7ffee2d580a8: 0x02 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x7ffee2d580b0: 0xb0 0x80 0xea 0x0c 0x01 0x00 0x00 0x00
0x7ffee2d580b8: 0x01 0x00 0x00 0x00 0x03 0x00 0x00 0x00

So, does the Clang compiler always lay the base classes out strictly in inheritance-list order? It does not. Going back to the initial debug code (class C : public A, public B) and commenting out the virtual void vfun1() {} declared in A: now object A no longer needs to store a vptr, but object B does. In this case the Clang compiler lays out B first and then A. The debug logs:

(lldb) po sizeof(c)
24

(lldb) x/24b &c
0x7ffeed0ae0b0: 0x60 0x20 0xb5 0x02 0x01 0x00 0x00 0x00
0x7ffeed0ae0b8: 0x02 0x00 0x00 0x00 0x01 0x00 0x00 0x00
0x7ffeed0ae0c0: 0x03 0x00 0x00 0x00 0x00 0x00 0x00 0x00

Then does the Clang compiler lay out all the base classes that carry a vptr first and only afterwards the ones without a vptr? Also no. Still with A's virtual void vfun1() {} commented out, add the class D defined below and change the inheritance to class C : public A, public B, public D. In this case Clang lays out B first, then goes back to A, then D. So the Clang compiler merely ensures that the first eight bytes of the object hold a vptr whenever one is needed.

class D {
public:
    int d = 0x4;
    virtual void vfun3() {};
};

0x7ffee8bfd0a8: 0x68 0x30 0x00 0x07 0x01 0x00 0x00 0x00
0x7ffee8bfd0b0: 0x02 0x00 0x00 0x00 0x01 0x00 0x00 0x00
0x7ffee8bfd0b8: 0x80 0x30 0x00 0x07 0x01 0x00 0x00 0x00
0x7ffee8bfd0c0: 0x04 0x00 0x00 0x00 0x03 0x00 0x00 0x00

To sum up, the third conclusion is drawn:

Conclusion 3: under multiple inheritance, the compiler picks the first base class in the inheritance list that carries a vptr and lays it out first, then lays out the remaining base classes in inheritance-list order.

3.3.2 Virtual Inheritance

Virtual inheritance solves a problem of multiple-inheritance hierarchies: when two of a class's inheritance chains intersect at some upper class, the class's objects would otherwise contain two or more memory regions for that intersecting class's layout. Through virtual inheritance, an object keeps only one region for the intersecting class's layout (or a pointer to it), which avoids both the memory redundancy in the object layout and the risk of ambiguity when the object accesses the intersecting class's member variables.
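
A minimal diamond sketch of the problem and the fix (my own example with hypothetical class names, separate from the debugging code below): without virtual inheritance the bottom object holds two copies of the top class and member access is ambiguous; with virtual inheritance there is a single shared copy.

// Hypothetical diamond sketch contrasting plain and virtual inheritance.
class Top     { public: int t = 0; };

class Left    : public Top {};
class Right   : public Top {};
class Bottom  : public Left, public Right {};        // contains two separate Top regions

class LeftV   : public virtual Top {};
class RightV  : public virtual Top {};
class BottomV : public LeftV, public RightV {};      // shares exactly one Top region

void diamondDemo() {
    BottomV bv;
    bv.t = 1;          // unambiguous: only one t exists
    
    Bottom b;
    // b.t = 1;        // error: ambiguous (Left::t or Right::t?)
    b.Left::t = 1;     // must name the path explicitly
}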

Think: the C++ multiple-inheritance system really is a mess. Not that the compiler implementations are the problem; rather, it hands developers a weapon for building messy inheritance hierarchies! Also, it is the base class that, by being inherited virtually, spares the derived class the ambiguity in accessing base-class member variables, which feels as if the upper class has to know about the details of the classes below it.

As mentioned earlier, the Clang compiler's object layout merges the storage of a virtually inherited base class and the class that virtually inherits it into one contiguous block. The following code explores Clang's layout strategy for virtual inheritance; it defines a deliberately complicated inheritance structure so the debugging can capture more processing details (note that this structure is not meaningful: accessing c.x, for example, produces the error "non-static member 'x' found in multiple base-class subobjects of type 'Base'", a naming conflict where the compiler cannot tell which x the code is referring to).

class Base {
public: int x = 0xff;
};

class A : public virtual Base {
public: int a = 0x1;
};

class B : public Base {
public: int b = 0x2;
};

class D : virtual public B {
public:
    int d = 0x4;
};

class E : virtual public Base {
public: int e = 0x5;
};

class F : virtual public E {
public: int f = 0x6;
};

class G : virtual Base {
public: int g = 0x7;
};

class H : public G {
public: int h = 0x8;
};

class C : public A, virtual public D, public F, virtual public H {
public:
    int c = 0x3;
};

void main() {
    C c;
    int x = c.g;
}

C's direct base classes A, D, F, and H are processed one by one according to conclusion 3; memory alignment is ignored for now.

Let’s start with A. The Clang compiler handles virtual inheritance as follows: If the index of the layout space of the current class is 0, the index of the common inherited base class is positive above the memory area of index 0 (memory address is negative offset), and the index of the virtual inherited base class is negative below the memory area of index 0 (memory address is positive offset). Because A directly introduces virtual inheritance, you need to specify VPTR for A. Therefore, the process derived from the A object layout in the above note is:

// A{A-vptr, 0x1}(0), 
// Base{0xff}(-)

Next, D. Within the layout, the region corresponding to B is simple: ordinary inheritance stacks it as 0xff, 0x2. Note, then, that because D virtually inherits B, the whole memory block corresponding to B is marked negative and placed below index 0.

// D{D-vptr, 0x4}(0), 
// {Base{0xff}(+), B{0x2}}(-)

Now let’s look at F. F’s virtual inheritance Base class E and A have the same layout, regardless of the alignment is E{e-vptr, 0x5}(+), Base{0xFF}(-), and the whole block of E is marked negative.

// F{F-vptr, 0x6}(0),
// {E{E-vptr, 0x5}, Base{0xff}(-)}(-)

Then look at H. Its base class G has the same layout as A: ignoring alignment, G{G-vptr, 0x7}(0), Base{0xff}(-). H inherits G normally, so the whole G block is marked positive, and the negatively marked Base{0xff}(-) inside it is moved below index 0.

// 1. Determine the relative position
// {G{G-vptr, 0x7}, Base{0xff}(-)}(+),
// H{0x8}(0)

// 2. Negative index block adjusts position
// G{G-vptr, 0x7}(+),
// H{0x8}(0),
// Base{0xff}(-)

With the layouts of C's four base classes worked out, the layout of a C object can now be assembled, following these rules:

  • Process each base class one by one in the order of C's inheritance list;
  • A normally inherited base class's layout is inserted directly above, and adjacent to, the index-0 region and is marked positive (+) as a whole; any negatively indexed (-) sub-blocks it contains are moved to the end of the layout;
  • Virtual inheritance requires a virtual table: if the class virtually inherits a base class but the layout does not yet contain a vptr of its own, a vptr is inserted into the first 8 bytes of the object layout;
  • A virtually inherited base class's layout is inserted at the end of the layout; during this, every negatively indexed sub-block must be checked, and if the same sub-block already exists in the layout and is also marked negative, the new one is considered redundant and removed, keeping only the old sub-block already in the layout;
  • After all base class layouts are inserted, the memory is finally aligned;
// 1. Set index C to 0
// C{0x3}(0)

// 2. Inherit A and move the negative index below 0
// A{A-vptr, 0x1}(+) <-
// C{0x3}(0)
// Base{0xff}(-) <-

// 3. Handle virtual inheritance D
// A{A-vptr, 0x1}(+)
// C{0x3}(0)
// Base{0xff}(-)
// {D{D-vptr, 0x4}(0), <-
// {Base{0xff}(+), B{0x2}}(-)}(-) <-

// 4. Handle ordinary inheritance F
// A{A-vptr, 0x1}(+)
// F{F-vptr, 0x6}(+), <- note: inserted directly above and adjacent to index 0
// C{0x3}(0)
// Base{0xff}(-)
// {D{D-vptr, 0x4}(0),           
// {Base{0xff}(+), B{0x2}}(-)}(-)
// E{E-vptr, 0x5}(-)            <- note: the overlapping Base{0xff}(-) is removed here

// 5. Handle virtual inheritance H
// A{A-vptr, 0x1}(+)
// F{F-vptr, 0x6}(+),                 
// C{0x3}(0)
// Base{0xff}(-)
// {D{D-vptr, 0x4},           
// {Base{0xff}(+), B{0x2}}(-)}(-)
// E{E-vptr, 0x5}(-)   
// {G{G-vptr, 0x7}(+), <-
// H{0x8}}(-), <-
// Base{0xff}(-) <- removed, since it duplicates the Base{0xff}(-) already in the layout

// 6. Final memory alignment
// A{A-vptr, 
// 0x1}(+)
// F{F-vptr, 
// 0x6}(+),                 
// C{0x3}(0), Base{0xff}(-)
// {D{D-vptr, 
// 0x4}, {Base{0xff}(+), 
// B{0x2}}(-)}(-)
// E{E-vptr, 
// 0x5}(-)   
// {G{G-vptr, 
// 0x7}(+), H{0x8}}(-)
Copy the code

Operation logs are as follows.

(lldb) po sizeof(c)
96

(lldb) x/96b &c
0x7ffee720f068: 0xf8 0x10 0x9f 0x08 0x01 0x00 0x00 0x00
0x7ffee720f070: 0x01 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x7ffee720f078: 0x18 0x11 0x9f 0x08 0x01 0x00 0x00 0x00
0x7ffee720f080: 0x06 0x00 0x00 0x00 0x03 0x00 0x00 0x00
0x7ffee720f088: 0xff 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x7ffee720f090: 0x30 0x11 0x9f 0x08 0x01 0x00 0x00 0x00
0x7ffee720f098: 0x04 0x00 0x00 0x00 0xff 0x00 0x00 0x00
0x7ffee720f0a0: 0x02 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x7ffee720f0a8: 0x48 0x11 0x9f 0x08 0x01 0x00 0x00 0x00
0x7ffee720f0b0: 0x05 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x7ffee720f0b8: 0x60 0x11 0x9f 0x08 0x01 0x00 0x00 0x00
0x7ffee720f0c0: 0x07 0x00 0x00 0x00 0x08 0x00 0x00 0x00
Copy the code

Next, let’s see how c.g in main accesses the member variable; the assembly of the main function is shown below. It turns out that the Clang compiler stores the offsets of member variables within the object in the virtual table pointed to by the vptr in the first 8 bytes of the object. The mapping between a member variable name and the corresponding slot index in the virtual table is determined by Clang at compile time; at run time the target member variable is reached via: object address -> virtual table address -> slot holding the offset of the target member variable -> offset of the target member variable within the object.

0x10aa97b60 <+0>:  pushq  %rbp
0x10aa97b61 <+1>:  movq   %rsp, %rbp
0x10aa97b64 <+4>:  subq   $0x70, %rsp
0x10aa97b68 <+8>:  leaq   -0x60(%rbp), %rdi
0x10aa97b6c <+12>: callq  0x10aa97ba0               ; C::C at ViewController.mm:80
0x10aa97b71 <+17>: movq   -0x60(%rbp), %rax         ; locate the virtual table address and save it in the rax register
0x10aa97b75 <+21>: movq   -0x38(%rax), %rax         ; locate the offset of c.g within the c instance; note that this
                                                    ; offset is 8 bytes smaller than the actual offset, because the
                                                    ; first 8 bytes occupied by the vptr are excluded
0x10aa97b79 <+25>: movl   -0x58(%rbp,%rax), %ecx    ; means ($rbp + $rax) - 0x58; rewritten in an easier form:
                                                    ; ($rbp + $rax + 8) - 0x60, i.e. ($rbp - 0x60) + ($rax + 8),
                                                    ; which finally means "object address + the true offset of the
                                                    ; member variable c.g within the object"
0x10eb50b9d <+29>: movl   %ecx, -0x64(%rbp)
0x10eb50ba0 <+32>: addq   $0x70, %rsp
0x10eb50ba4 <+36>: popq   %rbp
0x10eb50ba5 <+37>: retq   
Copy the code
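To make the addressing steps concrete, here is a rough C++-level replay of what those instructions compute. It is purely illustrative: objAddr and x2 are made-up names, and the -0x38 slot offset and the +8 member adjustment are read off this particular build rather than being fixed ABI constants.

// Hypothetical replay of the assembly above; assumes the earlier "C c;" is in scope.
// x2 should end up equal to c.g (0x7).
char *objAddr = reinterpret_cast<char *>(&c);                // object address
char *vtbl    = *reinterpret_cast<char **>(objAddr);         // first 8 bytes of c: the vptr
long  off     = *reinterpret_cast<long *>(vtbl - 0x38);      // the slot read by "movq -0x38(%rax), %rax"
int   x2      = *reinterpret_cast<int *>(objAddr + off + 8); // +8 skips that sub-block's own vptr, landing on g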

NOTE: for classes that do not involve the virtual mechanism, member variable offsets are generated directly at compile time, without the indirection of virtual table access, so both compilation efficiency and execution efficiency are bound to be higher.

3.4 summary

After introducing inheritance, the book spends two more sections on the differences in member variable access efficiency in various scenarios, but the principles those conclusions rest on have already been covered above, so they are not repeated here.

However complex the memory layout mechanism, the compiler needs it to produce a deterministic result every time it lays out an object of a class, and to make conversions between base class objects and derived class objects easy. Under that premise, the compiler can pin down the exact position of a member variable in the object’s memory from static information such as the member variable’s name and type, whether the object involves the virtual mechanism, and the address of the object’s virtual table.

After a virtual table is introduced into an object, member variable offsets have to account for the space the vptr occupies in the object. If the exact location of a member variable had to be recomputed from the object type’s inheritance structure on every access, runtime efficiency would be very poor. The compiler therefore stores the in-object offsets of such member variables in the virtual table, and the target member variable is accessed via: object address -> virtual table address -> slot holding the offset of the target member variable -> offset of the target member variable within the object.

Chapter 4: Functional semantics

This chapter describes the C++ compiler’s handling of member functions in detail.

4.1 How to call member functions

This section describes how virtual and non-virtual member functions are called, i.e. how they are used, while temporarily ignoring the underlying implementation; the underlying implementation is explored in 4.2 and 4.3 by debugging the assembly instructions.

4.1.1 Ordinary member functions

One of the design principles of C++ is that non-static member functions have the same call efficiency as normal functions. Use the following code as an example:

class A {
public:
    int a = 0x1;
    void func() {
        a = 0;
    };
};

void mainn() {
    A a;
    a.func();
}
Copy the code

To follow the above design guidelines, the C++ compiler needs to do the following for member functions:

  • First, the biggest difference between a member function and a normal function is support for the this pointer: the compiler converts the member function signature into void func(A *this). Where does this come from? It is the address of the a object in the calling code a.func();
  • Second, the body of a member function can access member variables directly by name. The reason is that the compiler rewrites the function body so that member access goes through this, i.e. a = 0 becomes this->a = 0;
  • Then the function needs to be renamed to avoid name clashes caused by different classes declaring functions of the same name, by overloading, and so on. This process is called mangling; typically the class name (and parameter types) are folded into the function name, so func becomes a mangled name like func__A_v;
  • Finally, the calling code is transformed accordingly, i.e. a.func() becomes func__A_v(&a);

At this point, every member function has essentially been converted into a normal function, so there is virtually no difference in call efficiency.

The main difference in how static and non-static member functions are handled lies in the mangling. For example, if a static member function static int sfunc() { return 0; } is declared in A, its name is converted to something like sfunc__A_SF_v, where SF indicates that it is a static member function.

Static member functions are often used to access or manipulate static member variables, and since static member variables are global in nature, their names must also be mangled to prevent conflicts. In fact, member variables, static or not, are also mangled by the C++ compiler.
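A side note not from the book: the names above follow the older cfront-style convention; a present-day Itanium-ABI compiler such as clang mangles the same functions differently. For the A::func() defined earlier, for instance, the emitted symbol looks like the following and can be decoded with c++filt:

// Itanium-ABI mangling (clang/gcc) of the member function above:
//   void A::func()   ->   _ZN1A4funcEv
//
// $ echo _ZN1A4funcEv | c++filt
// A::func()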

4.1.2 Virtual member functions

Declare func as virtual and continue to see how the C++ compiler handles virtual member functions.

class A {
public:
    int a = 0x1;
    virtual void func() {
        a = 0;
    };
};

void mainn() {
    A a;
    A *aptr = &a;
    
    a.func();
    aptr->func();
}
Copy the code

The handling described in the previous section (4.1.1) is still needed; what deserves attention is the transformation of the calling code. Start with a.func(): since the static type of the a object is known, it can simply be converted to func__A_v(&a). The aptr->func() case has to take polymorphism into account: first obtain the object’s vptr, then fetch the function pointer for the virtual function from the virtual table it points to, and finally invoke that function pointer, roughly (*(aptr->vptr[1]))(aptr), assuming the func function pointer sits in slot 1 of the virtual table. As you can see, a member function call that has to account for the virtual mechanism is noticeably less efficient than an ordinary member function call.
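Putting the two call sites side by side, a book-style sketch of the transformations just described looks roughly like this (conceptual pseudocode, not literal clang output; the slot index 1 is only an assumption based on declaration order):

A a;
A *aptr = &a;

a.func();       // static type known      =>  func__A_v(&a);
aptr->func();   // polymorphism possible  =>  (*(aptr->vptr[1]))(aptr);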

4.2 Details of virtual functions

Since the Clang compiler implements the virtual mechanism completely differently from what the book describes, that part of the book is skipped. Instead, let’s briefly debug the virtual function call mechanism under the Clang compiler.

class A {
public:
    int a = 0x1;
    virtual void vfunc1() {};
    virtual void vfunc2() {};
};

class B : public A {
public:
    int b = 0x2;
    virtual void vfunc1() {};
    virtual void vfunc2() {};
};

void main() {
    A a;
    B b;
    
    A *aptr = &a;
    A *bptr = &b;
    
    aptr->vfunc2();
    bptr->vfunc2();
}
Copy the code

The corresponding complete assembly code is shown below; breakpoints are set at 0x107e31db1 <+65>: callq *0x8(%rcx) and at 0x107e31da7 <+55>: movq -0x30(%rbp), %rax. The first callq *0x8(%rcx) instruction fetches the function pointer saved at positive offset 8 of A’s virtual table and calls it; that pointer points to vfunc2 of type A. Similarly, the second callq *0x8(%rcx) fetches the function pointer saved at positive offset 8 of B’s virtual table and calls it; that pointer points to vfunc2 of type B.

0x1097d8d90 <+0>:  pushq  %rbp
0x1097d8d91 <+1>:  movq   %rsp, %rbp
0x1097d8d94 <+4>:  subq   $0x30, %rsp
0x1097d8d98 <+8>:  leaq   -0x10(%rbp), %rdi
0x1097d8d9c <+12>: callq  0x1097d8de0               ; A::A at ViewController.mm:19
0x1097d8da1 <+17>: leaq   -0x20(%rbp), %rdi
0x1097d8da5 <+21>: callq  0x1097d8e00               ; B::B at ViewController.mm:27
0x1097d8daa <+26>: leaq   -0x10(%rbp), %rax
0x1097d8dae <+30>: movq   %rax, -0x28(%rbp)         ; at this point -0x28(%rbp) is the address of object a
0x1097d8db2 <+34>: leaq   -0x20(%rbp), %rax
0x1097d8db6 <+38>: movq   %rax, -0x30(%rbp)         ; at this point -0x30(%rbp) is the address of object b
0x1097d8dba <+42>: movq   -0x28(%rbp), %rax
0x107e31d9e <+46>: movq   (%rax), %rcx
0x107e31da1 <+49>: movq   %rax, %rdi
0x107e31da4 <+52>: callq  *0x8(%rcx)                ; here rcx is the virtual table address of A
0x107e31da7 <+55>: movq   -0x30(%rbp), %rax
0x107e31dab <+59>: movq   (%rax), %rcx
0x107e31dae <+62>: movq   %rax, %rdi
0x107e31db1 <+65>: callq  *0x8(%rcx)                ; here rcx is the virtual table address of B
0x1097d8dd2 <+66>: addq   $0x30, %rsp
0x1097d8dd6 <+70>: popq   %rbp
0x1097d8dd7 <+71>: retq  
Copy the code
// breakpoint 1 debug log
(lldb) p/x $rcx + 0x08
(unsigned long) $5 = 0x000000010afb3040
(lldb) x/gx 0x000000010afb3040
0x10afb3040: 0x000000010afb0ea0
(lldb) po (void(*)(void))0x000000010afb0ea0
(CPPModelDemo`A::vfunc2() at ViewController.mm:25)

// breakpoint 2 debug log
(lldb) p/x $rcx + 0x08
(unsigned long) $10 = 0x000000010afb3070
(lldb) x/gx 0x000000010afb3070
0x10afb3070: 0x000000010afb0f10
(lldb) po (void(*)(void))0x000000010afb0f10
(CPPModelDemo`B::vfunc2() at ViewController.mm:34)
Copy the code

The scene still isn’t complicated enough. Change B to virtually inherit A (class B : virtual public A), and the assembly code immediately grows by about 10 lines. Going through the assembly shows that the logic for handling aptr->vfunc2 is unchanged: it is still a call to the function pointer saved at positive offset 8 of object a’s virtual table.

The new code essentially handles bptr->vfunc2. From the previous chapter we know that the memory of a b object contains two vptrs: the vptr that type B introduces because it virtually inherits type A (call it bvptr), and the vptr belonging to the inherited type A part (call it avptr). The code below first locates bvptr, then, via an offset of -0x18, obtains the offset of avptr within the layout of the b object and thereby locates avptr. The function pointer stored at the positive offset of the virtual table avptr points to is the vfunc2 to be called this time.

Then take a quick look at the information from the debugging process above:

  • Under the Clang compiler, an object’s vptr does not point to the starting address of the virtual table, but to the start of the list of function pointers for the virtual functions implemented by the current type. That is why, in the first debugging code of this section, vfunc2 is found 8 bytes from the address the vptr points to, and vfunc1 is at offset 0;
  • Function pointers in the virtual table are ordered by the declaration order of the virtual functions, not by implementation order: swapping the positions of vfunc1 and vfunc2 in B leaves the assembly essentially unchanged, while swapping vfunc1 and vfunc2 in A changes the offset (relative to the vptr) used to fetch the function pointer;
0x10be84d00 <+0>:   pushq  %rbp
0x10be84d01 <+1>:   movq   %rsp, %rbp
0x10be84d04 <+4>:   subq   $0x50, %rsp
0x10be84d08 <+8>:   leaq   -0x10(%rbp), %rdi         ; -0x10(%rbp) == &a
0x10be84d0c <+12>:  callq  0x10be84d80               ; A::A at ViewController.mm:19
0x10be84d11 <+17>:  leaq   -0x30(%rbp), %rdi         ; -0x30(%rbp) == &b
0x10be84d15 <+21>:  callq  0x10be84da0               ; B::B at ViewController.mm:28
0x10be84d1a <+26>:  xorl   %eax, %eax                ; the eax register is cleared
0x10be84d1c <+28>:  movl   %eax, %ecx                ; the ecx register is cleared
0x10be84d1e <+30>:  leaq   -0x10(%rbp), %rdx         
0x10be84d22 <+34>:  movq   %rdx, -0x38(%rbp)         ; -0x38(%rbp) == &(a->vptr)   // highlight
0x10be84d26 <+38>:  leaq   -0x30(%rbp), %rdx         ; %rdx == &b
0x10be84d2a <+42>:  cmpq   $0x0, %rdx
0x10be84d2e <+46>:  movq   %rcx, -0x48(%rbp)         ; %rcx == 0, -0x48(%rbp) == 0
0x10be84d32 <+50>:  je     0x10be84d4b               ; <+75> at ViewController.mm
0x10be84d38 <+56>:  movq   -0x30(%rbp), %rax         ; %rax == &(b->bvptr)
0x10be84d3c <+60>:  movq   -0x18(%rax), %rax         ; %rax == 0x10
0x10be84d40 <+64>:  leaq   -0x30(%rbp), %rcx         ; %rcx == &b
0x10be84d44 <+68>:  addq   %rax, %rcx                ; %rcx == &b + 0x10 == &(b->avptr)
0x10be84d47 <+71>:  movq   %rcx, -0x48(%rbp)         ; -0x48(%rbp) ==  &(b->avptr)
0x10be84d4b <+75>:  movq   -0x48(%rbp), %rax         ; %rax == &(b->avptr)
0x10be84d4f <+79>:  movq   %rax, -0x40(%rbp)         ; -0x40(%rbp) == &(b->avptr)   // highlight
0x10be84d53 <+83>:  movq   -0x38(%rbp), %rax
0x10be84d57 <+87>:  movq   (%rax), %rcx
0x10be84d5a <+90>:  movq   %rax, %rdi
0x10be84d5d <+93>:  callq  *0x8(%rcx)
0x10be84d60 <+96>:  movq   -0x40(%rbp), %rax
0x10be84d64 <+100>: movq   (%rax), %rcx
0x10be84d67 <+103>: movq   %rax, %rdi
0x10be84d6a <+106>: callq  *0x8(%rcx)
0x10be84d6d <+109>: addq   $0x50, %rsp
0x10be84d71 <+113>: popq   %rbp
0x10be84d72 <+114>: retq   
Copy the code

Use LLDB to see what the two callq *0x8(%rcx) instructions actually invoke. The first debug log below shows that A::vfunc2() is called. This is expected: both the static type aptr points to and the actual type of a are A itself, so A’s implementation is called directly.

(lldb) p/x $rcx
(unsigned long) $0 = 0x000000010423a038
(lldb) x/g 0x000000010423a040
0x10423a040: 0x0000000104237ea0
(lldb) po (void(*)(void))0x0000000104237ea0
(CPPModelDemo`A::vfunc2() at ViewController.mm:25)
Copy the code

Then look at the second one. Notice that the callq ends up in something called a virtual thunk; casually inspecting that memory region, the trail seems to go cold. According to the po output, though, this memory region is associated with B::vfunc2().

(lldb) p/x $rcx
(unsigned long) $2 = 0x000000010423a0a0
(lldb) x/g 0x000000010423a0a8
0x10423a0a8: 0x0000000104237f00
(lldb) po (void(*)(void))0x0000000104237f00
(CPPModelDemo`virtual thunk to B::vfunc2() at ViewController.mm)
(lldb) x/4g 0x0000000104237f00
0x104237f00: 0xf87d8948e5894855 0x48088b48f8458b48
0x104237f10: 0x8948c80148e0498b 0x90ffffffb1e95dc7
Copy the code

Going back to the assembly code above, notice the two lines marked “// highlight”. The first saves &(a->vptr): since the first 8 bytes of the a object hold its vptr, &a and &(a->vptr) are the same address, so the comment is fine. The second saves &(b->avptr): the first 8 bytes of the layout area corresponding to type A inside b hold the vptr corresponding to type A, so that comment is fine as well.

What matters is that both of these memory addresses are passed, via the rdi register, into the function subsequently invoked by callq; that is, they are passed as the member function’s A *this parameter. In other words, to the compiler, both A *aptr = &a and A *bptr = &b must be treated as pointers to an A object. Why? Because the compiler cannot sacrifice the common case for a special one:

  • First, converting the object bptr points to into a pointer to an A object is consistent with the semantics of the code;
  • Second, the C++ compiler must give priority to keeping calls to non-virtual member functions efficient. If bptr were not handled this way, then every time bptr called a non-virtual function defined by type A, the A subobject that bptr points to would first have to be located, which would lower the efficiency of non-virtual member function calls;

The operation above is what the compiler calls this adjustment. Back to the question of why callq invokes a virtual thunk rather than a plain function pointer. A thunk is essentially a small piece of assembly that the compiler encodes into platform-specific machine code and writes into a block of memory that callq can invoke directly. One of the jobs of the virtual thunk above is therefore to redirect the “this pointer that may have gone through this adjustment” (in the code above it points to the layout space of type A within the B object) back to the real object (in the code above, the B object).

From the above debugging process, we also found that in the virtual table of type B, in addition to the above virtual thunk corresponding to the “virtual function list of type A inherited by B”, there is A list of virtual functions of type B itself. Why not just use Virtual Thunk? The reason for the extra maintenance is to reduce the number of calls to virtual Thunk (or the execution of various adjustments).

For example, if we define a class C that ordinarily inherits from B and does not override vfunc2, then cptr->vfunc2(), after locating c->bvptr, can directly call the vfunc2 function pointer in the virtual function list belonging to type B itself, without going through vfunc2’s virtual thunk and the extra this adjustments that would entail.

class C : public B {};   // C inherits B normally and does not override vfunc2

C c;
B *cptr = &c;
cptr->vfunc2();
Copy the code

NOTE: In C++, all objects of a class have the same object layout, and the vptr of every such object points to the same virtual table. However, the vptr of a base class subobject inside the object layout does not necessarily point to that same table (for example, in the debugging code above, a’s vptr and b’s avptr point to different virtual tables).

That’s as far as we’ll go in exploring the Clang implementation of virtual functions; if multiple inheritance were thrown in as well, one can imagine how involved the ultimate “virtual function + virtual inheritance + multiple inheritance” scenario would get. One more point from the book: it is best not to declare non-static member variables in a virtually inherited base class, otherwise the cost of understanding the code rises sharply, for the compiler and for the maintainer alike. Finally, if you do have to debug the “virtual function + virtual inheritance + multiple inheritance” scenario, it is recommended to start from a classic virtual multiple inheritance structure like the one below and then adjust the code step by step while debugging.

class A {
public:
    int a = 0x1;
    virtual void vfunc() {};
};

class B : virtual public A {
public:
    int b = 0x2;
    virtual void vfunc() {};
};

class C : virtual public A {
public:
    int c = 0x3;
};

class D : public B, public C {
public:
    int d = 0x4;
};

void mainn() {
    D d;
    
    // Case 1: with the line below, dptr->vfunc() resolves to B::vfunc.
    // No this adjustment is needed, because the start address of B's layout space is exactly the start address of D's layout space.
    // The call goes through the function pointer reached via d->vptr: very efficient, though still slightly slower
    // than an ordinary member function call because of the indirection.
    B *dptr = &d;
    
    // Case 2: try commenting out vfunc in B; the this adjustment then takes more instructions,
    // because A's virtual table has to be located to find the vfunc implementation.
    
    // Case 3: using the line below instead, the this adjustment also takes more instructions.
    // C *dptr = &d;
    
    dptr->vfunc();
}
Copy the code

NOTE: thunk and this adjustment can be looked up in the Itanium C++ ABI and the clang::ThunkInfo Struct Reference. The Clang compiler for iOS should be built on the Itanium C++ ABI; the members of union VirtualAdjustment are either Itanium or Microsoft, and it is obviously not Microsoft here.

union VirtualAdjustment {
 struct {
   int64_t VCallOffsetOffset;
 } Itanium;

 struct {
   int32_t VtordispOffset;
   int32_t VBPtrOffset;
   int32_t VBOffsetOffset;
 } Microsoft;
 ...
} Virtual;
Copy the code

4.3 Pointers to member functions

When the operator ->* is used to invoke a “pointer to a virtual member function” through an object pointer, polymorphism also has to be considered; that is, which virtual function implementation gets called is determined at run time by the real type of the object the pointer refers to. The fptr0 function pointer below, for example, does not point to one fixed function; the runtime has to look at the type of the object on which it is invoked. In the following example, (aptr->*fptr0)() calls B’s vfunc2, and the decision of which function to call is made at run time, not at compile time.

class A {
public:
    int a = 0x1;
    virtual void vfunc1() {};
    virtual void vfunc2() {};
};

class B : virtual public A {
public:
    int b = 0x2;
    void vfunc1() {};
    void vfunc2() {};
};

void main() {
    void(A::*fptr0)(void) = &A::vfunc2;
    A *aptr = new B;
    (aptr->*fptr0)();
}
Copy the code

Take a look at the assembly of main below to see what data the Clang compiler establishes at compile time and how that data is used at run time to locate the right function. In the comments, the object built by new B is denoted b. You can focus directly on the highlighted key points:

  • Key point 1: in the scenario above, the compiler marks the value of the pointer to the virtual function A::vfunc2 (i.e. fptr0) as 0x9;
  • Key point 2: if the value of such a function pointer is odd, it points to a virtual function; otherwise it points to an ordinary function. This is one reason the compiler marks fptr0's value as 0x9;
  • Key point 3: subtracting 0x1 from a pointer to a virtual function yields the offset of that virtual function relative to the vptr of the this pointer after this adjustment (the vptr corresponding to the compile-time type of the object). Since vfunc2 is the second entry in A's virtual function list, the offset is 8 bytes. This is the second reason the compiler marks fptr0's value as 0x9; accordingly, for void(A::*fptr0)(void) = &A::vfunc1, fptr0's value would be 0x1;
  • Key point 4: if the function pointer points to a non-virtual function, its value is the function's address and can be used directly;

To sum up, the Clang compiler sets the value of the function pointer to the Virtual function to “the offset of the Virtual function in the Virtual table to which this pointer points after this Adjustment” + 1. The purpose of adding 1 is to convert it to an odd number, differentiating it from non-virtual function Pointers (which are necessarily even because they are memory addresses and 8-byte aligned).

0x1083b7cf0 <+0>:   pushq  %rbp
0x1083b7cf1 <+1>:   movq   %rsp, %rbp
0x1083b7cf4 <+4>:   subq   $0x40, %rsp
0x1083b7cf8 <+8>:   movq   $0x0, -0x8(%rbp)          ; -0x8(%rbp) == 0
0x1083b7d00 <+16>:  movq   $0x9, -0x10(%rbp)         ; (Key 1) -0x10(%rbp) == 9, corresponding code: fptr0 = &A::vfunc2
0x1083b7d08 <+24>:  movl   $0x20, %edi               ; heap-allocate 32 bytes (the size of a B instance)
0x1083b7d0d <+29>:  callq  0x1083b8446               ; symbol stub for: operator new(unsigned long)
0x1083b7d12 <+34>:  movq   %rax, %rdi                ; $rdi == &b
0x1083b7d15 <+37>:  movq   %rax, -0x20(%rbp)         ; -0x20(%rbp) == &b
0x1083b7d19 <+41>:  callq  0x1083b7db0               ; B::B at ViewController.mm:26
0x1083b7d1e <+46>:  xorl   %ecx, %ecx                ; clear the ecx register
0x1083b7d20 <+48>:  movl   %ecx, %eax                ; clear the eax register
0x1083b7d22 <+50>:  movq   -0x20(%rbp), %rdx         ; $rdx == &b
0x1083b7d26 <+54>:  cmpq   $0x0, %rdx                ; if the preceding new failed, the je below is taken
0x1083b7d2a <+58>:  movq   %rax, -0x28(%rbp)         ; the b object address is saved to -0x28(%rbp)
0x1083b7d2e <+62>:  je     0x1083b7d46               ; <+86> at ViewController.mm
0x1083b7d34 <+68>:  movq   -0x20(%rbp), %rax         ; $rax == &b
0x1083b7d38 <+72>:  movq   (%rax), %rcx              ; $rcx == b->vptr
0x1083b7d3b <+75>:  movq   -0x18(%rcx), %rcx         ; $rcx == (offset of b->avptr within b's object layout)
0x1083b7d3f <+79>:  addq   %rcx, %rax                ; $rax == &(b->avptr)
0x1083b7d42 <+82>:  movq   %rax, -0x28(%rbp)         ; -0x28(%rbp) == &(b->avptr)
0x1083b7d46 <+86>:  movq   -0x28(%rbp), %rax         ; $rax == &(b->avptr)
0x1083b7d4a <+90>:  movq   %rax, -0x18(%rbp)         ; -0x18(%rbp) == &(b->avptr)
0x1083b7d4e <+94>:  movq   -0x18(%rbp), %rax         ; $rax == &(b->avptr)
0x1083b7d52 <+98>:  movq   -0x10(%rbp), %rcx         ; $rcx == 0x9
0x1083b7d56 <+102>: movq   -0x8(%rbp), %rdx          ; $rdx == 0x0
0x1083b7d5a <+106>: addq   %rdx, %rax                ; $rax == &(b->avptr)
0x1083b7d5d <+109>: movq   %rcx, %rdx                ; $rdx == 0x9
0x1083b7d60 <+112>: andq   $0x1, %rdx                ; (Key 2) $rdx == 0x1
0x1083b7d67 <+119>: cmpq   $0x0, %rdx                ; since rdx != 0, the je below is not taken
0x1083b7d6b <+123>: movq   %rcx, -0x30(%rbp)         ; -0x30(%rbp) == 0x9
0x1083b7d6f <+127>: movq   %rax, -0x38(%rbp)         ; -0x38(%rbp) == &(b->avptr)
0x1083b7d73 <+131>: je     0x1083b7d98               ; <+168> at ViewController.mm
0x1083b7d79 <+137>: movq   -0x38(%rbp), %rax         ; $rax == &(b->avptr)
0x1083b7d7d <+141>: movq   (%rax), %rcx              ; $rcx == b->avptr
0x1083b7d80 <+144>: movq   -0x30(%rbp), %rdx         ; $rdx == 0x9
0x1083b7d84 <+148>: subq   $0x1, %rdx                ; (Key 3) $rdx == 0x9 - 0x1 = 0x8
0x1083b7d8b <+155>: movq   (%rcx,%rdx), %rcx         ; $rcx == (b->avptr + 0x8)
0x1083b7d8f <+159>: movq   %rcx, -0x40(%rbp)         ; -0x40(%rbp) == (b->avptr + 0x8)
0x1083b7d93 <+163>: jmp    0x1083b7da0               ; <+176> at ViewController.mm
0x1083b7d98 <+168>: movq   -0x30(%rbp), %rax         ; (Key 4) jump target of the je above, -0x30(%rbp) == fptr0
0x1083b7d9c <+172>: movq   %rax, -0x40(%rbp)         ; -0x40(%rbp) == fptr0
0x1083b7da0 <+176>: movq   -0x40(%rbp), %rax         ; the jmp above lands here, $rax == (b->avptr + 0x8)
0x1083b7da4 <+180>: movq   -0x38(%rbp), %rdi         ; $rdi == &(b->avptr), the this pointer
0x1083b7da8 <+184>: callq  *%rax                     ; call the function pointer held in $rax
0x1083b7daa <+186>: addq   $0x40, %rsp
0x1083b7dae <+190>: popq   %rbp
0x1083b7daf <+191>: retq  
Copy the code

If (aptr->*fptr0)(); is replaced with aptr->vfunc2();, the compiled code immediately shrinks to about 30 lines (from the roughly 50 lines above), so calling a virtual function through a member function pointer is certainly less efficient than calling it directly. It shouldn’t be much slower, though: just a few extra simple memory addressing operations.

If vfunc1 and vfunc2 are declared as non-virtual member functions instead, does the assembly for main shrink? No. Most of the code in main is what Clang needs in order to handle the member function pointer at all, regardless of whether it points to a virtual function. The biggest difference is that the instructions for fptr0 = &A::vfunc2 become leaq 0xc1(%rip), %rax and movq %rax, -0x10(%rbp): the compiler knows at compile time that A::vfunc2 is non-virtual, so it stores the address of A::vfunc2 directly.

class A {
public:
    int a = 0x1;
    void vfunc1() {};
    void vfunc2() {};
};

class B : virtual public A {
public:
    int b = 0x2;
};

void main() {
    void(A::*fptr0)(void) = &A::vfunc2;
    A *aptr = new B;
    (aptr->*fptr0)();
}
Copy the code

NOTE: introducing multiple inheritance should not increase the instruction complexity of main either, because however complex the inheritance structure is, the compiler has already dealt with that complexity at compile time and turned it into static, addressable data for the assembly code; at run time the data only needs to be addressed and fetched.
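Before moving on, here is a small experiment, not from the book, that makes the odd/even encoding described above directly observable; the class and function names are made up, and the expected values assume the Itanium C++ ABI on a 64-bit target:

#include <cstdio>
#include <cstring>

struct PMFDemo {
    virtual void vfunc1() {}
    virtual void vfunc2() {}
    void nfunc() {}
};

static void dump(void (PMFDemo::*pmf)()) {
    unsigned long raw[2] = {0, 0};            // Itanium pointer-to-member layout: { ptr, this-adjustment }
    static_assert(sizeof(pmf) <= sizeof(raw), "unexpected pointer-to-member size");
    std::memcpy(raw, &pmf, sizeof(pmf));
    std::printf("ptr = 0x%lx, adj = 0x%lx\n", raw[0], raw[1]);
}

int main() {
    dump(&PMFDemo::vfunc1);   // expect ptr == 0x1 (vtable offset 0, plus 1)
    dump(&PMFDemo::vfunc2);   // expect ptr == 0x9 (vtable offset 8, plus 1)
    dump(&PMFDemo::nfunc);    // expect ptr == the (even) address of PMFDemo::nfunc
    return 0;
}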

4.4 summary

This section debugs how the Clang compiler handles code that calls member functions and Pointers to member functions. The efficiency of ordinary member functions is equivalent to that of non-member functions. The compiler adds a this pointer parameter to the member function, pointing to the object itself, and identifies all member functions globally through mangling, combining the name of the class that defines the member function with the type of the parameter list of the function.

When a member function is called through the arrow operator ->, the this pointer that gets passed sometimes needs to be adjusted, which is the this adjustment. Its purpose is to locate, within the object’s memory layout, the address of the subobject of the class that defines the member function. Calling a virtual member function, however, requires the passed this pointer to point to the object itself (a virtual function defined in a base class can be overridden, and the overriding function may need to access member variables or functions defined by the derived class). The compiler therefore needs extra preprocessing (for example, undoing the this adjustment); in that case the preprocessing plus the call to the target virtual function is compiled into machine instructions that form the virtual thunk, which is likewise invoked via callq.

Finally, the C++ compiler calls the member function pointed to by the function pointer, as opposed to calling the member function directly. The root cause is the need to consider the polymorphic case. At compile time, the Clang compiler sets the function pointer to the Virtual function to “the offset of this virtual function in the Virtual table pointed to by this pointer after this Adjustment” plus 1. The value of 1 is added to mark it as an odd number to distinguish it from ordinary function Pointers. The offset will naturally locate the virtual table slot (virtual function address or Virtual thunk) to which the pointer actually points.

Chapter 5: Construct destructive copy semantics

The systematic knowledge about the C++ compiler has basically been covered in the first four chapters of the book. From chapter five onward it is mostly a collection of scattered points plus details supplementing the earlier content. The first few points of this chapter are fragmentary:

  • A pure virtual function declared in a base class does not need an implementation, e.g. virtual void vfunc() = 0;
  • As for whether a virtual function should be declared const: since there is no guarantee that a derived class's implementation will not need to modify the object's members, the simpler approach is not to use const;

The rest of the content is basically in this form, and chapters five, six and seven are not summarized.

5.1 Construct semantics

What exactly does defining an object with the simple statement shown below (T t;) involve:

  • Call base class constructors recursively;
  • Recursive calls to the virtual inherited base class constructor;
  • Initialize thevptr;
  • Perform initialization of initial values of member variables;
  • Performs the actions specified in the constructor’s initializer list;
  • If there are uninitialized members of class Object, the default constructor for that member is executed.
  • Execute the code in the default constructor body;
T t;
Copy the code

NOTE: by default, a class-type member of an object does not start out as 0x0; it is initialized by calling its default constructor. This is also why it is recommended to initialize class-type members in the constructor’s initializer list: initializing them in the constructor body instead involves redundant operations (default construction followed by assignment).
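A minimal sketch of that recommendation, with made-up class names, showing the redundant work the NOTE refers to:

#include <cstdio>

class Member {
public:
    Member()      { std::puts("Member()"); }
    Member(int v) : val(v) { std::puts("Member(int)"); }
    Member &operator=(const Member &rhs) { val = rhs.val; std::puts("Member::operator="); return *this; }
    int val = 0;
};

class GoodHolder {
public:
    GoodHolder(int v) : m(v) {}            // recommended: m is constructed once, directly from v
    Member m;
};

class LazyHolder {
public:
    LazyHolder(int v) { m = Member(v); }   // m is default-constructed first, then assigned: redundant work
    Member m;
};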

The vptr pointers at every level of an object are initialized before the constructor’s initializer list is executed, so it is legal to call a virtual function in the initializer list to initialize a member.

5.2 Replication semantics

When designing a class, the operation that specifies “assign an object of class to another object” is usually implemented in one of three ways:

  • Do nothing and use the default copy constructor policy.
  • The customoperator=Assignment operator;
  • Disallow assigning an object of class to another object;

If you want to forbid object assignment, you can declare operator= private and provide no definition for it. If forbidding assignment is not necessary, the default copy strategy should be preferred, i.e. simple member-by-member assignment. A custom operator= is only worth writing when the default strategy would be unsafe or incorrect. The book uses bitwise copy semantics to explain why, but Clang itself does not support bitwise copy semantics, so that part is skipped here.
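A small sketch of the three options above (class names are made up; modern C++ would typically write = delete instead of the private, undefined operator=):

class UseDefault {      // option 1: rely on the default member-by-member assignment
public:
    int val = 0;
};

class CustomAssign {    // option 2: custom operator=, for cases the default policy cannot handle safely
public:
    CustomAssign &operator=(const CustomAssign &rhs) {
        val = rhs.val;  // plus whatever extra logic is needed
        return *this;
    }
private:
    int val = 0;
};

class NoAssign {        // option 3: declare operator= private and never define it
private:
    NoAssign &operator=(const NoAssign &);
};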

5.3 Deconstructive semantics

If a class does not define a destructor, the compiler synthesizes the destructor for that class only if a member variable of the class or the base class has a destructor (note that no destructor is synthesized at all if no destructor is needed).

NOTE: for more information on why destructors need to be declared virtual, see this article: why destructors for base classes in C++ use virtual destructors.

Finally, the destructor will execute when the following code deletes t:

T *t = new T;
...
delete t;
Copy the code

Destructor execution triggers the following operations:

  • The destructor's own function body is executed first (since what is passed to delete is a pointer, polymorphism has to be taken into account);
  • If the class of several member variables in an object defines a destructor, it is called in the reverse order in which the member variables are declared.
  • If the object containsvptr, you need to reset to point to the virtual table of the appropriate base class.
  • If any direct non-virtual inherited base class defines a destructor, it will be called in the reverse order of its declaration.
  • If any direct virtual inheritance base class defines a destructor, it will be called in the reverse order of its declaration;

During object destruction, VPTR of an object degrades to the upper base class step by step. For example, if A is the root class and does not inherit from any type, B inherits from A, and C inherits from B, then the object of type C is degraded to B, where VPTR points to the virtual table of class B, and then to A, where VPTR points to the virtual table of class A. Finally A destructor is called and the entire object is destructed.

The degradation process can be verified briefly with the following code. When delete b runs in main, class B’s destructor is triggered first, and then the base class destructor ~A() is necessarily triggered. When ~A() calls vfunc1() (equivalent to this->vfunc1()), it is A’s implementation that runs: by that point the object has been degraded to type A, and of course the vptr has been adjusted to point to A’s virtual table.

class A {
public:
    int a = 0x1;
    virtual void vfunc1() {
        // breakpoint 1
    };
    virtual ~A() {
        vfunc1();
    };
};

class B : virtual public A {
public:
    int b = 0x2;
    void vfunc1() {
        // breakpoint 2
    };
};

void main() {
    B *b = new B;
    delete b;
}
Copy the code

Chapter 6: Runtime semantics

This chapter focuses on some of the extra operations the C++ compiler needs to add to ensure that programs run correctly.

6.1 Object construction and destruction

Here’s an example of how the compiler handles objects that are local variables:

class A {
public:
    int a = 0x1;
    virtual void vfunc1() {};
    virtual ~A() {};
};

class B : virtual public A {
public:
    int b = 0x2;
    void vfunc1() {};
};

void main() {
    B b;
}
Copy the code

Because the virtually inherited base class A declares and defines a destructor, the compiler automatically synthesizes the destructor ~B() for B. As can be seen from the assembly code of the main function below:

  • The compiler inserts code that calls the default constructor B() right after the declaration of the variable b;
  • And it inserts code that calls the destructor ~B() right before the variable b goes out of scope;
0x10df3bcb0 <+0>:  pushq  %rbp
0x10df3bcb1 <+1>:  movq   %rsp, %rbp
0x10df3bcb4 <+4>:  subq   $0x20, %rsp
0x10df3bcb8 <+8>:  leaq   -0x20(%rbp), %rdi
0x10df3bcbc <+12>: callq  0x10df3bcd0               ; B::B at ViewController.mm:26
0x10df3bcc1 <+17>: leaq   -0x20(%rbp), %rdi
0x10df3bcc5 <+21>: callq  0x10df3bd30               ; B::~B at ViewController.mm:26
0x10df3bcca <+26>: addq   $0x20, %rsp
0x10df3bcce <+30>: popq   %rbp
0x10df3bccf <+31>: retq   
Copy the code

6.1.1 Global variables

To see how C++ handles global objects, set a breakpoint at the commented line below, then compile and run:

class A {
public:
    int a = 0x1;
    virtual void vfunc1() {};
    virtual ~A() {};
};

class B : virtual public A {
public:
    int b = 0x2;
    void vfunc1() {};
};

A a;
B b;    // Set a breakpoint

void main() {
    B x = b;
}
Copy the code

The assembly code corresponding to the breakpoint is shown below, followed by the stack frames printed with the bt command. From the debug output you can observe that, on the current iOS simulator, global objects are initialized while the program’s binary is being loaded:

  • For every source/header file that defines global variable objects, a function like _GLOBAL__sub_I_ViewController.mm is generated, used to initialize all the global variables defined in the ViewController.mm file;
  • For each global variable object, a function like __cxx_global_var_init.1 is generated, used to initialize that single global object; this function is naturally called by the _GLOBAL__xxx function of the file the variable is defined in;

See what __cxx_global_var_init.1 does:

  • It calls the constructor B() to initialize the global variable b; you can observe that b's address comes from the data segment (data located via $rip is usually in the data segment);
  • It calls __cxa_atexit to register the global variable b's destructor ~B(), to be run when the program exits;

In short, the C++ compiler guarantees that all global objects are initialized before the program really starts running, that is, before the main function executes, and that all global objects are destructed before the program exits.

CPPModelDemo`__cxx_global_var_init.1:
    0x1024dde90 <+0>:  pushq  %rbp
    0x1024dde91 <+1>:  movq   %rsp, %rbp
->  0x1024dde94 <+4>:  leaq   0x77bd(%rip), %rdi        ; b
    0x1024dde9b <+11>: callq  0x1024ddb70               ; B::B at ViewController.mm:24
    0x1024ddea0 <+16>: leaq   -0x2d7(%rip), %rax        ; B::~B at ViewController.mm:24
    0x1024ddea7 <+23>: leaq   0x77aa(%rip), %rcx        ; b
    0x1024ddeae <+30>: movq   %rax, %rdi
    0x1024ddeb1 <+33>: movq   %rcx, %rsi
    0x1024ddeb4 <+36>: leaq   -0x1ebb(%rip), %rdx       ; _mh_execute_header
    0x1024ddebb <+43>: callq  0x1024de43c               ; symbol stub for: __cxa_atexit
    0x1024ddec0 <+48>: popq   %rbp
    0x1024ddec1 <+49>: retq  
Copy the code
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 2.1
  * frame #0: 0x00000001024dde94 CPPModelDemo`__cxx_global_var_init.1 at ViewController.mm:32:3
    frame #1: 0x00000001024ddeeb CPPModelDemo`_GLOBAL__sub_I_ViewController.mm at ViewController.mm:0
    frame #2: 0x0000000102509c95 dyld_sim`ImageLoaderMachO::doModInitFunctions(ImageLoader::LinkContext const&) + 513
    frame #3: 0x000000010250a08a dyld_sim`ImageLoaderMachO::doInitialization(ImageLoader::LinkContext const&) + 40
    frame #4: 0x0000000102504bb7 dyld_sim`ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, char const*, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 455
    frame #5: 0x0000000102502ec7 dyld_sim`ImageLoader::processInitializers(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 191
    frame #6: 0x0000000102502f68 dyld_sim`ImageLoader::runInitializers(ImageLoader::LinkContext const&, ImageLoader::InitializerTimingList&) + 82
    frame #7: 0x00000001024f626b dyld_sim`dyld::initializeMainExecutable() + 199
    frame #8: 0x00000001024faf56 dyld_sim`dyld::_main(macho_header const*, unsigned long, int, char const**, char const**, char const**, unsigned long*) + 4789
    frame #9: 0x00000001024f51c2 dyld_sim`start_sim + 122
    frame #10: 0x000000010481f57a dyld`dyld::useSimulatorDyld(int, macho_header const*, char const*, int, char const**, char const**, char const**, unsigned long*, unsigned long*) + 2093
    frame #11: 0x000000010481cdf3 dyld`dyld::_main(macho_header const*, unsigned long, int, char const**, char const**, char const**, unsigned long*) + 1199
    frame #12: 0x000000010481722b dyld`dyldbootstrap::start(dyld3::MachOLoaded const*, int, char const**, dyld3::MachOLoaded const*, unsigned long*) + 457
    frame #13: 0x0000000104817025 dyld`_dyld_start + 37
Copy the code

NOTE: there may be some discrepancies between what you observe and what the book describes (compiler differences are a constant theme of C++, after all), but fortunately the essence is the same.
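As a quick, observable check of the guarantee above, the following self-contained snippet (names are made up) prints its constructor line before main and its destructor line after main returns:

#include <cstdio>

struct Tracer {
    Tracer()  { std::puts("Tracer()  - before main"); }
    ~Tracer() { std::puts("~Tracer() - after main returns"); }
};

Tracer g_tracer;   // initialized while the binary is loaded, destructed at program exit

int main() {
    std::puts("main()");
    return 0;
}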

6.1.2 Local static variables

Replace the main function of the debugged code above with the following code to compile and run:

void main() {
    static B stb;
}
Copy the code

Notice the cmpb $0x0, 0x7a49(%rip) instruction and the jne at 0x10a27dc5f that jumps to 0x10a27dcb5: they obviously guard stb’s initialization, which only runs when the guard byte is still zero. Note also __cxa_guard_acquire and __cxa_guard_release, which presumably lock the stb initialization to prevent it from being entered twice. The rest of the code is straightforward and very much like the global variable case: the constructor B() is called to initialize stb, then __cxa_atexit registers stb’s destructor, which is triggered before the program exits.

CPPModelDemo`mainn:
    0x10a27dc50 <+0>:   pushq  %rbp
    0x10a27dc51 <+1>:   movq   %rsp, %rbp
    0x10a27dc54 <+4>:   subq   $0x10, %rsp
    0x10a27dc58 <+8>:   cmpb   $0x0.0x7a49(%rip)        ; mainn()::stb + 31
    0x10a27dc5f <+15>:  jne    0x10a27dcb5               ; <+101> at ViewController.mm:36:1
    0x10a27dc65 <+21>:  leaq   0x7a3c(%rip), %rdi        ; guard variable for mainn()::stb
    0x10a27dc6c <+28>:  callq  0x10a27e422               ; symbol stub for: __cxa_guard_acquire
    0x10a27dc71 <+33>:  cmpl   $0x0, %eax
    0x10a27dc74 <+36>:  je     0x10a27dcb5               ; <+101> at ViewController.mm:36:1
->  0x10a27dc7a <+42>:  leaq   0x7a07(%rip), %rdi        ; mainn()::stb
    0x10a27dc81 <+49>:  callq  0x10a27dbb0               ; B::B at ViewController.mm:24
    0x10a27dc86 <+54>:  leaq   -0x7d(%rip), %rax         ; B::~B at ViewController.mm:24
    0x10a27dc8d <+61>:  leaq   0x79f4(%rip), %rcx        ; mainn()::stb
    0x10a27dc94 <+68>:  movq   %rax, %rdi
    0x10a27dc97 <+71>:  movq   %rcx, %rsi
    0x10a27dc9a <+74>:  leaq   -0x1ca1(%rip), %rdx       ; _mh_execute_header
    0x10a27dca1 <+81>:  callq  0x10a27e41c               ; symbol stub for: __cxa_atexit
    0x10a27dca6 <+86>:  leaq   0x79fb(%rip), %rdi        ; guard variable for mainn()::stb
    0x10a27dcad <+93>:  movl   %eax, -0x4(%rbp)
    0x10a27dcb0 <+96>:  callq  0x10a27e428               ; symbol stub for: __cxa_guard_release
    0x10a27dcb5 <+101>: addq   $0x10, %rsp
    0x10a27dcb9 <+105>: popq   %rbp
    0x10a27dcba <+106>: retq 
Copy the code
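For reference, here is a self-contained C++ sketch of the pattern those instructions implement. It is only an approximation: the guard flag, storage buffer and std::atexit call stand in for the compiler-internal guard variable, the reserved storage and __cxa_atexit, and the real code additionally locks via __cxa_guard_acquire/__cxa_guard_release.

#include <cstdio>
#include <cstdlib>
#include <new>

struct B2 { B2() { std::puts("B2()"); } ~B2() { std::puts("~B2()"); } };

alignas(B2) static unsigned char stb_storage[sizeof(B2)];   // storage reserved up front
static bool stb_guard = false;                               // stand-in for "guard variable for mainn()::stb"

static void destroy_stb() { reinterpret_cast<B2 *>(stb_storage)->~B2(); }

void mainn_sketch() {
    if (!stb_guard) {                 // cmpb $0x0, guard / jne past the initialization
        new (stb_storage) B2();       // run the constructor exactly once
        std::atexit(destroy_stb);     // register the destructor to run at program exit
        stb_guard = true;
    }
}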

6.1.3 Object Arrays

Replace the main function of the debugged code above with the following code to compile and run:

void main() {
    B barray[10];
}
Copy the code

You might expect the assembly code for main to be short. In reality it explodes: some 50 lines of assembly instructions come out. That is a lot of code, but the core operations are fairly obvious:

  • Construction: allocate 10 B objects consecutively on the stack and call the constructor B() for each of them;
  • Destruction: for the 10 consecutively allocated B objects on the stack, execute the destructor ~B() for each in turn;
CPPModelDemo`mainn:
    0x1013e0bd0 <+0>:   pushq  %rbp
    0x1013e0bd1 <+1>:   movq   %rsp, %rbp
    0x1013e0bd4 <+4>:   subq   $0x180, %rsp              ; imm = 0x180 
    0x1013e0bdb <+11>:  leaq   -0x150(%rbp), %rax
    0x1013e0be2 <+18>:  movq   0x242f(%rip), %rcx        ; (void *)0x00007fff8000a0e0: __stack_chk_guard
    0x1013e0be9 <+25>:  movq   (%rcx), %rcx
    0x1013e0bec <+28>:  movq   %rcx, -0x8(%rbp)
->  0x1013e0bf0 <+32>:  movq   %rax, %rcx
    0x1013e0bf3 <+35>:  addq   $0x140, %rcx              ; imm = 0x140 
    0x1013e0bfa <+42>:  movq   %rcx, -0x158(%rbp)
    0x1013e0c01 <+49>:  movq   %rax, -0x160(%rbp)
    0x1013e0c08 <+56>:  movq   -0x160(%rbp), %rax
    0x1013e0c0f <+63>:  movq   %rax, %rdi
    0x1013e0c12 <+66>:  movq   %rax, -0x168(%rbp)
    0x1013e0c19 <+73>:  callq  0x1013e0b30               ; B::B at ViewController.mm:24
    0x1013e0c1e <+78>:  movq   -0x168(%rbp), %rax
    0x1013e0c25 <+85>:  addq   $0x20, %rax
    0x1013e0c2b <+91>:  movq   -0x158(%rbp), %rcx
    0x1013e0c32 <+98>:  cmpq   %rcx, %rax
    0x1013e0c35 <+101>: movq   %rax, -0x160(%rbp)
    0x1013e0c3c <+108>: jne    0x1013e0c08               ; <+56> at ViewController.mm
    0x1013e0c42 <+114>: leaq   -0x150(%rbp), %rax
    0x1013e0c49 <+121>: movq   %rax, %rcx
    0x1013e0c4c <+124>: addq   $0x140, %rcx              ; imm = 0x140 
    0x1013e0c53 <+131>: movq   %rax, -0x170(%rbp)
    0x1013e0c5a <+138>: movq   %rcx, -0x178(%rbp)
    0x1013e0c61 <+145>: movq   -0x178(%rbp), %rax
    0x1013e0c68 <+152>: addq   $-0x20, %rax
    0x1013e0c6e <+158>: movq   %rax, %rdi
    0x1013e0c71 <+161>: movq   %rax, -0x180(%rbp)
    0x1013e0c78 <+168>: callq  0x1013e0b90               ; B::~B at ViewController.mm:24
    0x1013e0c7d <+173>: movq   -0x180(%rbp), %rax
    0x1013e0c84 <+180>: movq   -0x170(%rbp), %rcx
    0x1013e0c8b <+187>: cmpq   %rcx, %rax
    0x1013e0c8e <+190>: movq   %rax, -0x178(%rbp)
    0x1013e0c95 <+197>: jne    0x1013e0c61               ; <+145> at ViewController.mm
    0x1013e0c9b <+203>: movq   0x2376(%rip), %rax        ; (void *)0x00007fff8000a0e0: __stack_chk_guard
    0x1013e0ca2 <+210>: movq   (%rax), %rax
    0x1013e0ca5 <+213>: movq   -0x8(%rbp), %rcx
    0x1013e0ca9 <+217>: cmpq   %rcx, %rax
    0x1013e0cac <+220>: jne    0x1013e0cbb               ; <+235> at ViewController.mm
    0x1013e0cb2 <+226>: addq   $0x180, %rsp              ; imm = 0x180 
    0x1013e0cb9 <+233>: popq   %rbp
    0x1013e0cba <+234>: retq   
    0x1013e0cbb <+235>: callq  0x1013e1432               ; symbol stub for: __stack_chk_fail
    0x1013e0cc0 <+240>: ud2    
Copy the code

NOTE: for the __stack_chk_guard in the assembly code above, see the article: compiler stack protection techniques in GCC. B barray[10]; is much more complex than you might think.

6.2 The new and delete operators

About the new and delete operators:

  • The new operator is used to allocate memory on the heap and returns a pointer to the starting address of the allocated memory;
  • The delete operator is used to free memory that was allocated on the heap;

Note that when allocating/freeing arrays on the heap, new [] and delete [] are usually used together. Such as:

void main() {
    B *barray = new B[10];
    delete [] barray;
}
Copy the code

NOTE: The Clang compiler does not seem to support Placement Operator new semantics as described in the book.

6.3 Temporary Objects

Temporary objects are objects introduced by the compiler, not declared in the code, in order to achieve the effect the code specifies. Take the simple debugging code below as an example: both debug code 1 and debug code 2 evaluate the expression x + y by introducing a temporary object to hold its result, but their execution efficiency differs.

class A {
public:
    int a = 0x1;
    virtual void vfunc1() {};
    A operator+ (const A &y) {
        A result;
        result.a = a + y.a;
        return result;
    }
};

// Debug code 1
void main() {
    A x, y;
    A sum;
    sum = x + y;
}

// Debug code 2
void main() {
    A x, y;
    A sum = x + y;
}
Copy the code

Because of the compiler’s NRV (named return value) optimization, A::operator+(const A &y) is actually handed the address of a temporary on the stack to hold the returned object; the rdi register carries the address of that returned object. Here -0x40(%rbp) is used to hold the temporary object corresponding to the expression x + y in debug code 1.

NOTE: why not simply pass the address of the sum variable, -0x30(%rbp), as the return address? Admittedly that would give the correct result for the debugging code above, but the compiler cannot ignore whatever custom logic A::operator= might contain for sum = x + y.

CPPModelDemo`mainn:
    0x101b35740 <+0>:  pushq  %rbp
    0x101b35741 <+1>:  movq   %rsp, %rbp
    0x101b35744 <+4>:  subq   $0x40, %rsp
    0x101b35748 <+8>:  leaq   -0x10(%rbp), %rdi
    0x101b3574c <+12>: callq  0x101b35790               ; A::A at ViewController.mm:18
    0x101b35751 <+17>: leaq   -0x20(%rbp), %rdi
    0x101b35755 <+21>: callq  0x101b35790               ; A::A at ViewController.mm:18
    0x101b3575a <+26>: leaq   -0x30(%rbp), %rdi
    0x101b3575e <+30>: callq  0x101b35790               ; A::A at ViewController.mm:18
->  0x101b35763 <+35>: leaq   -0x40(%rbp), %rdi
    0x101b35767 <+39>: leaq   -0x10(%rbp), %rsi
    0x101b3576b <+43>: leaq   -0x20(%rbp), %rdx
    0x101b3576f <+47>: callq  0x101b357b0               ; A::operator+ at ViewController.mm:22
    0x101b35774 <+52>: leaq   -0x30(%rbp), %rdi
    0x101b35778 <+56>: leaq   -0x40(%rbp), %rsi
    0x101b3577c <+60>: callq  0x101b35810               ; A::operator= at ViewController.mm:18
    0x101b35781 <+65>: addq   $0x40, %rsp
    0x101b35785 <+69>: popq   %rbp
    0x101b35786 <+70>: retq 
Copy the code

Now look at the assembly code for debug code 2. The statement A sum = x + y still returns the object through an address passed in the rdi register, but A::operator= is not called, because here the = carries object initialization semantics for the compiler, not A::operator= call semantics. Debug code 2 is therefore noticeably more efficient, and the root cause is this semantic difference.

NOTE: to illustrate the initialization semantics mentioned above: when the compiler processes a statement such as A result = A(), it does not need to first construct a temporary with A() and then invoke copy/assignment on result again; this is the most typical initialization semantics of =.

CPPModelDemo`main:
    0x103cab770 <+0>:  pushq  %rbp
    0x103cab771 <+1>:  movq   %rsp, %rbp
    0x103cab774 <+4>:  subq   $0x30, %rsp
    0x103cab778 <+8>:  leaq   -0x10(%rbp), %rdi
    0x103cab77c <+12>: callq  0x103cab7b0               ; A::A at ViewController.mm:18
    0x103cab781 <+17>: leaq   -0x20(%rbp), %rdi
    0x103cab785 <+21>: callq  0x103cab7b0               ; A::A at ViewController.mm:18
->  0x103cab78a <+26>: leaq   -0x30(%rbp), %rdi
    0x103cab78e <+30>: leaq   -0x10(%rbp), %rsi
    0x103cab792 <+34>: leaq   -0x20(%rbp), %rdx
    0x103cab796 <+38>: callq  0x103cab7d0               ; A::operator+ at ViewController.mm:22
    0x103cab79b <+43>: addq   $0x30, %rsp
    0x103cab79f <+47>: popq   %rbp
    0x103cab7a0 <+48>: retq   
Copy the code

So when does the compiler arrange for temporary objects to be released? Generally, a temporary introduced while evaluating an expression can be released as soon as the statement containing that expression finishes executing; for example, the temporary for x + y above can be released right after sum = x + y completes. There are two exceptions, though:

  • An object is declared from the temporary. For example, in A sum = x + y, the temporary generated by the expression x + y effectively becomes sum, so it cannot be released until sum leaves its scope;
  • The temporary is bound to a reference. For example, with const A &sumRef = x + y, releasing the temporary generated by x + y right away would render the reference sumRef meaningless;

C++’s habit of generating temporaries has caused it performance problems compared with Fortran, partly because reading and writing temporary objects on the stack is inefficient. The de-aggregation optimization mentioned in the book puts some of an object’s member variables directly into registers to improve efficiency, but compilers generally do not treat this as a key optimization direction.

Chapter 7: Stand on the tip of the object model

7.1 template

When the compiler sees a template definition, it does not immediately generate the template’s binary code. Instead, it does so when it sees code that uses the template. For example, if a template class defines 200 member functions and only three of them are used in the code, the compiler will only instantiate those three functions to generate its binary code. The reason for this approach is that:

  • Time and space efficiency considerations. The instantiation operation itself has a compilation time cost, and the compilation to binary takes up space in the binary, so it has a space cost;
  • Non-essential function considerations. A template must be bound to A specific type. The functions or types that need to be instantiated to bind type A and type B are not exactly the same.
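The “200 member functions, only 3 used” point can be seen in a small self-contained example (names are made up): Holder<NoCopy> would not even compile if set() or copy() were instantiated, yet the program below builds because they are never used.

template <typename T>
class Holder {
public:
    void set(const T &v) { value = v; }              // needs T's copy assignment if instantiated
    Holder copy() const { return Holder(*this); }    // needs T's copy constructor if instantiated
private:
    T value{};
};

struct NoCopy {
    NoCopy() = default;
    NoCopy(const NoCopy &) = delete;
    NoCopy &operator=(const NoCopy &) = delete;
};

int main() {
    Holder<NoCopy> h;   // only the members actually used (here: the default constructor) are instantiated
    (void)h;            // set() and copy() are never called, so they are never instantiated
    return 0;
}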

More specifically, the compiler can choose to process template instantiation in two phases:

  • Compilation phase: During compilation, when the compiler sees the code using the template, it immediately executes the instantiation operation.
  • Link phase: During link, the compiler goes back to handle all instantiations uniformly.

Template instantiation is lazy enough that, for example, the following line does not trigger instantiation: pt is just a pointer, and the compiler can set up the pt pointer without knowing the object layout and member functions of Point<float>:

Point<float> *pt = 0;
Copy the code

Note, however, that the following code does trigger instantiation, because a reference cannot refer to null; the compiler actually converts it into the two lines shown in the comment below, which includes constructing a Point<float> object:

Point<float> &pt = 0;

// The compiler converts the above code into the following two lines of code
// Point<float> temp = Point<float>(0);
// Point<float> &pt = temp;
Copy the code

Because the compiler does not have complete information about a template until it is bound to a concrete type, error checking on template definitions often lags behind that of non-template code; the most obvious examples are type-checking related operations.
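A tiny illustration of that delayed checking (names are made up): the bogus member call compiles fine while the template is merely parsed, and is only diagnosed once the function is instantiated for a concrete type.

template <typename T>
void misuse(T value) {
    value.no_such_member();   // dependent expression: only checked at instantiation time
}

int main() {
    // misuse(42);            // uncommenting this triggers the error, because int has no such member
    return 0;
}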

The book Name Resolution within a Template states that there are two scopes to consider when parsing names in Template code:

  • Scope of Template Declaration: Template Declaration Scope;
  • Scope of Template Instantiation: template instantiation scope;

The scope of the template declaration applies when a call to a non-template function foo is not associated with the template’s concrete bound type T, so debug call one should resolve to extern double foo(double). The scope of the template instantiation applies when the call to the non-template function foo is associated with T, so debug call two should resolve to extern int foo(int).

However, when debugging under Clang, both debug call one and debug call two resolve to extern double foo(double); in other words, both use the scope of the template declaration, and no matter how the code is distributed across source/header files the result is the same. So Clang probably does not follow these rules.

extern double foo(double);

template <class T>
class ScopeRule {
private:
    int _val;
    T _member;
public:
    void t_independent() {
        _member = foo(_val);
    }
    T t_dependent() {
        return foo(_member);
    }
};

void main() {
    extern int foo(int);
    
    ScopeRule<int> sr0;
    
    // t_independent is not associated with T, because the argument _val passed to foo is always of type int
    // and does not change with the specific type of T, so extern double foo(double) from the scope of the
    // template declaration is triggered
    sr0.t_independent();
    
    // t_dependent depends on T, because the type of the argument _member passed to foo depends on T,
    // so extern int foo(int) from the scope of the template instantiation should be triggered
    int x = sr0.t_dependent();
}

double foo(double) {
    return 0.0;
}

int foo(int) {
    return 100;
}
Copy the code

Member function instantiation is the hardest part of implementing templates. This part of the book jumps around a lot, and the details depend heavily on compiler differences, so it is not easy to follow; I will skip it for now. Here is a summary of how a compiler may handle templates:

  • The compilation step generates no template instantiations; the relevant information is only recorded in the object file corresponding to each source file;
  • When linking the object files, the prelinker examines them to collect the template instantiations that are needed;
  • For every object file that needs instantiations, the corresponding templates are instantiated, and the instantiation operations are registered in the .ii file associated with that object file;
  • The prelinker recompiles every source file whose corresponding .ii file has changed, until all instantiations are complete;
  • All object files are linked into the executable;

The book also mentions that this approach needs support from a specific build process, otherwise bugs appear; I do not fully understand the cause-and-effect logic there. At first glance, though, the process above does not look very efficient.

NOTE: from the Class Template Instantiation reference, it is also worth knowing that template instantiation can be requested explicitly in code; this roughly corresponds to the "template specialization" material in C++ Primer.
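As an aside, here is a minimal sketch (my own example, names are illustrative) of an explicit instantiation definition, which forces the compiler to emit every member of the template at that point instead of waiting for each one to be used:

#include <cstdio>

template <class T>
class Pair {
public:
    T first;
    T second;
    T sum() const { return first + second; }
};

// Explicit instantiation definition: all members of Pair<int>,
// including sum(), are instantiated right here.
template class Pair<int>;

int main() {
    Pair<int> p{1, 2};
    std::printf("%d\n", p.sum());
    return 0;
}
Copy the code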

7.2 Exception Handling

As we know, every function call pushes a frame onto the call stack, and when the function returns its frame is popped off. The high-level C++ exception handling process is as follows (a small sketch follows the list):

  • When a function throws an exception at runtime, the exception handling mechanism first creates the corresponding exception object instance;
  • The current function then relinquishes control, i.e. its frame is popped off the call stack (popping frames one layer at a time is called unwinding the stack). The destructors of the function's local class objects are called before the frame is popped;
  • While popping layer by layer, if the exception is caught by a try-catch at the current layer (note that the necessary destruction of that frame's local class objects still has to be handled), RTTI determines the concrete type of the exception so that control enters the correct catch branch;
  • While popping layer by layer, if the exception is not caught by any try-catch, the C++ exception handling mechanism calls terminate() to end the program;
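A minimal sketch (my own, not from the book) of the unwinding behavior described above: the local object's destructor runs while the stack unwinds out of inner(), and RTTI selects the matching catch branch:

#include <cstdio>
#include <stdexcept>

struct Local {
    ~Local() { std::printf("Local destroyed during unwinding\n"); }
};

void inner() {
    Local l;
    throw std::runtime_error("boom");   // the exception object instance is created here
}

int main() {
    try {
        inner();                                // inner()'s frame is popped while unwinding
    } catch (const std::runtime_error &e) {     // concrete type determined at run time via RTTI
        std::printf("caught: %s\n", e.what());
    }
    return 0;
}
Copy the code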

NOTE: a detailed description of Clang's exception handling mechanism can be found in the Itanium C++ ABI: Exception Handling documentation.

7.3 RTTI (Runtime Type Identification)

Runtime type identification exists partly to support exception handling: when catching an exception, the try-catch mechanism needs to determine the concrete type of the exception at run time in order to enter the corresponding handling branch, so RTTI support is required.

7.3.1 upcast

An upcast is the conversion from a derived class to a base class. There are two cases: upcasting an object and upcasting a pointer. Upcasting an object triggers object construction (a new base-class object is built, slicing off the derived part), while upcasting a pointer does not. Casting an object up suppresses the extra call overhead that comes with polymorphism, while casting a pointer up is how C++ polymorphism is expressed.

The following routine is used to debug the upcast of an object:

class A {
public:
    int a = 0x1;
    virtual void vfunc1() {};
};

class B : virtual public A {
public:
    int b = 0x2;
    void vfunc1() {};
};

void main() {
    B b;
    B *bptr = &b;
    
    // upcast the object
    A a = (A)b;
}
Copy the code

In this code, b is cast up to A, but a and b are not the same object; their memory does not even overlap. The debugging session is as follows:

(lldb) p/x &a
(A *) $3 = 0x00007ffeee45cfe0
(lldb) p/x &b
(B *) $4 = 0x00007ffeee45cff0
(lldb) p sizeof(a)
(unsigned long) $5 = 16
(lldb) p sizeof(b)
(unsigned long) $6 = 32
(lldb) x/2g 0x00007ffeee45cfe0
0x7ffeee45cfe0: 0x00000001017a70d8 0x0000000000000001
(lldb) x/4g 0x00007ffeee45cff0
0x7ffeee45cff0: 0x00000001017a7058 0x0000000000000002
0x7ffeee45d000: 0x00000001017a7078 0x0000000000000001
Copy the code

Modify the debug code slightly and debug the upcast of a pointer:

void main() {
    B b;
    B *bptr = &b;
    
    // upcast the pointer
    A *aptr = (A *)bptr;
}
Copy the code

The code is short but involves quite a few operations. The assembly for A *aptr = (A *)bptr begins at address 0x10526bd8d, i.e. movq -0x28(%rbp), %rdx (the full listing follows the two questions below). At the assembly level, the cast computes the starting address of the A-type memory layout inside object b, i.e. the address of b->avptr. Two questions arise:

  • Question one: after A *aptr = (A *)bptr executes, is the value of aptr in the source code simply the same as bptr (the address of object b)? No. As the assembly shows, the cast computes "the starting address of the A-type memory layout inside object b" (the address of b->avptr) at run time and stores that adjusted address in aptr; when the source later accesses a member or function through aptr->xxx, that adjusted address is used directly as the object address;
  • Question two: since aptr->xxx uses "the starting address of the A-type memory layout inside object b" directly as the object address, will aptr->vfunc1() then trigger the vfunc1 defined by type A? Obviously not; if it did, that would contradict C++'s virtual function mechanism. The point is that b->avptr and the vptr of a standalone A instance do not point to the same virtual table; the compiler's virtual function machinery arranges this at compile time (a small sketch after the assembly listing below illustrates it).
0x10526bd70 <+0>:  pushq  %rbp
0x10526bd71 <+1>:  movq   %rsp, %rbp
0x10526bd74 <+4>:  subq   $0x40, %rsp
0x10526bd78 <+8>:  leaq   -0x20(%rbp), %rdi
0x10526bd7c <+12>: callq  0x10526bdd0               ; B::B at ViewController.mm:16
0x10526bd81 <+17>: xorl   %eax, %eax
0x10526bd83 <+19>: movl   %eax, %ecx
0x10526bd85 <+21>: leaq   -0x20(%rbp), %rdx
0x10526bd89 <+25>: movq   %rdx, -0x28(%rbp)
0x10526bd8d <+29>: movq   -0x28(%rbp), %rdx         ; write the address of object b into %rdx
0x10526bd91 <+33>: cmpq   $0x0, %rdx
0x10526bd95 <+37>: movq   %rdx, -0x38(%rbp)         ; write the address of object b into -0x38(%rbp)
0x10526bd99 <+41>: movq   %rcx, -0x40(%rbp)         ; write 0 into -0x40(%rbp)
0x10526bd9d <+45>: je     0x10526bdb5               ; <+69> at ViewController.mm
0x10526bda3 <+51>: movq   -0x38(%rbp), %rax         ; write the address of object b into %rax
0x10526bda7 <+55>: movq   (%rax), %rcx              ; write b's vptr into %rcx
0x10526bdaa <+58>: movq   -0x18(%rcx), %rcx         ; write the offset of b->avptr within object b into %rcx
0x10526bdae <+62>: addq   %rcx, %rax                ; compute the address of b->avptr within object b and write it into %rax
0x10526bdb1 <+65>: movq   %rax, -0x40(%rbp)         ; write the address of b->avptr into -0x40(%rbp)
0x10526bdb5 <+69>: movq   -0x40(%rbp), %rax         ; write the address of b->avptr into %rax
0x10526bdb9 <+73>: movq   %rax, -0x30(%rbp)         ; write it into -0x30(%rbp), i.e. aptr
0x10526bdbd <+77>: addq   $0x40, %rsp
0x10526bdc1 <+81>: popq   %rbp
0x10526bdc2 <+82>: retq   
Copy the code
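Here is a minimal sketch (my own code, mirroring the A/B classes above but with printing bodies) of the point in question two: the A subobject inside the derived object carries a vptr that refers to the derived class's virtual table, so a virtual call through the upcast pointer still reaches the derived override:

#include <cstdio>

struct A2 {
    int a = 0x1;
    virtual void vfunc1() { std::printf("A2::vfunc1\n"); }
};

struct B2 : virtual A2 {
    int b = 0x2;
    void vfunc1() override { std::printf("B2::vfunc1\n"); }
};

int main() {
    B2 obj;
    A2 *aptr = (A2 *)&obj;   // upcast: aptr points at the A2 subobject inside obj
    aptr->vfunc1();          // prints "B2::vfunc1", not "A2::vfunc1"
    return 0;
}
Copy the code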

So what does the upcast buy us? Compare the two extra statements added to the debug code below: the first statement compiles to three assembly instructions, while the second compiles to five. The upcast thus reduces the indirection that polymorphism and virtual inheritance add to accessing members of the base class; more precisely, that indirection is paid once, up front, at the point of the upcast instead of on every access.

int xx = aptr->a;
int yy = bptr->a;
Copy the code
0x10370ad9d <+77>:  movq   -0x30(%rbp), %rax         ; int xx = aptr->a
0x10370ada1 <+81>:  movl   0x8(%rax), %ecx
0x10370ada4 <+84>:  movl   %ecx, -0x34(%rbp)
0x10370ada7 <+87>:  movq   -0x28(%rbp), %rax         ; int yy = bptr->a
0x10370adab <+91>:  movq   (%rax), %rdx
0x10370adae <+94>:  movq   -0x18(%rdx), %rdx
0x10370adb2 <+98>:  movl   0x8(%rax,%rdx), %ecx
0x10370adb6 <+102>: movl   %ecx, -0x38(%rbp)
Copy the code

For virtual function calls, however, the cost is the same before and after the upcast. Replace the two statements above with the following two and look at the corresponding assembly: the instructions are essentially identical, the only difference being the this pointer (object address) passed to the function.

aptr->vfunc1();
bptr->vfunc1();
Copy the code
0x10e58cd9d <+77>:  movq   -0x30(%rbp), %rax        ; aptr->vfunc1();
0x10e58cda1 <+81>:  movq   (%rax), %rcx
0x10e58cda4 <+84>:  movq   %rax, %rdi
0x10e58cda7 <+87>:  callq  *(%rcx)
0x10e58cda9 <+89>:  movq   -0x28(%rbp), %rax        ; bptr->vfunc1();
0x10e58cdad <+93>:  movq   (%rax), %rcx
0x10e58cdb0 <+96>:  movq   %rax, %rdi
0x10e58cdb3 <+99>:  callq  *(%rcx)
Copy the code

7.3.2 dynamic cast

dynamic_cast can be understood as a safe downcast. The upcast in the previous section goes from derived class to base class, while dynamic_cast can also be used to convert from base class to derived class. Keeping the inheritance structure from the routine in 7.3.1, replace the debugging code with the following. The real object that aptr points to is of type A, so the bptr obtained below is 0. How does the C++ runtime know that aptr cannot be converted to B *? It does so by maintaining typeinfo for every type.

NOTE: unlike the plain cast above, dynamic_cast only supports conversions of object pointers and references. In addition, pointer and reference conversions, whether a plain cast or a dynamic_cast, do not change any data in the object being pointed to.

void main() {
    A a;
    A *aptr = &a;
    B *bptr = dynamic_cast<B *>(aptr);
}
Copy the code

Take a quick look at the corresponding assembly, focusing on the nine lines from 0x10fdcddeb through 0x10fdcde12: the conversion is done by calling the C++ runtime library's __dynamic_cast function, passing in the typeinfo of type A and type B, and writing the result into the rax register. The typeinfo of type A and type B lives in the program's data segment, so the code that fetches it looks the same as the code that fetches global and static variables.

0x10fdcddc0 <+0>:   pushq  %rbp
0x10fdcddc1 <+1>:   movq   %rsp, %rbp
0x10fdcddc4 <+4>:   subq   $0x30, %rsp
0x10fdcddc8 <+8>:   leaq   -0x10(%rbp), %rdi
0x10fdcddcc <+12>:  callq  0x10fdcde40               ; A::A at ViewController.mm:10
0x10fdcddd1 <+17>:  leaq   -0x10(%rbp), %rax
0x10fdcddd5 <+21>:  movq   %rax, -0x18(%rbp)
0x10fdcddd9 <+25>:  movq   -0x18(%rbp), %rax
0x10fdcdddd <+29>:  cmpq   $0x0, %rax
0x10fdcdde1 <+33>:  movq   %rax, -0x28(%rbp)
0x10fdcdde5 <+37>:  je     0x10fdcde1b               ; <+91> at ViewController.mm
0x10fdcddeb <+43>:  movq   0x220e(%rip), %rax        ; (void *)0x000000010fdd0030: typeinfo for A
0x10fdcddf2 <+50>:  movq   0x220f(%rip), %rcx        ; (void *)0x000000010fdd0040: typeinfo for B
0x10fdcddf9 <+57>:  movq   -0x28(%rbp), %rdx
0x10fdcddfd <+61>:  movq   %rdx, %rdi
0x10fdcde00 <+64>:  movq   %rax, %rsi
0x10fdcde03 <+67>:  movq   %rcx, %rdx
0x10fdcde06 <+70>:  movq   $-0x1, %rcx
0x10fdcde0d <+77>:  callq  0x10fdce426               ; symbol stub for: __dynamic_cast
0x10fdcde12 <+82>:  movq   %rax, -0x30(%rbp)
0x10fdcde16 <+86>:  jmp    0x10fdcde28               ; <+104> at ViewController.mm
0x10fdcde1b <+91>:  xorl   %eax, %eax
0x10fdcde1d <+93>:  movl   %eax, %ecx
0x10fdcde1f <+95>:  movq   %rcx, -0x30(%rbp)
0x10fdcde23 <+99>:  jmp    0x10fdcde28               ; <+104> at ViewController.mm
0x10fdcde28 <+104>: movq   -0x30(%rbp), %rax
0x10fdcde2c <+108>: movq   %rax, -0x20(%rbp)
0x10fdcde30 <+112>: addq   $0x30, %rsp
0x10fdcde34 <+116>: popq   %rbp
0x10fdcde35 <+117>: retq 
Copy the code

Finally, a brief look at some of the data that was fished out during debugging:

(lldb) po (void(*)(void))$rax
(CppDemo`typeinfo for A)
(lldb) po (void(*)(void))$rcx
(CppDemo`typeinfo for B)
(lldb) p/x $rax
(unsigned long) $4 = 0x000000010fdd0030
(lldb) p/x $rcx
(unsigned long) $5 = 0x000000010fdd0040
(lldb) x/2g 0x000000010fdd0030
0x10fdd0030: 0x0000000110ec00e8 0x000000010fdce4e4
(lldb) po (void(*)(void))0x0000000110ec00e8
(libc++abi.dylib`vtable for __cxxabiv1::__class_type_info + 16)
(lldb) po (void(*)(void))0x000000010fdce4e4
(CppDemo`typeinfo name for A)
(lldb) po (char *)0x000000010fdce4e4
"1A"
(lldb) x/2g 0x000000010fdd0040
0x10fdd0040: 0x0000000110ec01b8 0x000000010fdce4e7
(lldb) po (void(*)(void))0x0000000110ec01b8
(libc++abi.dylib`vtable for __cxxabiv1::__vmi_class_type_info + 16)
(lldb) po (void(*)(void))0x000000010fdce4e7
(CppDemo`typeinfo name for B)
(lldb) po (char *)0x000000010fdce4e7
"1B"
Copy the code

Finally, what does the compiler do if you use dynamic_cast to perform an upcast? I cannot speak for other compilers, but Clang handles it as an ordinary upcast, so the assembly for the following code is exactly the same as the assembly for the original debugging code in 7.3.1. As a result, the C++ runtime does not need to go through the __dynamic_cast call at all, which is naturally more efficient.

B b;
B *bptr = &b;
A *aptr = dynamic_cast<A *>(bptr);
Copy the code

This section mainly analyzed assembly through debugging to explore C++'s type conversion machinery alongside the content of the book. Because the Clang compiler was used for debugging, the conclusions do not fully match those in the book. Finally, A *a = static_cast<A *>(bptr) and A *a = (A *)bptr are equivalent under Clang, so here the static_cast is just the plain cast.

Think: is it safe for the Clang compiler to lower a dynamic_cast used as an upcast into a plain static conversion? Presumably yes: the only channel for the (virtual-base) downcast is dynamic_cast, which yields 0 on failure, and 0 can be assigned to a pointer of any type, while the compiler blocks such downcasts written with cast syntax — for example, B *baptr = (B *)&a triggers a compilation error — so lowering the dynamic_cast upcast should be safe. However, a pointer whose value is 0 is not safe to dereference with the -> operator, so be especially careful when using dynamic_cast for downcasts.
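A minimal sketch (my own) of the safe pattern that follows from this: always test the result of a dynamic_cast downcast before dereferencing it:

#include <cstdio>

struct Base { virtual ~Base() {} };
struct Derived : Base {
    void hello() { std::printf("hello from Derived\n"); }
};

void use(Base *p) {
    if (Derived *d = dynamic_cast<Derived *>(p)) {   // downcast; yields 0 on failure
        d->hello();                                  // dereferenced only when non-null
    } else {
        std::printf("not a Derived\n");
    }
}

int main() {
    Derived d;
    Base b;
    use(&d);   // prints "hello from Derived"
    use(&b);   // prints "not a Derived"
    return 0;
}
Copy the code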

7.3.3 References and Pointers

Pointers and references are both, in essence, memory addresses of objects. Pointers are more flexible: they can be rebound to any object at any time and can even be set to 0, and a pointer may or may not be initialized when declared. A reference must be initialized when declared, cannot be initialized to 0, and once declared cannot be rebound to another object; more precisely, a reference stays bound to one fixed block of memory for its entire lifetime. Put less politely, a pointer is like a philanderer who can switch partners at any time.

References and pointers can both be converted with cast syntax, static_cast, and dynamic_cast. Analyze the following somewhat awkward-looking code. First, do aptr and aref still refer to a real A object after the code executes? Yes: the code operates directly on the memory of object a, so aptr and aref both denote the address of object a. Then what do *aptr = B() and aref = B() actually do? 1. create a temporary object of type B; 2. call A's copy assignment operator to copy the A-part data of the temporary into a. Therefore, subsequent calls to aptr->vfunc1() or aref.vfunc1() still invoke the vfunc1 defined by class A.

void main() {
    A a;
    A &aref = a;
    A *aptr = &a;

    *aptr = B();
    aref = B();
}
Copy the code
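A minimal sketch (my own, with printing bodies) of the same situation, confirming that only the A part of the temporary is copied and that virtual dispatch still resolves to the base class's vfunc1:

#include <cstdio>

struct A3 {
    virtual void vfunc1() { std::printf("A3::vfunc1\n"); }
};

struct B3 : virtual A3 {
    void vfunc1() override { std::printf("B3::vfunc1\n"); }
};

int main() {
    A3 a;
    A3 &aref = a;
    aref = B3();      // A3's copy assignment copies only the A3 part of the temporary
    aref.vfunc1();    // prints "A3::vfunc1": a's vptr was never touched
    return 0;
}
Copy the code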

Since references cannot be initialized to 0 or assigned 0, a dynamic_cast on a reference cannot signal failure by returning 0; instead, a failed reference cast throws at run time, and libc++abi.dylib reports: terminating with uncaught exception of type std::bad_cast.
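A minimal sketch (my own) of the reference case: a failed dynamic_cast on a reference has no null value to return, so it throws std::bad_cast instead:

#include <cstdio>
#include <typeinfo>   // std::bad_cast

struct Base { virtual ~Base() {} };
struct Derived : Base {};

int main() {
    Base b;
    try {
        Derived &d = dynamic_cast<Derived &>(b);   // fails: b is not a Derived
        (void)d;
    } catch (const std::bad_cast &e) {
        std::printf("caught: %s\n", e.what());
    }
    return 0;
}
Copy the code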

7.3.4 TypeID operator

typeid is the operator exposed through <typeinfo> for obtaining the concrete type of an object. It is used as follows:

#include <typeinfo>
...

void main() {
    A a;
    A *aptr = &a;
    const char *aptrName = typeid(aptr).name();
}
Copy the code

The corresponding assembly is shown below. The compiler handles the typeid operator roughly like this:

  • fetch the type's corresponding type_info object from the data segment;
  • call the appropriate type_info member function, per the semantics, to return the requested information;
0x10078cde0 <+0>:  pushq  %rbp
0x10078cde1 <+1>:  movq   %rsp, %rbp
0x10078cde4 <+4>:  subq   $0x20, %rsp
0x10078cde8 <+8>:  leaq   -0x10(%rbp), %rdi
0x10078cdec <+12>: callq  0x10078ce20               ; A::A at ViewController.mm:11
0x10078cdf1 <+17>: movq   0x2208(%rip), %rax        ; (void *)0x000000010078f038: typeinfo for A*
0x10078cdf8 <+24>: leaq   -0x10(%rbp), %rcx
0x10078cdfc <+28>: movq   %rcx, -0x18(%rbp)
0x10078ce00 <+32>: movq   %rax, %rdi
0x10078ce03 <+35>: callq  0x10078ce40               ; std::type_info::name at typeinfo:296
0x10078ce08 <+40>: movq   %rax, -0x20(%rbp)
0x10078ce0c <+44>: addq   $0x20, %rsp
0x10078ce10 <+48>: popq   %rbp
0x10078ce11 <+49>: retq   
Copy the code
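One more detail worth noting (a minimal sketch of my own; the printed strings are implementation-defined mangled names): typeid on a pointer reports the pointer's static type, while typeid on a dereferenced polymorphic object reports the object's dynamic type:

#include <cstdio>
#include <typeinfo>

struct Base { virtual ~Base() {} };
struct Derived : Base {};

int main() {
    Derived d;
    Base *p = &d;
    std::printf("%s\n", typeid(p).name());    // static type of the pointer, e.g. "P4Base"
    std::printf("%s\n", typeid(*p).name());   // dynamic type of the object, e.g. "7Derived"
    return 0;
}
Copy the code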

conclusion

After this first pass through the book, writing C++ feels quite different from before. Previously, writing C++ was like working in a fog, just getting the feature implemented. Now, when writing code, I naturally think about what the compiler and the runtime are doing behind each line and whether there is a better way to write it. So the fundamentals really do matter; as the saying goes, if the foundation is not solid, the ground will shake. Only when the underlying layer is understood can code truly stand up to scrutiny.