This article analyzes the memory layout and implementation principles behind polymorphism, virtual inheritance, and multiple inheritance in C++.

First, take a look at the mind map:

What follows is a step-by-step analysis of that outline.

I. Memory layout without virtual functions

1. Memory layout of a class without virtual functions

A class with no virtual functions is laid out like a plain struct: its member variables are stored in declaration order, with padding added where alignment requires it.

Look at the following code:

#include <iostream>
using namespace std;

class CPeople
{
	double height;
	int age;
	char sex;
public:
	CPeople() {}
	~CPeople() {}
};

int main()
{
	CPeople people;
	return 0;
}

Use GDB with pretty printing turned on to inspect the memory layout and size of the people object:

(gdb) set p pretty on
(gdb) p people
$6 = {height = 0, age = 0, sex = 0 '\000'}
(gdb) p sizeof(people)
$7 = 16
(gdb)

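With no virtual functions, class CPeople is laid out like a plain struct. Its members need 13 bytes (8 for height, 4 for age, 1 for sex), and the size is rounded up to 16 so that it is a multiple of 8, the alignment of double.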
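The same numbers can be checked without a debugger. The following is a minimal sketch, not code from the article: PlainPeople and check_plain_layout are hypothetical names, and the members are made public so that offsetof can be applied to them. The assertions cover only what holds for any standard-layout struct; the printed sizes and offsets are implementation-defined.

```cpp
#include <cassert>
#include <cstddef>   // offsetof
#include <iostream>

// Hypothetical stand-in for CPeople: same members, but public so that
// offsetof can be applied to them.
struct PlainPeople
{
	double height;
	int age;
	char sex;
};

// Asserts properties that hold for any standard-layout struct and
// prints the implementation-defined numbers.
void check_plain_layout()
{
	// Members are stored in declaration order, starting at offset 0.
	assert(offsetof(PlainPeople, height) == 0);
	assert(offsetof(PlainPeople, height) < offsetof(PlainPeople, age));
	assert(offsetof(PlainPeople, age) < offsetof(PlainPeople, sex));
	// Padding rounds the size up to a multiple of the alignment.
	assert(sizeof(PlainPeople) % alignof(PlainPeople) == 0);

	std::cout << "sizeof = " << sizeof(PlainPeople)
	          << ", alignof = " << alignof(PlainPeople)
	          << ", offsets = " << offsetof(PlainPeople, height) << ' '
	          << offsetof(PlainPeople, age) << ' '
	          << offsetof(PlainPeople, sex) << '\n';
}
```

Calling check_plain_layout() from main on a 64-bit Linux system with GCC should report a size of 16, in line with the GDB session above, but other platforms may differ.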

2. Memory layout of derived classes without virtual functions

Add a derived CSon class as follows:

#include <iostream>
using namespace std;

class CPeople
{
	double height;
	int age;
	char sex;

public:
	CPeople() {}
	~CPeople() {}
};

class CSon: CPeople
{
	int sisters;
public:
	CSon() {}
	~CSon() {}
};

int main()
{
	CSon son;
	return 0;
}

The memory layout and size of son are as follows:

(gdb) p son
$1 = {<CPeople> = {height = 2, age = 2, age = 3, sex = 5 '6'}, members of CSon: sisters = 4196224}
(gdb) p sizeof(son)
$2 = 24

Put simply, the layout is equivalent to a struct like this:

struct a
{
	struct b
	{
		double h;
		int a;
		char s;
	}bbb;
	int s;
};

Without virtual functions, there are no virtual function tables or virtual table pointers to worry about, so the layout is relatively simple.
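The comparison between the derived class and the nested struct can also be made in code. This is a rough sketch under stated assumptions: People, Son, Flat and check_derived_layout are hypothetical names, the members are public for brevity, and the address checks rely on how common compilers place a base class (the standard does not promise it for this kind of class).

```cpp
#include <cassert>
#include <iostream>

// Hypothetical stand-ins for CPeople and CSon, with public members to
// keep the sketch short.
struct People
{
	double height;
	int age;
	char sex;
};

struct Son : People
{
	int sisters;
};

// The hand-written equivalent: the base part, then the new member.
struct Flat
{
	People base;
	int sisters;
};

void check_derived_layout()
{
	Son son;
	const char* object = reinterpret_cast<const char*>(&son);
	const char* base = reinterpret_cast<const char*>(static_cast<People*>(&son));
	const char* last_base_member = &son.sex;
	const char* new_member = reinterpret_cast<const char*>(&son.sisters);

	assert(base == object);                 // base part starts the object
	assert(new_member > last_base_member);  // derived member follows it

	std::cout << "sizeof(Son) = " << sizeof(Son)
	          << ", sizeof(Flat) = " << sizeof(Flat) << '\n';
}
```

On a typical 64-bit system both sizes are expected to print as 24, the value GDB reported for son.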

II. Memory layout with virtual functions

1. Memory layout for classes with virtual functions

Let’s start with a single class that contains virtual functions as follows:

#include <iostream>
using namespace std;

class CPeople
{
public:
	double height;// Set this to a public member variable for viewing the address
	int age;
	char sex;

public:
	CPeople() {}
	~CPeople() {}
	virtual void set() {}
};

int main()
{
	CPeople people;
	return 0;
}

Again, use GDB to view the memory layout as follows:

(gdb) p people
$1 = {
  _vptr.CPeople = 0x4008e0 <vtable for CPeople+16>, 
  height = 1.1659688840009374e-312, 
  age = 4196320, 
  sex = 0 '\000'
}
(gdb) p sizeof(people)
$2 = 24
(gdb) p &people
$3 = (CPeople *) 0x7fffffffe810
(gdb) p &people.height
$4 = (double *) 0x7fffffffe818
(gdb) p &people.age
$5 = (int *) 0x7fffffffe820
(gdb) p &people.sex
$6 = 0x7fffffffe824 ""
(gdb)

As shown above, the object now contains an extra member, _vptr.CPeople = 0x4008e0 <vtable for CPeople+16>: a virtual table pointer.

Every object of a class that has virtual functions carries a virtual table pointer. Here it is stored at the very start of the object’s memory: the virtual table pointer comes first, followed by the member variables. Its size is that of an ordinary pointer on the platform. This is a 64-bit system, so the virtual table pointer occupies 8 bytes, which is why &people.height is 8 bytes past &people and sizeof(people) has grown from 16 to 24.
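To see the cost of the hidden pointer outside GDB, one can compare two otherwise identical structs. The sketch below is illustrative only: NoVirtual, WithVirtual, first_member_offset and check_vptr_cost are hypothetical names, and the assumption that the pointer sits at offset 0 reflects common ABIs rather than a guarantee of the language.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>

struct NoVirtual
{
	double height;
	int age;
	char sex;
};

struct WithVirtual
{
	double height;
	int age;
	char sex;
	virtual void set() {}
};

// Byte offset of the first data member from the start of the object.
// On common ABIs the virtual table pointer occupies offset 0, so this
// is expected to be sizeof(void*); the standard does not guarantee it.
std::size_t first_member_offset()
{
	WithVirtual w;
	return static_cast<std::size_t>(
		reinterpret_cast<const char*>(&w.height) -
		reinterpret_cast<const char*>(&w));
}

void check_vptr_cost()
{
	// The hidden pointer makes the object at least one pointer larger.
	assert(sizeof(WithVirtual) >= sizeof(NoVirtual) + sizeof(void*));

	std::cout << "sizeof(NoVirtual) = " << sizeof(NoVirtual)
	          << ", sizeof(WithVirtual) = " << sizeof(WithVirtual)
	          << ", first member at offset " << first_member_offset() << '\n';
}
```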

Next, derive a class CSon from CPeople without overriding the virtual function, and see what it looks like:

#include <iostream>
using namespace std;

class CPeople
{
public:
	double height;
	int age;
	char sex;

public:
	CPeople() {}
	~CPeople() {}
	virtual void set() {}
};

class CSon:public CPeople
{
public:
	int sisters;
//	void set(){}
};

int main()
{
	CSon son;
	return 0;
}

GDB displays the memory layout as follows:

(gdb) p son
$1 = {
  <CPeople> = {
    _vptr.CPeople = 0x400990 <vtable for CSon+16>, 
    height = 1.1659688840009374e-312, 
    age = 4196496, 
    sex = 0 '\000'
  }, 
  members of CSon: 
  sisters = 0
}
(gdb)

As in the case without virtual functions, the derived object is simply the base class part followed by the derived class’s member variables. Next, override the base class’s virtual function in the derived class and see what happens.

2. Principle of polymorphism

When a derived class overrides a virtual function of its base class, that is the basis of polymorphism.

#include <iostream>
using namespace std;

class CPeople
{
public:
	double height;
	int age;
	char sex;

public:
	CPeople() {}
	~CPeople() {}
	virtual void set() {}
};

class CSon:public CPeople
{
public:
	int sisters;
	virtual void set() {}
};

int main()
{
	CSon son;
	return 0;
}

Use GDB as follows:

(gdb) p son
$2 = (CSon) {
  <CPeople> = {
    _vptr.CPeople = 0x4009a0 <vtable for CSon+16>, 
    height = 1.1659688840009374e-312, 
    age = 4196512, 
    sex = 0 '\000'
  }, 
  members of CSon: 
  sisters = 0
}
(gdb) p /a *(void**)0x4009a0
$5 = 0x40082a <CSon::set()>

The memory layout looks no different from before: the derived class does not get a new virtual table pointer but inherits the one from the base class. The second command, however, follows the virtual table pointer to the virtual function table, and the entry stored there is the derived class’s virtual function.

In ordinary (non-virtual) inheritance, the derived class does not add a virtual table pointer of its own. It reuses the slot inherited from the base class, which points into the derived class’s virtual table (vtable for CSon above); there, the entry for each overridden function holds the address of the derived class’s version. A virtual function that exists only in the derived class is appended to the end of the table.

Now for polymorphism proper: create a derived object through a base class pointer and see how it looks. The code is as follows:

#include <iostream>
using namespace std;

class CPeople
{
public:
	double height;
	int age;
	char sex;

public:
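	CPeople() {}
	// Left non-virtual so that the virtual table matches the GDB output
	// below. Deleting a CSon through a CPeople* is then undefined
	// behavior; production code should make this destructor virtual.
	~CPeople() {}
	virtual void set() {}
};

class CSon:public CPeople
{
public:
	int sisters;
	virtual void set() {}
	virtual void get() {}
};

int main()
{
	CPeople *son = new CSon();
	if (son != nullptr)
	{
		delete son;
		son = nullptr;
	}
	return 0;
}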

Use GDB to view *son’s memory layout as follows:

(gdb) p *son
$2 = (CSon) {
  <CPeople> = {
    _vptr.CPeople = 0x400a90 <vtable for CSon+16>, 
    height = 0, 
    age = 0, 
    sex = 0 '\000'
  }, 
  members of CSon: 
  sisters = 0
}
(gdb) p /a *(void**)0x400a90
$3 = 0x400938 <CSon::set()>
(gdb) p /a *(void**)0x400a90@2
$4 = {0x400938 <CSon::set()>, 0x400944 <CSon::get()>}

As you can see, the memory layout is the same as when a derived object is used directly. The difference lies in when the call is resolved: with a derived object used directly, the compiler decides at compile time whether to call the base class’s or the derived class’s virtual function; with a base class pointer, the decision is made at run time.

To sum up: polymorphism in C++ inheritance generally means runtime polymorphism, that is, using a base class pointer or reference to refer to a derived class object. With non-virtual inheritance, the derived class inherits the base class’s virtual table pointer, and its virtual table holds the derived class’s overriding functions in place of the base class’s. A virtual call made through the virtual table pointer on a derived object therefore reaches the derived class’s function.
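The difference between run-time and compile-time resolution can be shown in a few lines. This is a minimal sketch with hypothetical names (Base, Derived, call_virtual, call_plain, check_dispatch); it is not the article’s code, and unlike the listings above it gives the base class a virtual destructor.

```cpp
#include <cassert>
#include <string>

class Base
{
public:
	virtual ~Base() {}
	virtual std::string who() const { return "Base"; }
	std::string plain() const { return "Base"; }     // non-virtual
};

class Derived : public Base
{
public:
	std::string who() const override { return "Derived"; }
	std::string plain() const { return "Derived"; }  // hides, does not override
};

// Both helpers only see a reference to the base class.
std::string call_virtual(const Base& b) { return b.who(); }
std::string call_plain(const Base& b) { return b.plain(); }

void check_dispatch()
{
	Derived d;
	// Virtual call: resolved at run time through the virtual table.
	assert(call_virtual(d) == "Derived");
	// Non-virtual call: resolved at compile time from the static type.
	assert(call_plain(d) == "Base");
}
```

Only the virtual function follows the object’s real type; the non-virtual one is bound to Base because that is all the compiler can see at the call site.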

Now use GDB to look at the addresses:

(gdb) p son
$2 = (CSon *) 0x613c20
(gdb) p &son->height
$3 = (double *) 0x613c28
(gdb) p &son->sisters
$4 = (int *) 0x613c38

The object pointer refers to a block of memory whose first address is 0x613c20. The virtual table pointer occupies the first 8 bytes, followed by the member variables of the base class and then those of the derived class, each in declaration order. In other words, with non-virtual inheritance, memory follows the order of the class hierarchy and of the member declarations: base class first, derived class after.

III. Virtual inheritance

If you look closely, you will notice that I have stressed non-virtual inheritance several times. Without virtual functions, virtual inheritance makes little difference, but once virtual functions are involved, virtual and non-virtual inheritance behave differently, as follows:

#include <iostream>
using namespace std;

class CPeople
{
public:
	double height;
	int age;
	char sex;

public:
	CPeople() {}
	// Left non-virtual to match the GDB output; deleting a CSon through
	// a CPeople* is then undefined behavior.
	~CPeople() {}
	virtual void set() {}
};

class CSon:virtual public CPeople
{
public:
	int sisters;
	virtual void set() {}
	virtual void get() {}
};

int main()
{
	CPeople *son = new CSon();
	if (son != nullptr)
	{
		delete son;
		son = nullptr;
	}
	return 0;
}

Also using GDB debugging, print out the memory layout as follows:

(gdb) p *son
$1 = (CSon) {
  <CPeople> = {
    _vptr.CPeople = 0x400b00 <vtable for CSon+64>, 
    height = 0, 
    age = 0, 
    sex = 0 '\000'
  }, 
  members of CSon: 
  _vptr.CSon = 0x400ad8 <vtable for CSon+24>, 
  sisters = 0
}
(gdb) p /a *(void**)0x400ad8@2
$4 = {0x40095a <CSon::set()>, 0x40096e <CSon::get()>}

The derived class now has a virtual table pointer of its own (_vptr.CSon), and its value differs from that of the base class’s virtual table pointer.

This shows that with virtual inheritance the derived class gets not only its own virtual table pointer but also its own virtual table. Virtual inheritance therefore costs more than non-virtual inheritance, so in most cases it is best avoided.

Back to the memory layout: with non-virtual inheritance, storage is sequential. Is the same true of virtual inheritance? Look at the data printed below:

(gdb) p son
$2 = (CSon *) 0x613c20
(gdb) p &son->height
$3 = (double *) 0x613c38
(gdb) p &son->sisters
$4 = (int *) 0x613c28

The virtual table pointer and member variables of the derived class come first, and the virtual table pointer and member variables of the base class come after. Why is the base class placed last?

So not only does virtual inheritance cost more, but it also changes the layout of memory, so why do we even have virtual inheritance? Let’s move on.
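The reordering seen in GDB can be reproduced with plain pointer arithmetic. The following is a sketch under stated assumptions: People, Son, offset_in and check_virtual_base_position are hypothetical names, and placing the virtual base after the derived class’s own members is what GCC and other common compilers do, not something the C++ standard requires.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>

struct People
{
	double height;
	int age;
	char sex;
	virtual ~People() {}
	virtual void set() {}
};

struct Son : virtual People
{
	int sisters;
	void set() override {}
};

// Byte distance of p from the start of obj.
std::ptrdiff_t offset_in(const Son& obj, const void* p)
{
	return static_cast<const char*>(p) - reinterpret_cast<const char*>(&obj);
}

void check_virtual_base_position()
{
	Son son;
	const People* base = &son;   // implicit conversion to the virtual base
	const std::ptrdiff_t base_offset = offset_in(son, base);
	const std::ptrdiff_t member_offset = offset_in(son, &son.sisters);

	// The virtual base subobject lies behind the derived class's member.
	assert(base_offset > member_offset);

	std::cout << "sisters at offset " << member_offset
	          << ", People subobject at offset " << base_offset << '\n';
}
```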

IV. Multiple inheritance and ambiguity

Look at the following code that uses multiple inheritance:

#include <iostream>
using namespace std;

class A
{
public:
	A()
	{
		cout << "A()" << endl;
	}
	virtual ~A()
	{
		cout << "~A()" << endl;
	}
};

class B: public A
{
public:
	B()
	{
		cout << "B()" << endl;
	}
	~B()
	{
		cout << "~B()" << endl;
	}
};

class C: public A
{
public:
	C()
	{
		cout << "C()" << endl;
	}
	~C()
	{
		cout << "~C()" << endl;
	}
};

class D:public B, public C
{
};

int main()
{
	D d;
	return 0;
}

The following output is displayed:

A()
B()
A()
C()
~C()
~A()
~B()
~A()

Notice that class A’s constructor and destructor each run twice. That is clearly not what we want: constructing the B part runs A’s constructor, and constructing the C part runs it again, and the destructors behave the same way. This is not a big problem yet, though, since the code still compiles and runs.

Change the code as follows:

#include <iostream>
using namespace std;

class A
{
public:
	A()
	{
		cout << "A()" << endl;
	}
	void print()
	{
		cout << "print()" << endl;
	}
	virtual ~A()
	{
		cout << "~A()" << endl;
	}
};

class B: public A
{
public:
	B()
	{
		cout << "B()" << endl;
	}
	~B()
	{
		cout << "~B()" << endl;
	}
};

class C: public A
{
public:
	C()
	{
		cout << "C()" << endl;
	}
	~C()
	{
		cout << "~C()" << endl;
	}
};

class D:public B, public C
{
};

int main()
{
	D d;
	d.print();
	return 0;
}

Compilation fails with the following error:

test.cpp:54:4: error: request for member 'print' is ambiguous

Let’s comment out the line d.print() and look at the memory layout of object d as follows:

(gdb) p d
$1 = (D) {
  <B> = {
    <A> = {
      _vptr.A = 0x400fb8 <vtable for D+16>
    }, <No data fields>}, 
  <C> = {
    <A> = {
      _vptr.A = 0x400fd8 <vtable for D+48>
    }, <No data fields>}, <No data fields>}
(gdb)

Object d contains two A subobjects, one inherited through class B and one through class C. There are therefore two paths to A’s members, and the compiler cannot tell which one is meant; that is the ambiguity.
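The two copies of A can be counted directly. The sketch below uses hypothetical names throughout (the counter A::constructed and the function check_two_copies are additions for illustration). It also shows one common workaround for the ambiguous call, qualifying the call with the path to take, which is not something the article’s code does.

```cpp
#include <cassert>
#include <iostream>

struct A
{
	static int constructed;   // counts how many A subobjects were built
	A() { ++constructed; }
	virtual ~A() {}
	void print() { std::cout << "print()\n"; }
};
int A::constructed = 0;

struct B : A {};
struct C : A {};
struct D : B, C {};

void check_two_copies()
{
	A::constructed = 0;
	D d;
	assert(A::constructed == 2);    // one A inside B, another inside C

	A* via_b = static_cast<B*>(&d); // the A reached through B
	A* via_c = static_cast<C*>(&d); // the A reached through C
	assert(via_b != via_c);         // two distinct subobjects

	// d.print();   // does not compile: the member access is ambiguous
	d.B::print();   // naming the path through B removes the ambiguity
}
```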

How can this ambiguity be resolved? The answer also settles the question left open in the previous section: why virtual inheritance is needed.

Modify the code:

#include <iostream>
using namespace std;

class A
{
public:
	A()
	{
		cout << "A()" << endl;
	}
	void print()
	{
		cout << "print()" << endl;
	}
	virtual ~A()
	{
		cout << "~A()" << endl;
	}
};

class B: virtual public A
{
public:
	B()
	{
		cout << "B()" << endl;
	}
	~B()
	{
		cout << "~B()" << endl;
	}
};

class C: virtual public A
{
public:
	C()
	{
		cout << "C()" << endl;
	}
	~C()
	{
		cout << "~C()" << endl;
	}
};

class D:public B, public C
{
};

int main()
{
	D d;
	d.print();
	return 0;
}

Class A is now a virtual base class. The memory layout of object d is as follows:

(gdb) p d
$1 = (D) {
  <B> = {
    <A> = {
      _vptr.A = 0x400fe0 <vtable for D+32>
    }, <No data fields>}, 
  <C> = {<No data fields>}, <No data fields>}
(gdb)

You can see that the A subobject now appears only once, under B, and that C no longer carries its own copy of A: the two paths have become one.
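The same counting trick confirms that only one A remains. Again this is a sketch with hypothetical names (A::constructed, check_single_copy), assuming nothing beyond standard C++ behavior for virtual base classes.

```cpp
#include <cassert>
#include <iostream>

struct A
{
	static int constructed;   // counts how many A subobjects were built
	A() { ++constructed; }
	virtual ~A() {}
	void print() { std::cout << "print()\n"; }
};
int A::constructed = 0;

struct B : virtual A {};
struct C : virtual A {};
struct D : B, C {};

void check_single_copy()
{
	A::constructed = 0;
	D d;
	assert(A::constructed == 1);    // the virtual base is built exactly once

	A* via_b = static_cast<B*>(&d);
	A* via_c = static_cast<C*>(&d);
	assert(via_b == via_c);         // both paths reach the same subobject

	d.print();                      // no longer ambiguous
}
```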

You may object: virtual inheritance was said above to give the derived class a new virtual table pointer. Class D inherits from B and C non-virtually, so D does not need a new one, but classes B and C inherit from A virtually and should each have their own virtual table pointer, yet GDB shows none. This puzzled me at first too, but it became clear later on.

Let’s add member variables to class A and see what they look like:

#include <iostream>
using namespace std;

class A
{
public:
	int a;
	A() {}
	virtual ~A() {}
};

class B: virtual public A
{
public:
	B() {}
	~B() {}
};

class C: virtual public A
{
public:
	C() {}
	~C() {}
};

class D:public B, public C
{
};

int main()
{
	D d;
	return 0;
}

This time, look at the memory layout as follows:

(gdb) p d
$1 = (D) {
  <B> = {
    <A> = {
      _vptr.A = 0x400c58 <vtable for D+104>, 
      a = 0
    }, 
    members of B: 
    _vptr.B = 0x400c08 <vtable for D+24>
  }, 
  <C> = {
    members of C: 
    _vptr.C = 0x400c30 <vtable for D+64>
  }, <No data fields>}

When class A has member variables, classes B and C each get their own virtual table pointer and virtual table. If class A has no member variables, however, B and C do not get virtual table pointers of their own, whether or not B and C themselves have member variables or virtual functions.

So the earlier statement, that with virtual inheritance a derived class gets its own virtual table and its own virtual table pointer, is not entirely accurate. To be exact: when the virtual base class has member variables, the derived class gets its own virtual table pointer and virtual table; when the virtual base class has no member variables, it does not. With GCC this follows from the Itanium C++ ABI, under which a virtual base that holds nothing but a virtual table pointer can share that pointer with the derived class.
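The effect shows up in sizeof. The sketch below is an illustration with hypothetical class names (EmptyBase, DataBase, FromEmpty, FromData) and a hypothetical check_virtual_base_cost function. The stricter assertions are guarded by a preprocessor test because they assume the Itanium C++ ABI used by GCC and by Clang on Linux and macOS; other ABIs lay virtual bases out differently.

```cpp
#include <cassert>
#include <iostream>

struct EmptyBase { virtual ~EmptyBase() {} };        // virtual table pointer only
struct DataBase  { int a; virtual ~DataBase() {} };  // pointer plus a data member

struct FromEmpty : virtual EmptyBase {};
struct FromData  : virtual DataBase {};

void check_virtual_base_cost()
{
	// Portable expectation: data in the virtual base makes the derived
	// class larger by at least that data.
	assert(sizeof(FromData) >= sizeof(FromEmpty) + sizeof(int));

#if defined(__GNUC__) && !defined(_MSC_VER)
	// Assumption: Itanium C++ ABI.
	// A data-free virtual base shares the derived class's pointer.
	assert(sizeof(FromEmpty) == sizeof(EmptyBase));
	// A virtual base with data forces an additional pointer.
	assert(sizeof(FromData) >= sizeof(DataBase) + sizeof(void*));
#endif

	std::cout << "sizeof(EmptyBase) = " << sizeof(EmptyBase)
	          << ", sizeof(FromEmpty) = " << sizeof(FromEmpty) << '\n'
	          << "sizeof(DataBase) = " << sizeof(DataBase)
	          << ", sizeof(FromData) = " << sizeof(FromData) << '\n';
}
```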

Add member variables to each of the four classes, and look at the layout of multiple inheritance. The code is as follows:

#include <iostream>
using namespace std;

class A
{
public:
	int a;
	A() {}
	virtual ~A() {}
};

class B: virtual public A
{
public:
	int b;
	B() {}
	~B() {}
};

class C: virtual public A
{
public:
	int c;
	C() {}
	~C() {}
};

class D:public B, public C
{
	public:
	int d;
};

int main()
{
	D d;
	return 0;
}

Note that with a virtual base class, multiple inheritance is not laid out purely in the order of declaration and inheritance. The addresses are as follows:

(gdb) p d
$1 = (D) {
  <B> = {
    <A> = {
      _vptr.A = 0x400c58 <vtable for D+104>, 
      a = 0
    }, 
    members of B: 
    _vptr.B = 0x400c08 <vtable for D+24>, 
    b = 4197175
  }, 
  <C> = {
    members of C: 
    _vptr.C = 0x400c30 <vtable for D+64>, 
    c = -228471872
  }, 
  members of D: 
  d = 54
}
(gdb) p &d
$2 = (D *) 0x7fffffffe800
(gdb) p &d.a
$3 = (int *) 0x7fffffffe828
(gdb) p &d.b
$4 = (int *) 0x7fffffffe808
(gdb) p &d.c
$5 = (int *) 0x7fffffffe818
(gdb) p &d.d
$6 = (int *) 0x7fffffffe81c

The addresses show that the parts belonging to classes B, C and D are stored in order. The virtual table pointer and member variables of the virtual base class A, however, are placed at the end of the block of memory, the same result as with virtual inheritance in the previous section.

My personal understanding: placing the virtual base class at the end of the object follows from how virtual inheritance works. Virtual inheritance guarantees that the object contains exactly one copy of the virtual base class. If the layout were still strictly sequential, the compiler would have to decide which of the several derived classes that single copy should sit in front of, with no reason to favor any one of them. Placing it at the end, where all of them share it, avoids the conflict, and this is also why virtual inheritance resolves the ambiguity problem.
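The order b, c, d, a seen in the GDB addresses can be checked in code as well. This is a sketch with hypothetical helper names (offset_in, check_member_order); the ordering it asserts is what common compilers produce for this hierarchy and is not mandated by the C++ standard.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>

struct A { int a; virtual ~A() {} };
struct B : virtual A { int b; };
struct C : virtual A { int c; };
struct D : B, C { int d; };

// Byte distance of p from the start of obj.
std::ptrdiff_t offset_in(const D& obj, const void* p)
{
	return static_cast<const char*>(p) - reinterpret_cast<const char*>(&obj);
}

void check_member_order()
{
	D obj;
	const std::ptrdiff_t off_a = offset_in(obj, &obj.a);
	const std::ptrdiff_t off_b = offset_in(obj, &obj.b);
	const std::ptrdiff_t off_c = offset_in(obj, &obj.c);
	const std::ptrdiff_t off_d = offset_in(obj, &obj.d);

	// B, C and D parts in declaration order, shared virtual base last.
	assert(off_b < off_c);
	assert(off_c < off_d);
	assert(off_d < off_a);

	std::cout << "b " << off_b << ", c " << off_c
	          << ", d " << off_d << ", a " << off_a << '\n';
}
```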

V. Summary

Based on the above analysis, the following points are summarized:

  1. A class without virtual functions is just a struct made up of its member variables. Its size is the total size of those members, computed with byte alignment taken into account.
  2. If a class without virtual functions derives from a base class that also has none, its memory layout is a struct consisting of the base class’s member variables followed by the derived class’s. The member variables are stored in memory in the order they were declared.
  3. A class with virtual functions has a virtual function table, which is shared by all objects of the class. Each object receives a virtual table pointer that points to this table and is placed before the member variables, so an object of a class with virtual functions is one virtual table pointer larger than it would be without them. A virtual table pointer is no different from any other pointer and occupies 8 bytes on a 64-bit system.
  4. When a derived class inherits non-virtually from a class with virtual functions, then whether or not it overrides any of them, its memory layout is the base class layout with the derived class’s member variables added. It inherits the base class’s virtual table pointer. Functions it overrides replace the base class’s entries of the same name in the virtual table; virtual functions unique to the derived class are appended to the end of the table.
  5. If a derived class inherits virtually from a base class that has virtual functions but no member variables, the derived class does not get its own virtual table pointer and virtual function table. In this case the memory layout is the virtual table pointer first, then the member variables of the derived class.
  6. If a derived class inherits virtually from a base class that has both virtual functions and member variables, the derived class gets its own virtual table pointer and virtual function table. The virtual table pointer and member variables of the derived class come first, and the virtual table pointer and member variables of the base class come after.
  7. When multiple inheritance leads to a shared base class, it is better to use virtual inheritance. Otherwise you get not only the headache of ambiguity but also an extra copy of the shared base class. With virtual inheritance, everyone shares a single virtual base class, which saves space and avoids ambiguity.

That’s all for this article. If you found it useful, remember to give it a like.