C++ virtual function table analysis

I found this article very clear about the virtual function table, but the original formatting was hard to read, so I have reformatted it and reposted it here for easier viewing.

Reposted from:

blog.csdn.net/haoel

Preface

Virtual functions in C++ exist to implement polymorphism.

Put simply, polymorphism means using a pointer of the parent type to point to an instance of a child class, and then calling the child class's actual member functions through that parent pointer. This lets the parent pointer take on “multiple shapes”, and it is a generic technique.

A generic technique, put plainly, tries to implement variable algorithms with unchanging code. Templates, RTTI, and virtual functions are all examples: they make the decision either at compile time or at run time.

I will not elaborate on how to use virtual functions here; any C++ book covers that. In this article, I only want to give a clear analysis of how virtual functions are implemented.

Of course, similar articles already exist on the Internet, but I have always found them hard to read: large blocks of code, no pictures, no detailed explanation, no comparisons, and no generalizing from one case to others. That does not help learning, which is why I wanted to write this article. I welcome your comments.

Without further ado, let’s enter the world of virtual functions.

Virtual function table

Anyone who knows C++ should know that virtual functions are implemented through a virtual table, or V-table for short. The table is essentially a list of the addresses of a class's virtual functions. It handles inheritance and overriding, and it guarantees that its contents reflect the functions that should actually be called. A pointer to this table is stored in the memory of every instance of a class that has virtual functions, so when we operate on a subclass through a pointer to the parent class, the virtual table works like a map that shows which function should actually be called.

Let’s focus on the virtual table here. C++ compilers typically place the pointer to the vtable at the very beginning of the object instance (this keeps vtable lookup as fast as possible under multi-level or multiple inheritance). This means that we can get the vtable from the address of the object instance, then walk through the function pointers in it and call the corresponding functions.

After all this talk, you may feel even more confused than before. That is fine. Below is a practical example, and I trust it will make things clear.

Suppose we have a class that looks like this:

class Base {
public:
    virtual void f() { cout << "Base::f" << endl; }
    virtual void g() { cout << "Base::g" << endl; }
    virtual void h() { cout << "Base::h" << endl; }
};

As stated above, we can get the vtable from an instance of Base. Here is the actual routine:

typedef void(*Fun)(void);

Base b;
Fun pFun = NULL;

// The int* casts assume a 32-bit build, where int and pointers have the same size.
cout << "Virtual table address:" << (int*)(&b) << endl;
cout << "Virtual table - first function address:" << (int*)*(int*)(&b) << endl;

// Invoke the first virtual function
pFun = (Fun)*((int*)*(int*)(&b));
pFun();

Actual output (Windows XP + VS2003, Linux 2.6.22 + GCC 4.1.3):

Virtual table address:0012FED4
Virtual table - first function address:0044F148
Base::f

As this example shows, we can obtain the address of the vtable by casting &b to int* and dereferencing it; dereferencing once more yields the address of the first virtual function, Base::f(). The program above confirms this by casting the resulting int to a function pointer and calling it. To call Base::g() and Base::h(), we can use the following code:

(Fun)*((int*)*(int*)(&b)+0);  // Base::f()
(Fun)*((int*)*(int*)(&b)+1);  // Base::g()
(Fun)*((int*)*(int*)(&b)+2);  // Base::h()

You should get the idea by now. What, still a little dizzy? Fair enough, this code does look messy. No problem, let me draw a picture to explain:

Note: In the figure above, I added an extra node at the end of the virtual table. It is the end marker of the virtual table, just like the string terminator “\0”. The value of this marker differs between compilers. Under WinXP + VS2003, it is NULL. Under Ubuntu 7.10 + Linux 2.6.22 + GCC 4.1.3, it is 1 if another virtual table follows and 0 if this is the last virtual table.

Next, I will show what the virtual table looks like “without overriding” and “with overriding”. Not overriding the parent's virtual functions is pointless in practice. I cover that case mainly for contrast, because the comparison makes the internal implementation clearer.

General inheritance (no virtual override)

Now, let’s take a look at what the virtual table looks like under inheritance. Assume an inheritance relationship that looks like this:

Note that in this inheritance relationship, the child class does not override any of the functions of the parent class. So, in an instance of a derived class, its virtual table looks like this:

For the instance Derive d; the virtual function table is:

We can see the following points:

  1. The virtual functions are placed in the table in the order they are declared.
  2. The virtual function of the parent class precedes the virtual function of the child class.

I am sure you can adapt the previous program to verify this yourself.

General inheritance (with virtual override)

Overriding the parent's virtual functions is the obvious thing to do; otherwise, virtual functions would be pointless. Now, let’s see what happens when a virtual function in a subclass overrides a virtual function of the parent class. Suppose we have an inheritance relationship that looks like this.

To show you the effect of inheritance, I’ve only overridden one function of the parent class, f(), in the design of this class. So, for an instance of a derived class, the virtual table would look like this:

As we can see from the table,

  1. The overriding f() is placed in the slot of the virtual table where the parent's virtual function used to be.
  2. Functions that are not overridden remain unchanged.

So, we can see that for a program like this,

Base *b = new Derive();
b->f();

The f() slot in the virtual table of the object that b points to has been replaced with the address of Derive::f(), so Derive::f() is what runs when the call is made. This is how polymorphism is achieved.

Multiple inheritance (no virtual override)

Next, let’s take a look at multiple inheritance, assuming the following class inheritance relationship. Note: The subclass does not override the function of the parent class.

The virtual tables in a subclass instance look like this:

We can see that:

  1. Each parent class has its own virtual table.
  2. The subclass's own virtual member functions are placed in the table of the first parent class. (The “first” parent class is determined by declaration order.)

This is done so that a pointer of any parent type that points to the same subclass instance can still call the actual function.

Multiple inheritance (with virtual override)

Now let’s take a look at what happens when a virtual override occurs.

In the figure below, we override the parent f() function in a subclass.

Here is a diagram for the virtual table in a subclass instance:

We can see that the f() slot in each of the three parent classes' virtual tables is replaced by the subclass's function pointer. This way, a pointer of any parent's static type that points to the subclass instance will call the subclass's f(). For example:

Derive d;

Base1 *b1 = &d;
Base2 *b2 = &d;
Base3 *b3 = &d;

b1->f();  // Derive::f()
b2->f();  // Derive::f()
b3->f();  // Derive::f()

b1->g();  // Base1::g()
b2->g();  // Base2::g()
b3->g();  // Base3::g()

Security

Every time I write a C++ article, I end up criticizing C++, and this one is no exception. After the discussion above, I believe we have a fairly detailed understanding of the virtual table. Water can carry a boat, but it can also capsize it. Now, let’s see what mischief we can do with virtual tables.

Access a subclass’s own virtual function via a pointer to the parent type

As we know, it makes little sense for a child class not to override a parent class's virtual functions, because polymorphism is built on function overriding. Although the figures above show Derive's own virtual functions in the virtual table of Base1, we cannot call them with a statement like the following:

Base1 *b1 = new Derive();
b1->f1();  // Compile error

Any attempt to use a parent-class pointer to call a child-class member function that does not override a parent function is rejected by the compiler, so such a program will not compile. At run time, however, we can reach the virtual table through pointers and bypass the C++ semantics. (To try this, read the code in the appendix below. I am sure you can do it.)

Access non-public virtual functions

In addition, if a virtual function of the parent class is private or protected, it still appears in the vtable, so we can reach such non-public virtual functions just as easily by going through the vtable.

For example:

class Base {
private:
    virtual void f() { cout << "Base::f" << endl; }
};

class Derive : public Base {
};

typedef void(*Fun)(void);

int main() {
    Derive d;
    // Slot 0 of the vtable holds the private Base::f (32-bit build assumed)
    Fun pFun = (Fun)*((int*)*(int*)(&d)+0);
    pFun();
    return 0;
}

Conclusion

C++ is a magical language, and as programmers we never seem to know what it is doing behind our backs. To really know the language, we must understand what is inside C++, including the dangerous parts. Otherwise, it is a programming language that lets you shoot yourself in the foot.

Before ending this article, let me introduce myself. I have worked in software development for ten years and am now a technical director of software development. Technically, I focus mainly on Unix/C/C++, and I like network technologies such as distributed computing, grid computing, P2P, Ajax, and everything related to the Internet. On the management side, I am good at team building, technology trend analysis, and project management. You are welcome to get in touch with me; my MSN and email is: [email protected]

Appendix I: Viewing the vtable in VC

In the VC IDE, we can see the vtable (though not in full) by expanding a class instance in the debugger.

Appendix II: Routines

Here is a routine for virtual table access with multiple inheritance:

#include <cstddef>
#include <iostream>
using namespace std;

class Base1 {
public:
    virtual void f() { cout << "Base1::f" << endl; }
    virtual void g() { cout << "Base1::g" << endl; }
    virtual void h() { cout << "Base1::h" << endl; }
};

class Base2 {
public:
    virtual void f() { cout << "Base2::f" << endl; }
    virtual void g() { cout << "Base2::g" << endl; }
    virtual void h() { cout << "Base2::h" << endl; }
};

class Base3 {
public:
    virtual void f() { cout << "Base3::f" << endl; }
    virtual void g() { cout << "Base3::g" << endl; }
    virtual void h() { cout << "Base3::h" << endl; }
};

class Derive : public Base1, public Base2, public Base3 {
public:
    virtual void f() { cout << "Derive::f" << endl; }
    virtual void g1() { cout << "Derive::g1" << endl; }
};

typedef void(*Fun)(void);

int main()
{
    Fun pFun = NULL;
    Derive d;

    // Treat the object as an array of vtable pointers, one per parent class.
    // The int casts assume a 32-bit build where sizeof(int) == sizeof(void*).
    int** pVtab = (int**)&d;

    //Base1's vtable
    //pFun = (Fun)*((int*)*(int*)((int*)&d+0)+0);
    pFun = (Fun)pVtab[0][0];
    pFun();
    //pFun = (Fun)*((int*)*(int*)((int*)&d+0)+1);
    pFun = (Fun)pVtab[0][1];
    pFun();
    //pFun = (Fun)*((int*)*(int*)((int*)&d+0)+2);
    pFun = (Fun)pVtab[0][2];
    pFun();

    //Derive's vtable
    //pFun = (Fun)*((int*)*(int*)((int*)&d+0)+3);
    pFun = (Fun)pVtab[0][3];
    pFun();

    //The tail of the vtable
    //(streaming a function pointer converts it to bool, so this prints 1 or 0)
    pFun = (Fun)pVtab[0][4];
    cout << pFun << endl;

    //Base2's vtable
    //pFun = (Fun)*((int*)*(int*)((int*)&d+1)+0);
    pFun = (Fun)pVtab[1][0];
    pFun();
    //pFun = (Fun)*((int*)*(int*)((int*)&d+1)+1);
    pFun = (Fun)pVtab[1][1];
    pFun();
    pFun = (Fun)pVtab[1][2];
    pFun();

    //The tail of the vtable
    pFun = (Fun)pVtab[1][3];
    cout << pFun << endl;

    //Base3's vtable
    //pFun = (Fun)*((int*)*(int*)((int*)&d+2)+0);
    pFun = (Fun)pVtab[2][0];
    pFun();
    //pFun = (Fun)*((int*)*(int*)((int*)&d+2)+1);
    pFun = (Fun)pVtab[2][1];
    pFun();
    pFun = (Fun)pVtab[2][2];
    pFun();

    //The tail of the vtable
    pFun = (Fun)pVtab[2][3];
    cout << pFun << endl;

    return 0;
}

(Please credit the author and source when reprinting. Do not use for commercial purposes without permission.)

Copyright notice: This is an original article by CSDN blogger “Haoel”, licensed under the CC 4.0 BY-SA agreement. When reprinting, please include the link to the original source and this statement. Original link: blog.csdn.net/haoel/artic…