New River, New Water

VTable with Example

The VTable (virtual method table) is the core mechanism that mainstream compilers use to implement runtime polymorphism in C++. At compile time, a VTable is generated for each class that declares or inherits virtual functions, including a base class that only declares pure virtual functions.

A VTable is essentially an array of function addresses for a particular class, and “the address of the VTable” usually means the address of its first function entry. A base class and its subclasses store their implementations of the virtual functions in the same order, so a call picks the correct function by moving a fixed offset from the start of the table.

There is also the VPtr (V-pointer, that’s how I pronounce it). It is a hidden pointer stored in each object of such a class. At runtime, when the object is constructed, its VPtr is set to the address of its class’s VTable. This is what makes dynamic binding possible: after Person* p = new PoorPerson(), a virtual call through p looks up the function in PoorPerson’s VTable.
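To make the array-plus-pointer idea concrete, here is a minimal hand-written sketch of what a compiler might generate. Everything in it (ManualPerson, Method, the Slot enum, the free functions) is a hypothetical name invented for illustration, and real generated code also passes a this pointer to each method, which is left out here for brevity.

```cpp
#include <string>

// Hypothetical model only: a "method" is a plain function pointer.
using Method = std::string (*)();

// Slot order is shared by every "class": 0 = getSalary, 1 = topics.
enum Slot { kGetSalary = 0, kTopics = 1, kSlotCount = 2 };

std::string poorGetSalary() { return "12345"; }
std::string poorTopics() { return "Paying House Debt"; }
std::string techGetSalary() { return "123456"; }
std::string techTopics() { return "Package;House;Children Education"; }

// One table per "class", with the entries in the same order.
const Method kPoorVTable[kSlotCount] = {poorGetSalary, poorTopics};
const Method kTechVTable[kSlotCount] = {techGetSalary, techTopics};

// Every "object" carries a pointer to its class's table (the VPtr).
struct ManualPerson {
  const Method* vptr;
};

// A "virtual call": follow the vptr, then index by the fixed slot.
std::string callGetSalary(const ManualPerson& p) { return p.vptr[kGetSalary](); }
std::string callTopics(const ManualPerson& p) { return p.vptr[kTopics](); }
```

Constructing ManualPerson{kPoorVTable} and calling callTopics on it returns the PoorPerson string. Pointing the same object's vptr at kTechVTable changes what the call returns without touching the call site, which is the essence of dynamic binding.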

According to [1], the VPtr sits at the beginning of an object (relying on this seems to be UB, so what follows is for learning purposes only). That is, in Person* p = new PoorPerson(), the pointer p points not only to the beginning of the chunk of heap memory that is “an instance of PoorPerson”, but also to the VPtr of that instance. With this in mind, we can do some magic.
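One way to poke at this claim without overwriting anything is to copy out the first pointer-sized chunk of an object and compare it across objects. The sketch below assumes an implementation that stores the VPtr first, as GCC and Clang do under the Itanium C++ ABI; the C++ standard does not promise this layout. Base, DerivedA, DerivedB and firstWord are hypothetical names used only here.

```cpp
#include <cstring>

struct Base {
  virtual ~Base() = default;
  virtual int id() const { return 0; }
};
struct DerivedA : Base {
  int id() const override { return 1; }
};
struct DerivedB : Base {
  int id() const override { return 2; }
};

// Copies the first pointer-sized chunk of the object's bytes.
// Assumption: on this implementation that chunk is the VPtr.
const void* firstWord(const Base& obj) {
  const void* word = nullptr;
  std::memcpy(&word, static_cast<const void*>(&obj), sizeof(word));
  return word;
}
```

On such an implementation, two DerivedA objects should yield the same value from firstWord, because they share one VTable, while a DerivedA and a DerivedB should yield different values.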

Consider the following example:

#include <cstdio>
#include <string>

class Person {
public:
  virtual std::string getSalary() = 0;
  virtual std::string topics() = 0;
};

class PoorPerson : public Person {
public:
  std::string getSalary() override {
    return "12345";
  }
  std::string topics() override {
    return "Paying House Debt";
  }
};

class SFTechWorker : public Person {
public:
  std::string getSalary() override {
    return "123456";
  }
  std::string topics() override {
    return "Package;House;Children Education";
  }
};

int main() {
  Person* techWorker = new SFTechWorker();
  void** techWorkerVTable = reinterpret_cast<void**>(techWorker);

  Person* poor = new PoorPerson();
  void** poorObjAddr = reinterpret_cast<void**>(poor);
  *techWorkerVTable = *poorObjAddr;
  printf("%s\n", techWorker->topics().data());

  // poorObjAddr is the address of the poor object, which is also where its
  // VPtr is stored, so *poorObjAddr is the address of PoorPerson's VTable.
  void** poorVTableAddr = reinterpret_cast<void**>(*poorObjAddr);
  void** shifted = poorVTableAddr-1;
  *techWorkerVTable = shifted;
  printf("%s\n", techWorker->topics().data());
}

At *techWorkerVTable = *poorObjAddr;, I assigned the value of poor’s VPtr (that is, the address of class PoorPerson’s VTable) to the VPtr of techWorker. When techWorker then tries to invoke topics(), it actually invokes the method of class PoorPerson.

At void** shifted = poorVTableAddr-1; *techWorkerVTable = shifted;, I shifted the VTable’s address by an offset of -1 (one pointer-sized entry) and assigned the result to the VPtr of techWorker. As mentioned before, the address of the VTable is actually the address of its first method entry. After this shift, every lookup lands one entry earlier, so when techWorker tries to invoke the 2nd method, topics(), it actually invokes the 1st method, getSalary().
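The effect of the -1 shift can be reproduced in well-defined code with a plain array. This is only a sketch under an assumed layout: one non-method entry sits just before the method slots, loosely mirroring the Itanium C++ ABI, where the RTTI pointer precedes the first virtual function entry. The placeholder function and the names kStorage, kVTable and callSlot are hypothetical.

```cpp
#include <string>

using Method = std::string (*)();

// Stand-in for the non-method entry that precedes the method slots.
std::string typeInfoPlaceholder() { return "<not a method>"; }
std::string getSalary() { return "12345"; }
std::string topics() { return "Paying House Debt"; }

// Assumed layout: [placeholder][getSalary][topics]
const Method kStorage[3] = {typeInfoPlaceholder, getSalary, topics};

// What the VPtr normally holds: the address of the first method entry.
const Method* const kVTable = &kStorage[1];

// Calls slot `index` through an arbitrary table pointer.
std::string callSlot(const Method* table, int index) { return table[index](); }
```

Calling callSlot(kVTable, 1) reaches topics, while callSlot(kVTable - 1, 1) reaches getSalary, because slot 1 of the shifted table is slot 0 of the real one. In the real program the entry before the first method is not a callable function, which is one more reason the trick is for learning only.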

Can you guess the output of the above program? This is what I got with clang++ 18.1.3 (Ubuntu):

Paying House Debt
12345

[1] https://stackoverflow.com/a/1905871


