This is the next part of the series, which mainly discusses how Swift protocols are implemented.

It covers Type Metadata, the Existential Container memory model for protocols, how generics are implemented, and generic specialization.

This post is also published on my personal blog

Type Metadata


In Objective-C, type information is expressed through the MetaClass model, and each instance references its class through the isa pointer. This is the core mechanism of the entire Objective-C Runtime.

How is Type Metadata expressed in Swift?

Swift Runtime generates a Metadata Record for each type (Class, Struct, Enum, Protocol, Tuple, Function, etc.).

  • For nominal types (e.g., classes, structures, enumerations), the corresponding Metadata Record is statically generated by the compiler at compile time.

  • For intrinsic types (such as tuples, functions, and protocols) and for instances of generic types, Metadata Records are generated dynamically at runtime.

  • Each type has a unique Metadata Record: the same type always maps to the same Metadata Record.

Different types of Metadata contain different information layouts, but they have a common header that contains:

  • VWT (Value Witness Table) pointer – points to the VWT, a table of function pointers for the basic operations on instances of this type: allocating memory, copying, destroying, and so on.

    In addition to the function pointer mentioned above, VWT also contains the basic information of the instance of this type, such as size, alignment, stride, etc.

  • Kind – Marks the type of Metadata, such as 0 — Class, 1 — Struct, 2 — Enum, 12 — Protocol, and so on.

The part relevant to this discussion is the VWT. For more information about Metadata, see swift/TypeMetadata.rst at main · apple/swift · GitHub.
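The size, alignment, and stride recorded in the VWT are the same quantities the standard library exposes through MemoryLayout. The sketch below is only an illustration: the Point struct is a hypothetical example type, and the comparison of metatype identities is a user-level way to observe that one type corresponds to one Metadata Record.

```swift
// Hypothetical example type, not taken from the runtime sources.
struct Point {
    var x: Double
    var y: Double
}

// Layout facts for Point, as reported by the standard library.
let pointSize = MemoryLayout<Point>.size            // bytes used by one instance
let pointAlignment = MemoryLayout<Point>.alignment  // required alignment in bytes
let pointStride = MemoryLayout<Point>.stride        // distance between array elements

// The metatype obtained from an instance and the one written statically
// have the same identity.
let p = Point(x: 0, y: 0)
let sameRecord = ObjectIdentifier(type(of: p)) == ObjectIdentifier(Point.self)

print(pointSize, pointAlignment, pointStride, sameRecord)
```

On a 64-bit platform the two Double fields give a size of two words.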

Inside Protocol


Existential Container

Instances of classes, structs, and enums have a well-defined “model” that guides their memory layout.

That “model” is the definition of the Class, Struct, or Enum itself: the members it contains.

A Protocol has no definitive “model”, because the real type behind it can be anything. So what is the memory layout of a variable of Protocol type?

Swift uses a model called the Existential Container to guide the memory layout of Protocol variables.

Such Containers fall into two categories:

  • Opaque Container – for a Protocol without a class constraint. The real type behind such a Protocol could be a class, a struct, an enum, and so on, so storage has to handle every case.

    struct OpaqueExistentialContainer {
      void *fixedSizeBuffer[3];
      Metadata *type;
      WitnessTable *witnessTables[NUM_WITNESS_TABLES];
    };

    As above, OpaqueExistentialContainer consists of three members:

    • fixedSizeBuffer – a buffer three pointers in size. When the real type's size (after alignment) is at most 3 words, its contents are stored directly in fixedSizeBuffer; otherwise the contents are stored in separately allocated heap memory and a pointer to that memory is stored in fixedSizeBuffer;

    • type – points to the Metadata of the real type and, most importantly, gives access to the VWT used to perform the various memory operations;

    • witnessTables – points to the Protocol Witness Table (PWT). The PWT stores the addresses of the real type's implementations of the protocol's functions.

  • Class Container – for a class-constrained Protocol. The real type behind it can only be a class, and class instances are always allocated on the heap.

    Thus, the Existential Container only needs a pointer to the heap memory.

    Also, since a class instance already contains a pointer to its Metadata, there is no need to store a Metadata pointer in the Existential Container:

    struct ClassExistentialContainer {
      HeapObject *value;
      WitnessTable *witnessTables[NUM_WITNESS_TABLES];
    };

    As above, ClassExistentialContainer contains only two members:

    • Value – pointer to heap memory;

    • WitnessTables – PWT pointer.
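The two layouts can be checked from user code with MemoryLayout. This is a minimal sketch, assuming two hypothetical protocols (Drawable and ClassDrawable) that each contribute exactly one witness table; sizes are expressed in pointer-sized words so the comparison does not depend on the platform's pointer width.

```swift
// Hypothetical protocols used only to measure the containers.
protocol Drawable {
    func draw()
}

protocol ClassDrawable: AnyObject {
    func draw()
}

let word = MemoryLayout<UnsafeRawPointer>.size

// Opaque container: 3-word buffer + Metadata pointer + 1 witness table pointer.
let opaqueWords = MemoryLayout<Drawable>.size / word

// Class container: heap object pointer + 1 witness table pointer.
let classWords = MemoryLayout<ClassDrawable>.size / word

print(opaqueWords, classWords)
```

The container size depends only on the protocol, never on the concrete type stored in it, which is what lets variables of Protocol type have a fixed size.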

Here’s an example:

As shown in the figure, the protocol Drawable has no class constraint, so the corresponding Existential Container is OpaqueExistentialContainer:

  • A Point instance occupies two words of memory (<= 3), so a variable of type Drawable holding a Point stores its contents (x, y) directly in the OpaqueExistentialContainer's buffer;

  • A Line instance occupies four words of memory, so memory has to be allocated on the heap for its contents (x0, y0, x1, y1);

  • The type field in the Existential Container points to the Metadata of the real type behind it;

  • The function pointer in PWT points to a function of the real type.
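The Point and Line case can be reproduced as a small program. This is a sketch under assumptions: the struct definitions are reconstructed from the field names above, draw() returns a String only so that the result is observable, and the word counts in the comments hold on a 64-bit platform.

```swift
protocol Drawable {
    func draw() -> String
}

// Two Double fields: two words on a 64-bit platform.
struct Point: Drawable {
    var x = 0.0, y = 0.0
    func draw() -> String { "Point(\(x), \(y))" }
}

// Four Double fields: four words on a 64-bit platform.
struct Line: Drawable {
    var x0 = 0.0, y0 = 0.0, x1 = 1.0, y1 = 1.0
    func draw() -> String { "Line(\(x0), \(y0)) -> (\(x1), \(y1))" }
}

let word = MemoryLayout<UnsafeRawPointer>.size
let pointWords = MemoryLayout<Point>.size / word   // fits the 3-word buffer
let lineWords = MemoryLayout<Line>.size / word     // too large for the buffer

// Wrapped as Drawable, both values occupy a container of the same size.
let shapes: [Drawable] = [Point(), Line()]
let containerWords = MemoryLayout<Drawable>.size / word

// Each call is dispatched through the PWT of the real type.
let output = shapes.map { $0.draw() }
print(pointWords, lineWords, containerWords, output)
```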

The corresponding compiler-generated (pseudo)code is as follows:

let point: Drawable = Point(x: 0, y: 0)
point.draw()

// Compiler generated (pseudo) code
let _point: Point = Point(x: 0, y: 0)
var point: OpaqueExistentialContainer = OpaqueExistentialContainer()
let metadata = _point.type
let vwt = metadata.vwt
// Point fits in the buffer, so its contents are copied inline
vwt.copy(&(point.fixedSizeBuffer), _point)
point.type = metadata
point.witnessTables = PWT(Point.draw)
point.pwt.draw()
vwt.dealloc(point)

let line: Drawable = Line(x0: 0, y0: 0, x1: 1, y1: 1)
line.draw()

// Compiler generated (pseudo) code
let _line: Line = Line(x0: 0, y0: 0, x1: 1, y1: 1)
var line: OpaqueExistentialContainer = OpaqueExistentialContainer()
let metadata = _line.type
let vwt = metadata.vwt
// Line does not fit in the buffer, so heap memory is allocated first
line.fixedSizeBuffer[0] = vwt.allocate()
vwt.copy(line.fixedSizeBuffer, _line)
line.type = metadata
line.witnessTables = PWT(Line.draw)
line.pwt.draw()
vwt.dealloc(line)

For more information about TypeLayout, see swift/TypeLayout.rst at main · apple/swift · GitHub.

As the pseudocode above shows, the compiler does a lot of work behind the scenes for variables of protocol type, and this work has a performance cost.

Protocol Type Stored Properties

As the previous section showed, a variable of protocol type is in essence an instance of an Existential Container (OpaqueExistentialContainer/ClassExistentialContainer).

Thus, when a variable of protocol type is used as a stored property, the memory it occupies inside the host instance is one Existential Container instance (figure below from Understanding Swift Performance · WWDC2016):
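A quick way to observe this is to compare the size of a struct with two protocol-typed stored properties against the size of a single container. The Pair type below is a hypothetical illustration, and Drawable is redeclared so the snippet stands alone.

```swift
protocol Drawable {
    func draw()
}

// Each stored property of protocol type is embedded as a whole container.
struct Pair {
    var first: Drawable
    var second: Drawable
}

let containerSize = MemoryLayout<Drawable>.size
let pairSize = MemoryLayout<Pair>.size

print(containerSize, pairSize)
```

Whatever concrete values first and second later hold, Pair keeps this size; values that do not fit the inline buffer live on the heap.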

Summary

Protocol differs from common types (Class, Struct, Enum) with the following characteristics:

  • Using the Existential Container (OpaqueExistentialContainer/ClassExistentialContainer) as the memory model;

  • Instances that occupy at most 3 words of memory are stored directly in the Existential Container's buffer; otherwise, memory is allocated on the heap and the pointer is stored in buffer[0];

  • Memory management (allocating, copying, destroying) related methods are stored in VWT;

  • Dynamic dispatch is implemented through PWT.

Generics


Generics are supported by most languages, such as Swift, Java, and C++ (templates), as an important means of improving code flexibility and reusability.

Swift combines generics with protocols, which makes generics even more flexible.

Let’s take a quick look at how generics are implemented in Swift.

In Swift, generic parameters can carry Type Constraints, which can be classes or protocols.

Thus, Swift generics can be divided into three categories based on their type constraints:

  • No Constraints

  • Class Constraints

  • Protocol Constraints

The following is a brief analysis of these three types of situations.
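As a quick orientation, the three categories look like this in source form. The functions and types below (firstOf, describe, larger, Shape, Circle) are hypothetical examples chosen for illustration and are not taken from the sections that follow.

```swift
class Shape {
    func name() -> String { "Shape" }
}

final class Circle: Shape {
    override func name() -> String { "Circle" }
}

// No constraint: values of T can only be moved or copied.
func firstOf<T>(_ a: T, _ b: T) -> T {
    return a
}

// Class constraint: every member of Shape is available on T.
func describe<T: Shape>(_ s: T) -> String {
    return s.name()
}

// Protocol constraint: the requirements of Comparable are available on T.
func larger<T: Comparable>(_ a: T, _ b: T) -> T {
    return a < b ? b : a
}

print(firstOf(1, 2), describe(Circle()), larger(3, 7))
```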

SIL (swift/SIL.rst at main · apple/swift · GitHub) offers a window into the implementation principles behind Swift.

swiftc demo.swift -O -emit-sil -o demo-sil.s

As shown above, you can use the swiftc command to generate SIL.

-O enables compiler optimization, which makes the generated SIL cleaner and more efficient.

Specialization of Generics, as discussed later, can only happen with -O optimization.

No Constraints

In fact, such generics can perform very few operations. Objects cannot be instantiated (created) on the heap, nor can any methods be executed.

@inline(never)  // Prevent the compiler from inlining this function
func swapTwoValues<T>(_ a: inout T, _ b: inout T) {
    let temp = a
    a = b
    b = temp
}

As above, swapTwoValues is used to exchange the values of two variables without any constraints on its generic parameter T.

The corresponding SIL is as follows. Key points:

  • Line 8: alloc_stack allocates memory on the stack for a value of type T (temp);

  • Line 9: copy_addr performs a memory-level copy (no init method is executed);

  • Line 12: dealloc_stack releases that stack memory;

  • The remaining instructions perform memory copies.

Note that for reference types, $T is a pointer: the stack memory holds the pointer, not the referenced object itself.

1   // swapTwoValues<A>(_:_:)
2   sil hidden [noinline] @$s4main13swapTwoValuesyyxz_xztlF : $@convention(thin) <T> (@inout T, @inout T) -> () {
3   // %0 "a" // users: %5, %6, %2
4   // %1 "b" // users: %7, %6, %3
5   bb0(%0 : $*T, %1 : $*T):
6    debug_value_addr %0 : $*T, var, name "a", argno 1 // id: %2
7    debug_value_addr %1 : $*T, var, name "b", argno 2 // id: %3
8    %4 = alloc_stack $T, let, name "temp"           // users: %8, %7, %5
9    copy_addr %0 to [initialization] %4 : $*T       // id: %5
10   copy_addr [take] %1 to %0 : $*T                 // id: %6
11   copy_addr [take] %4 to [initialization] %1 : $*T // id: %7
12   dealloc_stack %4 : $*T                          // id: %8
13   %9 = tuple ()                                   // user: %10
14   return %9 : $()                                 // id: %10
15 } // end sil function '$s4main13swapTwoValuesyyxz_xztlF'
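For reference, here is the unconstrained function as a complete program. The variable names and the two call sites are illustrative assumptions; the point is that the same generic body handles a value type of any size because it only ever copies T.

```swift
@inline(never)
func swapTwoValues<T>(_ a: inout T, _ b: inout T) {
    // Only copies are possible here: T exposes no methods or initializers.
    let temp = a
    a = b
    b = temp
}

var x = 1, y = 2
swapTwoValues(&x, &y)

var s = "left", t = "right"
swapTwoValues(&s, &t)

print(x, y, s, t)
```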

Class Constraints

class Shape {
  required
  init() {}

  func draw() -> Bool {
    return true
  }
}

class Triangle: Shape {
  required
  init() {}

  override
  func draw() -> Bool {
    return true
  }
}

@inline(never)
func drawShape<T: Shape>(_ s: T) {
  let s0 = T()
  s0.draw()
}

As the SIL below shows:

  • The generic type is upcast to the constraint type (lines 5-6);

  • The methods to execute (init and draw, lines 7-10) are looked up in the vtable with the class_method instruction;

  • Instances of the generic type can be created on the heap (line 8);

  • In essence, polymorphism is achieved through the virtual function table.

In summary, with the base class as a type constraint, all methods exposed by the base class can be executed through the virtual function table.

1  // drawShape<A>(_:)
2  sil hidden [noinline] @$s4main9drawShapeyyxAA0C0CRbzlF : $@convention(thin) <T where T : Shape> (@guaranteed T) -> () {
3  // %0 "s"
4  bb0(%0 : $T):
5    %1 = metatype $@thick T.Type                    // user: %2
6    %2 = upcast %1 : $@thick T.Type to $@thick Shape.Type // users: %4, %3
7    %3 = class_method %2 : $@thick Shape.Type, #Shape.init!allocator : (Shape.Type) -> () -> Shape, $@convention(method) (@thick Shape.Type) -> @owned Shape // user: %4
8    %4 = apply %3(%2) : $@convention(method) (@thick Shape.Type) -> @owned Shape // users: %7, %5, %6
9    %5 = class_method %4 : $Shape, #Shape.draw : (Shape) -> () -> Bool, $@convention(method) (@guaranteed Shape) -> Bool // user: %6
10   %6 = apply %5(%4) : $@convention(method) (@guaranteed Shape) -> Bool
11   strong_release %4 : $Shape                      // id: %7
12   %8 = tuple ()                                   // user: %9
13   return %8 : $()                                 // id: %9
14 } // end sil function '$s4main9drawShapeyyxAA0C0CRbzlF'
15
16 sil_vtable Shape {
17   #Shape.init!allocator: (Shape.Type) -> () -> Shape : @$s4main5ShapeCACycfC    // Shape.__allocating_init()
18   #Shape.draw: (Shape) -> () -> Bool : @$s4main5ShapeC4drawSbyF    // Shape.draw()
19   #Shape.deinit!deallocator: @$s4main5ShapeCfD    // Shape.__deallocating_deinit
20 }
21
22 sil_vtable Triangle {
23   #Shape.init!allocator: (Shape.Type) -> () -> Shape : @$s4main8TriangleCACycfC [override]    // Triangle.__allocating_init()
24   #Shape.draw: (Shape) -> () -> Bool : @$s4main8TriangleC4drawSbyF [override]    // Triangle.draw()
25   #Triangle.deinit!deallocator: @$s4main8TriangleCfD    // Triangle.__deallocating_deinit
26 }
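The effect of the vtable lookup is easy to observe at runtime. In this sketch, draw() returns a String instead of Bool and drawShape returns that value; both changes are assumptions made only so the dispatched implementation is visible.

```swift
class Shape {
    required init() {}
    func draw() -> String { "Shape" }
}

class Triangle: Shape {
    required init() {}
    override func draw() -> String { "Triangle" }
}

@inline(never)
func drawShape<T: Shape>(_ s: T) -> String {
    let s0 = T()        // required init, resolved for the actual type T
    return s0.draw()    // resolved through the vtable of the instance's class
}

let viaBase = drawShape(Shape())
let viaSub = drawShape(Triangle())
print(viaBase, viaSub)
```

The same generic body yields a different result for each argument type because the override in Triangle replaces the vtable entry.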

Protocol Constraints

The protocols discussed here have no class constraint. A protocol that only classes can adopt, when used as a generic constraint, behaves the same as the class constraints discussed in the previous section.

@inline(never)
func equal<T: Equatable>(_ a: T, _ b: T) -> Bool {
  let a0 = a
  let b0 = b
  return a0 == b
}

The following SIL shows that:

  • alloc_stack allocates memory on the stack for the generic type (whether it is a value type or a reference type);

  • Memory is again copied with copy_addr;

  • The witness_method instruction looks up, on the generic type, the method required by the Protocol (a lookup in the PWT).

// equal<A>(_:_:)
sil hidden [noinline] @$s4main5equalySbx_xtSQRzlF : $@convention(thin) <T where T : Equatable> (@in_guaranteed T, @in_guaranteed T) -> Bool {
// %0 "a" // users: %5, %2
// %1 "b" // users: %8, %3
bb0(%0 : $*T, %1 : $*T):
  debug_value_addr %0 : $*T, let, name "a", argno 1 // id: %2
  debug_value_addr %1 : $*T, let, name "b", argno 2 // id: %3
  %4 = alloc_stack $T, let, name "a0"             // users: %9, %10, %8, %5
  copy_addr %0 to [initialization] %4 : $*T       // id: %5
  %6 = metatype $@thick T.Type                    // user: %8
  %7 = witness_method $T, #Equatable."==" : <Self where Self : Equatable> (Self.Type) -> (Self, Self) -> Bool : $@convention(witness_method: Equatable) <τ_0_0 where τ_0_0 : Equatable> (@in_guaranteed τ_0_0, @in_guaranteed τ_0_0, @thick τ_0_0.Type) -> Bool // user: %8
  %8 = apply %7<T>(%4, %1, %6) : $@convention(witness_method: Equatable) <τ_0_0 where τ_0_0 : Equatable> (@in_guaranteed τ_0_0, @in_guaranteed τ_0_0, @thick τ_0_0.Type) -> Bool // user: %11
  destroy_addr %4 : $*T                           // id: %9
  dealloc_stack %4 : $*T                          // id: %10
  return %8 : $Bool                               // id: %11
} // end sil function '$s4main5equalySbx_xtSQRzlF'
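To see that the witness lookup reaches the conforming type's own implementation, the sketch below calls equal with a hypothetical CaseInsensitive type whose == ignores letter case. The type and the sample values are assumptions for illustration.

```swift
// Hypothetical type with a custom == witness.
struct CaseInsensitive: Equatable {
    var text: String

    static func == (lhs: CaseInsensitive, rhs: CaseInsensitive) -> Bool {
        return lhs.text.lowercased() == rhs.text.lowercased()
    }
}

@inline(never)
func equal<T: Equatable>(_ a: T, _ b: T) -> Bool {
    // == is found through the Equatable witness table supplied for T.
    return a == b
}

let sameIgnoringCase = equal(CaseInsensitive(text: "Swift"), CaseInsensitive(text: "SWIFT"))
let sameInts = equal(1, 2)
print(sameIgnoringCase, sameInts)
```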

Here’s another example:

protocol Drawable {
  init()
  func draw() -> Bool
}

@inline(never)
func drawShape<T: Drawable>(_ s: T) -> T {
  var s0 = T()
  s0.draw()
  return s0
}

As can be seen from the following SIL:

  • Instances of the generic type (value types or reference types alike) can be created: the init required by the protocol is looked up with witness_method, and whether the instance lives on the stack or on the heap depends on the real type.

// drawShape<A>(_:)
sil hidden [noinline] @$s4main9drawShapeyxxAA8DrawableRzlF : $@convention(thin) <T where T : Drawable> (@in_guaranteed T) -> @out T {
// %0 "$return_value" // users: %4, %6
// %1 "s"
bb0(%0 : $*T, %1 : $*T):
  %2 = metatype $@thick T.Type                    // user: %4
  %3 = witness_method $T, #Drawable.init!allocator : <Self where Self : Drawable> (Self.Type) -> () -> Self : $@convention(witness_method: Drawable) <τ_0_0 where τ_0_0 : Drawable> (@thick τ_0_0.Type) -> @out τ_0_0 // user: %4
  %4 = apply %3<T>(%0, %2) : $@convention(witness_method: Drawable) <τ_0_0 where τ_0_0 : Drawable> (@thick τ_0_0.Type) -> @out τ_0_0
  %5 = witness_method $T, #Drawable.draw : <Self where Self : Drawable> (Self) -> () -> Bool : $@convention(witness_method: Drawable) <τ_0_0 where τ_0_0 : Drawable> (@in_guaranteed τ_0_0) -> Bool // user: %6
  %6 = apply %5<T>(%0) : $@convention(witness_method: Drawable) <τ_0_0 where τ_0_0 : Drawable> (@in_guaranteed τ_0_0) -> Bool
  %7 = tuple ()                                   // user: %8
  return %7 : $()                                 // id: %8
} // end sil function '$s4main9drawShapeyxxAA8DrawableRzlF'


sil_witness_table hidden Shape: Drawable module main {
  method #Drawable.init!allocator: <Self where Self : Drawable> (Self.Type) -> () -> Self : @$s4main5ShapeCAA8DrawableA2aDPxycfCTW    // protocol witness for Drawable.init() in conformance Shape
  method #Drawable.draw: <Self where Self : Drawable> (Self) -> () -> Bool : @$s4main5ShapeCAA8DrawableA2aDP4drawSbyFTW    // protocol witness for Drawable.draw() in conformance Shape
}
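The sketch below runs the same generic function with one value type and one reference type. Dot and Canvas are hypothetical conforming types, and drawShape returns the result of draw() alongside the new instance (an assumption, so the outcome can be inspected).

```swift
protocol Drawable {
    init()
    func draw() -> Bool
}

// Value type conformance: instances need no heap allocation of their own.
struct Dot: Drawable {
    init() {}
    func draw() -> Bool { return true }
}

// Reference type conformance: instances are heap objects.
final class Canvas: Drawable {
    init() {}
    func draw() -> Bool { return false }
}

@inline(never)
func drawShape<T: Drawable>(_ s: T) -> (T, Bool) {
    let s0 = T()                // init witness of the real type
    return (s0, s0.draw())      // draw witness of the real type
}

let dotResult = drawShape(Dot()).1
let canvasResult = drawShape(Canvas()).1
print(dotResult, canvasResult)
```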

This simple analysis shows the implementation differences between generic types for No Constraints, Class Constraints, and Protocol Constraints:

  • No Constraints does very little. You can’t execute any methods. You can only allocate memory for generic types on the Stack and perform memory copies.

  • Class Constraints generics can create new instances on the Heap, and method calls are implemented through the virtual function table (VTable);

  • Protocol Constraints generics can create instances of generic types on Stack or Heap as needed, whether they are value or reference types;

  • Protocol Constraints generics implement method calls through PWT;

  • That is, method calls in generics are dispatched dynamically, either through vtable or PWT.

Specialization of Generics

As the previous section showed, method calls on generic types are dispatched dynamically (through the vtable or the PWT), which has a performance cost.

To reduce that cost, the Swift compiler performs Specialization of Generics.

Specialization means generating a version of a function for one specific type, which turns the generic function into a non-generic one and allows its method calls to be dispatched statically.

@inline(never)
func swapTwoValues<T>(_ a: inout T, _ b: inout T) {
  let temp = a
  a = b
  b = temp
}

var a = 1
var b = 2
swapTwoValues(&a, &b)

For example, when swapTwoValues is called with an Int argument, the compiler generates an Int version of the method:

// specialized swapTwoValues<A>(_:_:)
sil shared [noinline] @$s4main13swapTwoValuesyyxz_xztlFSi_Tg5 : $@convention(thin) (@inout Int, @inout Int) -> () {
// %0 "a" // users: %6, %4, %2
// %1 "b" // users: %7, %5, %3
bb0(%0 : $*Int, %1 : $*Int):
  debug_value_addr %0 : $*Int, var, name "a", argno 1 // id: %2
  debug_value_addr %1 : $*Int, var, name "b", argno 2 // id: %3
  %4 = load %0 : $*Int                            // user: %7
  %5 = load %1 : $*Int                            // user: %6
  store %5 to %0 : $*Int                          // id: %6
  store %4 to %1 : $*Int                          // id: %7
  %8 = tuple ()                                   // user: %9
  return %8 : $()                                 // id: %9
} // end sil function '$s4main13swapTwoValuesyyxz_xztlFSi_Tg5'

So, when does generic specialization happen?

The general rule: when compiling a generic function, the compiler must be able to see its callers, and the concrete types used at those call sites must be inferable.

In the simplest case, the generic function and its caller are compiled together in the same source file.

In addition, if Whole-Module Optimization is turned on at compile time, generic calls within the same module can also be specialized.

For details on Whole-Module Optimization, see Swift.org – Whole-Module Optimization in Swift 3; it is not covered here.
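Conceptually, the specialized function is what you would get by writing the Int version by hand. The sketch below puts the generic function next to such a hand-written counterpart; swapTwoInts is a hypothetical name used for illustration, not something the compiler emits under that name.

```swift
@inline(never)
func swapTwoValues<T>(_ a: inout T, _ b: inout T) {
    let temp = a
    a = b
    b = temp
}

// Hand-written counterpart of the Int specialization: plain loads and
// stores on Int, with no generic parameter left.
func swapTwoInts(_ a: inout Int, _ b: inout Int) {
    let temp = a
    a = b
    b = temp
}

// The caller and the generic function sit in the same file and the
// argument type is known to be Int, which is the simplest case in which
// the optimizer can specialize the call.
var a = 1, b = 2
swapTwoValues(&a, &b)

var c = 1, d = 2
swapTwoInts(&c, &d)

print(a, b, c, d)
```

Both calls produce the same result; the difference lies only in the code generated for them when optimization is enabled.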

Summary

  • Swift generates a Metadata Record for each type, including the Value Witness Table (VWT).

  • Protocol uses the Existential Container as its memory model, and all variables of Protocol type are instances of an Existential Container.

  • Protocol method calls are dispatched dynamically through the PWT.

  • Generic calls are specialized when certain conditions are met to improve performance.

References

Swift-evolution · Opaque Result Types

OpaqueTypes

Different flavors of type erasure in Swift

Opaque Return Types and Type Erasure

Phantom types in Swift

How to use phantom types in Swift

swift/TypeMetadata.rst at main · apple/swift · GitHub

swift/TypeLayout.rst at main · apple/swift · GitHub

Swift Type Metadata

Understanding Swift Performance · WWDC2016

Swift.org – Whole-Module Optimization in Swift 3