This is the 21st day of my participation in the August More Text Challenge

This article mainly analyzes the usage and the underlying storage structure of Swift protocols

Basic usage of the protocol

  • Syntax: a protocol is declared with the protocol keyword
protocol MyProtocol {
    // body
}
  • Classes, structs, and enums can all conform to protocols. To conform to multiple protocols, separate them with commas
struct CJLTeacher: Protocol1, Protocol2 {
    // body
}
  • If a class has a superclass, list the superclass before the protocols it conforms to
class CJLTeacher: NSObject, Protocol1, Protocol2 {
    // body
}

Add attributes to the protocol

  • You can add properties to a protocol, but note two things:
    • 1. A property requirement must state explicitly whether the property is read-only ({ get }) or read-write ({ get set })

    • 2. A property requirement is declared as a variable property, that is, with var rather than let

protocol CJLTest {
    var age: Int {get set}
}
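The two rules above can be illustrated with a minimal sketch. The names Named, Person, and Counter are hypothetical and chosen only for this illustration. It shows that a { get } requirement can be satisfied by a constant or a computed property, while a { get set } requirement needs something settable.

```swift
// Hypothetical names, for illustration only.
protocol Named {
    var name: String { get }        // read-only requirement
    var visits: Int { get set }     // read-write requirement
}

struct Person: Named {
    let name: String                // a constant satisfies { get }
    var visits: Int = 0             // { get set } needs a settable stored property
}

struct Counter: Named {
    var count = 0
    var name: String { return "counter" }   // a computed property also satisfies { get }
    var visits: Int {                        // computed property with getter and setter
        get { return count }
        set { count = newValue }
    }
}

var person = Person(name: "CJL")
person.visits += 1
var counter = Counter()
counter.visits = 5
print(person.name, person.visits)
print(counter.name, counter.count)
```

Note that the var-only rule applies to the declaration inside the protocol; the conforming type is free to use let for a read-only requirement.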

Define methods in the protocol

  • To define a method in a protocol, declare only its name, parameter list, and return value; the protocol itself provides no method body
    • A concrete type conforms to the protocol and implements the methods declared in it
protocol MyProtocol {
    func doSomething()
    static func teach()
}
class CJLTeacher: MyProtocol{
    func doSomething() {
        print("CJLTeacher doSomething")
    }
    
    static func teach() {
        print("teach")
    }
}
var t = CJLTeacher()
t.doSomething()
CJLTeacher.teach()
  • A protocol can also declare an initializer. A class that implements it must mark the initializer with the required keyword (a final class may omit it)
protocol MyProtocol {
    init(age: Int)
}
class CJLTeacher: MyProtocol{
    var age: Int
    required init(age: Int) {
        self.age = age
    }
}
  • If a protocol should only be adopted by classes, make it inherit from AnyObject. A struct that then tries to conform to it fails to compile
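A minimal sketch of a class-only protocol follows. ClassOnly and Teacher are hypothetical names used only for this illustration, and the struct conformance is left commented out because it does not compile.

```swift
// Hypothetical names, for illustration only.
protocol ClassOnly: AnyObject {
    var age: Int { get set }
}

final class Teacher: ClassOnly {
    var age: Int
    init(age: Int) { self.age = age }
}

// A struct cannot conform to a class-only protocol; uncommenting this
// line produces a compile error:
// struct Student: ClassOnly { var age: Int }

let a: ClassOnly = Teacher(age: 18)
let b = a          // copies the reference, not the instance
b.age = 20         // allowed on a `let` because ClassOnly is class-bound
print(a.age)
```

Because every conforming type is known to be a class, both constants refer to the same instance, so the change made through b is visible through a.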

Protocol progression – Use the protocol as a type

In addition to the basic usage above, a protocol can also be used as a type:

  • 1. As a parameter type or return type in a function, method, or initializer

  • 2. As the type of a constant, variable, or property

  • 3. As the element type of an array, dictionary, or other container
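The three uses listed above can be sketched in a few lines. Square, Canvas, and larger are hypothetical names introduced only for this example.

```swift
protocol Shape {
    var area: Double { get }
}

// Hypothetical conforming types, for illustration only.
struct Square: Shape {
    var side: Double
    var area: Double { return side * side }
}
struct Circle: Shape {
    var radius: Double
    var area: Double { return radius * radius * 3.14 }
}

// 1. Protocol as parameter type and return type
func larger(_ a: Shape, _ b: Shape) -> Shape {
    return a.area >= b.area ? a : b
}

// 2. Protocol as the type of a property
struct Canvas {
    var background: Shape
}

// 3. Protocol as the element type of containers
let shapes: [Shape] = [Square(side: 2), Circle(radius: 1)]
let byName: [String: Shape] = ["square": Square(side: 2)]

let canvas = Canvas(background: larger(shapes[0], shapes[1]))
let totalArea = shapes.reduce(0.0) { $0 + $1.area }
print(canvas.background.area, totalArea, byName.count)
```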

What does the following code print? (Implemented by inheriting from a base class)

class Shape {
    var area: Double {
        get { return 0 }
    }
}
class Circle: Shape {
    var radius: Double
    init(_ radius: Double) {
        self.radius = radius
    }
    override var area: Double {
        get { return radius * radius * 3.14 }
    }
}
class Rectangle: Shape {
    var width, height: Double
    init(_ width: Double, _ height: Double) {
        self.width = width
        self.height = height
    }
    override var area: Double {
        get { return width * height }
    }
}
var circle: Shape = Circle.init(10.0)
var rectangle: Shape = Rectangle.init(10.0, 20.0)
var shapes: [Shape] = [circle, rectangle]
for shape in shapes {
    print(shape.area)
}
// Print result:
// 314.0
// 200.0

The elements of the array have a fixed size: Shape is a class here, so each element is a reference that occupies 8 bytes, and the array stores the addresses of the instances

Implemented with a protocol

  • The code above relies on inheritance, so area in the base class must have a default implementation. The same code can instead be written with a protocol
protocol Shape {
    var area: Double { get }
}
class Circle: Shape {
    var radius: Double
    init(_ radius: Double) {
        self.radius = radius
    }
    var area: Double {
        get { return radius * radius * 3.14 }
    }
}
class Rectangle: Shape {
    var width, height: Double
    init(_ width: Double, _ height: Double) {
        self.width = width
        self.height = height
    }
    var area: Double {
        get { return width * height }
    }
}
var circle: Shape = Circle.init(10.0)
var rectangle: Shape = Rectangle.init(10.0, 20.0)
var shapes: [Shape] = [circle, rectangle]
for shape in shapes {
    print(shape.area)
}
// Print result:
// 314.0
// 200.0

When Shape is a class, the array stores the addresses of reference-type instances. What does the array store when Shape is a protocol?

    • If area is not declared in the Shape protocol and only gets a default implementation from an extension, what is printed?
protocol Shape { }
extension Shape {
    var area: Double {
        get { return 0 }
    }
}
class Circle: Shape {
    var radius: Double
    init(_ radius: Double) {
        self.radius = radius
    }
    var area: Double {
        get { return radius * radius * 3.14 }
    }
}
class Rectangle: Shape {
    var width, height: Double
    init(_ width: Double, _ height: Double) {
        self.width = width
        self.height = height
    }
    var area: Double {
        get { return width * height }
    }
}
var circle: Shape = Circle.init(10.0)
print(circle.area)
// Print result:
// 0.0

It prints 0.0 because a member declared only in an extension is statically dispatched: once the code is compiled and linked, the address of the function to call is fixed, and a conforming type cannot override it. This can be verified from the SIL code

Protocol sample code analysis

Let's examine the SIL with a simple example

  • [Example 1]: What does the following code print?
protocol MyProtocol {
    func teach()
}
extension MyProtocol {
    func teach() { print("MyProtocol") }
}
class MyClass: MyProtocol {
    func teach() { print("MyClass") }
}
let object: MyProtocol = MyClass()
object.teach()
let object1: MyClass = MyClass()
object1.teach()
// Print result:
// MyClass
// MyClass

Both calls print MyClass because teach is declared as a requirement in the MyProtocol protocol

  • Looking at the SIL, how do the two calls differ?
    • For object, declared as MyProtocol, teach is called through witness_method: the function address is looked up in the PWT (protocol witness table), and the witness in turn calls teach through the class's function table

    • For object1, declared as MyClass, teach is found directly through the class's function table, based on the actual type of the instance

The SIL contains both the protocol witness table and the class function table. The SIL of the protocol witness for teach shows that it internally calls the teach method in MyClass's function table

  • [Example 2]: If the declaration of teach is removed from the MyProtocol protocol, what is printed?
// What happens if the declaration is removed from the protocol?
protocol MyProtocol { }
extension MyProtocol {
    func teach() { print("MyProtocol") }
}
class MyClass: MyProtocol {
    func teach() { print("MyClass") }
}
let object: MyProtocol = MyClass()
object.teach()
let object1: MyClass = MyClass()
object1.teach()
// Print result:
// MyProtocol
// MyClass

The two results differ because the teach method implemented in the MyProtocol extension cannot be overridden by the class: they are two separate methods, not the same one

  • Look at the underlying SIL code
    • The first call prints MyProtocol because it calls the teach method of the protocol extension, whose address is fixed at compile time, that is, it is dispatched through a static function address
    • The second call prints MyClass because, as in the previous example, it goes through the class's function table

In the SIL witness table for this example there is no teach entry.

  • A method declared in the protocol is stored in the PWT, and the witness in the PWT uses class_method to look up the implementation in the class's V-Table and dispatch to it

  • A function that is not declared in the protocol and only gets a default implementation from an extension has its address fixed at compile time, and a conforming class cannot override it
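The two dispatch rules can be seen side by side in one small sketch. Speaker, Person, declared, and extensionOnly are hypothetical names for this illustration; the methods return strings instead of printing so the results are easy to compare.

```swift
// Hypothetical names, for illustration only.
protocol Speaker {
    func declared() -> String          // requirement: dispatched through the witness table
}

extension Speaker {
    func declared() -> String { return "Speaker.declared" }            // default implementation
    func extensionOnly() -> String { return "Speaker.extensionOnly" }  // not a requirement: static dispatch
}

final class Person: Speaker {
    func declared() -> String { return "Person.declared" }
    func extensionOnly() -> String { return "Person.extensionOnly" }   // a separate method, not an override
}

let viaProtocol: Speaker = Person()
let viaClass = Person()

print(viaProtocol.declared())       // Person.declared
print(viaProtocol.extensionOnly())  // Speaker.extensionOnly
print(viaClass.extensionOnly())     // Person.extensionOnly
```

The only difference between the two methods is whether the protocol declares them, and that alone decides whether a call through a protocol-typed variable reaches the class's implementation.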

PWT storage location of the protocol

When analyzing function dispatch we saw that the V-Table is stored in the metadata. Where is the protocol's PWT stored?

  • What does the following code print?
protocol Shape {
    var area: Double { get }
}
class Circle: Shape {
    var radius: Double
    init(_ radius: Double) {
        self.radius = radius
    }
    var area: Double {
        get { return radius * 3.14 }
    }
}
var circle: Shape = Circle(10.0)
print(MemoryLayout.size(ofValue: circle))
print(MemoryLayout.stride(ofValue: circle))
var circle1: Circle = Circle(10.0)
print(MemoryLayout.size(ofValue: circle1))
print(MemoryLayout.stride(ofValue: circle1))
// Print result:
// 40
// 40
// 8
// 8
  • First, inspect the memory of circle with LLDB

  • Then look at the corresponding SIL code. It contains one more step than usual, init_existential_addr, which can be read as: initialize the memory referenced by circle with an existential container that holds a Circle. In plain terms, the Circle instance is wrapped in an existential container and stored in that memory

The official SIL documentation describes init_existential_addr in the same way. An existential container is a special data type generated by the compiler to manage values of different types that conform to the same protocol. Because those types occupy different amounts of memory, wrapping them in an existential container gives them a uniform storage layout
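The uniform layout can be observed with MemoryLayout. The sketch below uses two hypothetical structs, Small and Big, and assumes a 64-bit platform, where the container for a single non-class-bound protocol is five machine words (40 bytes).

```swift
protocol Shape {
    var area: Double { get }
}

// Hypothetical payload types of different sizes, for illustration only.
struct Small: Shape {
    var a = 1.0                               // 1 Double
    var area: Double { return a }
}
struct Big: Shape {
    var a = 1.0, b = 2.0, c = 3.0, d = 4.0    // 4 Doubles
    var area: Double { return a }
}

let small: Shape = Small()
let big: Shape = Big()

// The concrete types differ in size...
print(MemoryLayout<Small>.size, MemoryLayout<Big>.size)
// ...but a variable of type Shape always has the size of the container.
print(MemoryLayout.size(ofValue: small), MemoryLayout.size(ofValue: big))
```

Because MemoryLayout.size(ofValue:) reports the size of the static type, both protocol-typed variables report the container size, regardless of what is stored inside.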

  • The IR code is analyzed as follows
define i32 @main(i32 %0, i8** %1) #0 {
entry:
  %2 = bitcast i8** %1 to i8*
  ; $s4main6CircleCMa is: type metadata accessor for main.Circle
  %3 = call swiftcc %swift.metadata_response @"$s4main6CircleCMa"(i64 0) #7
  %4 = extractvalue %swift.metadata_response %3, 0
  ; $s4main6CircleCyACSdcfC is: main.Circle.__allocating_init(Swift.Double) -> main.Circle
  %5 = call swiftcc %T4main6CircleC* @"$s4main6CircleCyACSdcfC"(double 1.000000e+01, %swift.type* swiftself %4)
  ; Store the metadata %4. "i32 0, i32 1" means: no offset on the structure, select its second field
  ; ==> type { [24 x i8], metadata, i8** }
  store %swift.type* %4, %swift.type** getelementptr inbounds (%T4main5ShapeP, %T4main5ShapeP* @"$s4main6circleAA5Shape_pvp", i32 0, i32 1), align 8
  ; $s4main6CircleCAA5ShapeAAWP is: protocol witness table for main.Circle : main.Shape in main
  ; Store the protocol witness table into the i8** field of %T4main5ShapeP ==> type { [24 x i8], metadata, PWT }
  store i8** getelementptr inbounds ([2 x i8*], [2 x i8*]* @"$s4main6CircleCAA5ShapeAAWP", i32 0, i32 0), i8*** getelementptr inbounds (%T4main5ShapeP, %T4main5ShapeP* @"$s4main6circleAA5Shape_pvp", i32 0, i32 2), align 8
  ; $s4main6circleAA5Shape_pvp is: main.circle : main.Shape
  ; %T4main6CircleC = type <{ %swift.refcounted, %TSd }> ==> type { HeapObject, metadata, PWT }
  ; Store the instance address %5 (a %T4main6CircleC*) through a %T4main6CircleC** pointer.
  ; The instance reference takes 8 bytes, so it occupies the first 8 bytes of the structure
  store %T4main6CircleC* %5, %T4main6CircleC** bitcast (%T4main5ShapeP* @"$s4main6circleAA5Shape_pvp" to %T4main6CircleC**), align 8
  .....

Imitating the structure

Based on the analysis above, we can imitate the whole internal structure in Swift

// Define the structures
struct HeapObject {
    var type: UnsafeRawPointer
    var refCount1: UInt32
    var refCount2: UInt32
}
// %T4main5ShapeP = type { [24 x i8], %swift.type*, i8** }
struct protocolData {
    // 24 x i8: the value buffer
    var value1: UnsafeRawPointer
    var value2: UnsafeRawPointer
    var value3: UnsafeRawPointer
    // type: stores the metadata, which is used to find the Value Witness Table
    var type: UnsafeRawPointer
    // i8**: stores the PWT
    var pwt: UnsafeRawPointer
}

// Define the protocol and the class
protocol Shape {
    var area: Double { get }
}
class Circle: Shape {
    var radius: Double
    init(_ radius: Double) {
        self.radius = radius
    }
    var area: Double {
        get { return radius * 3.14 }
    }
}
var circle: Shape = Circle(10.0)

// Rebind the memory of circle to protocolData
withUnsafePointer(to: &circle) { ptr in
    ptr.withMemoryRebound(to: protocolData.self, capacity: 1) { pointer in
        print(pointer.pointee)
    }
}

// Print result:
// protocolData(value1: 0x0000000100550100, value2: 0x0000000000000000, value3: 0x0000000000000000, type: 0x0000000100008180, pwt: 0x0000000100004028)

LLDB debugging confirms that value1 points to a HeapObject and that type is the metadata. The address 0x0000000100004028 can be checked with nm + xcrun to verify that it is the PWT

What if I change class to struct?

  • What if the class is changed to a struct? As shown below
protocol Shape {
    var area: Double { get }
}
struct Rectangle: Shape {
    var width, height: Double
    init(_ width: Double, _ height: Double) {
        self.width = width
        self.height = height
    }
    var area: Double {
        get { return width * height }
    }
}
var rectangle: Shape = Rectangle(10.0, 20.0)

struct HeapObject {
    var type: UnsafeRawPointer
    var refCount1: UInt32
    var refCount2: UInt32
}
// %T4main5ShapeP = type { [24 x i8], %swift.type*, i8** }
struct protocolData {
    // 24 x i8: the value buffer
    var value1: UnsafeRawPointer
    var value2: UnsafeRawPointer
    var value3: UnsafeRawPointer
    // type: stores the metadata, which is used to find the Value Witness Table
    var type: UnsafeRawPointer
    // i8**: stores the PWT
    var pwt: UnsafeRawPointer
}
// Rebind the memory of rectangle to protocolData
withUnsafePointer(to: &rectangle) { ptr in
    ptr.withMemoryRebound(to: protocolData.self, capacity: 1) { pointer in
        print(pointer.pointee)
    }
}
// Print result:
// protocolData(value1: 0x4024000000000000, value2: 0x4034000000000000, value3: 0x0000000000000000, type: 0x0000000100004098, pwt: 0x0000000100004028)

LLDB debugging matches the printed result: value1 stores 10 and value2 stores 20 (0x4024000000000000 and 0x4034000000000000 are the bit patterns of the Doubles 10.0 and 20.0)

  • Check its IR code
define i32 @main(i32 %0, i8** %1) #0 {
entry:
  %2 = bitcast i8** %1 to i8*
  ; Call the Rectangle initializer, which returns the two Doubles
  %3 = call swiftcc { double, double } @"$s4main9RectangleVyACSd_SdtcfC"(double 1.000000e+01, double 2.000000e+01)
  %4 = extractvalue { double, double } %3, 0
  %5 = extractvalue { double, double } %3, 1
  ; $s4main9RectangleVMf has type <{ i8**, i64, <{ i32, i32, i32, i32, i32, i32, i32 }>*, i32, i32 }>
  ; "i32 0, i32 1" selects the second field of that structure, the type metadata, which is stored into the metadata field
  store %swift.type* bitcast (i64* getelementptr inbounds (<{ i8**, i64, <{ i32, i32, i32, i32, i32, i32, i32 }>*, i32, i32 }>, <{ i8**, i64, <{ i32, i32, i32, i32, i32, i32, i32 }>*, i32, i32 }>* @"$s4main9RectangleVMf", i32 0, i32 1) to %swift.type*), %swift.type** getelementptr inbounds (%T4main5ShapeP, %T4main5ShapeP* @"$s4main9rectangleAA5Shape_pvp", i32 0, i32 1), align 8
  ; Store the protocol witness table $s4main9RectangleVAA5ShapeAAWP into the PWT field
  store i8** getelementptr inbounds ([2 x i8*], [2 x i8*]* @"$s4main9RectangleVAA5ShapeAAWP", i32 0, i32 0), i8*** getelementptr inbounds (%T4main5ShapeP, %T4main5ShapeP* @"$s4main9rectangleAA5Shape_pvp", i32 0, i32 2), align 8
  ; Store the Doubles %4 and %5 into the buffer, at field offsets 0 and 1 respectively
  store double %4, double* getelementptr inbounds (%T4main9RectangleV, %T4main9RectangleV* bitcast (%T4main5ShapeP* @"$s4main9rectangleAA5Shape_pvp" to %T4main9RectangleV*), i32 0, i32 0, i32 0), align 8
  store double %5, double* getelementptr inbounds (%T4main9RectangleV, %T4main9RectangleV* bitcast (%T4main5ShapeP* @"$s4main9rectangleAA5Shape_pvp" to %T4main9RectangleV*), i32 0, i32 1, i32 0), align 8
  ......

What if you have three properties in the struct?

  • What if the struct has three properties?
struct Rectangle: Shape {
    var width, height: Double
    var width1 = 30.0   // third stored property
    init(_ width: Double, _ height: Double) {
        self.width = width
        self.height = height
    }
    var area: Double {
        get { return width * height }
    }
}
// Print result:
// protocolData(value1: 0x4024000000000000, value2: 0x4034000000000000, value3: 0x403E000000000000, type: 0x0000000100004098, pwt: 0x0000000100004028)

As the result shows, the third property is stored in value3 (0x403E000000000000 is the bit pattern of 30.0)

What if there are four attributes in the struct?

struct Rectangle: Shape {
    var width, height: Double
    var width1 = 0
    var height1 = 0
    init(_ width: Double, _ height: Double) {
        self.width = width
        self.height = height
    }
    var area: Double {
        get { return width * height }
    }
}
// Print result:
// protocolData(value1: 0x0000000100546a50, value2: 0x0000000000000000, value3: 0x0000000000000000, type: 0x00000001000040c0, pwt: 0x0000000100004050)

Here value1 is a heap address, and the heap memory at that address stores the values of the four properties

Summary of the protocol underlying storage structure

The underlying storage structure of a protocol-typed value is therefore as follows:

  • The first 24 bytes store the property values of the class/struct that conforms to the protocol (for a class, the reference to the instance). If 24 bytes are not enough, memory is allocated on the heap for the value, and the first 8 bytes of the 24 store the heap address. (Whether the value is first written into the buffer and moved once it turns out to be too large, or the heap space is allocated directly, is examined in a later question.)

  • The last 16 bytes store the type metadata (through which the VWT is found) and the PWT, 8 bytes each.
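The inline-versus-heap behavior can be probed without LLDB by reading the first machine word of the container. This is only a sketch: it assumes a 64-bit platform and relies on the layout described above, which is an implementation detail rather than a documented API. Fits, TooBig, and firstWord are hypothetical names.

```swift
protocol Shape {
    var area: Double { get }
}

// Hypothetical payload types, for illustration only.
struct Fits: Shape {                 // 3 x 8 = 24 bytes: fits the value buffer
    var a = 10.0, b = 20.0, c = 30.0
    var area: Double { return a * b }
}
struct TooBig: Shape {               // 4 x 8 = 32 bytes: exceeds the value buffer
    var a = 10.0, b = 20.0, c = 30.0, d = 40.0
    var area: Double { return a * b }
}

// Hypothetical helper: reads the first 8 bytes of the existential container.
func firstWord(of shape: inout Shape) -> UInt64 {
    return withUnsafeBytes(of: &shape) { $0.load(as: UInt64.self) }
}

var fits: Shape = Fits()
var tooBig: Shape = TooBig()
let inlineWord = firstWord(of: &fits)
let boxedWord = firstWord(of: &tooBig)

// For the 24-byte struct the first word is the bit pattern of `a` itself;
// for the 32-byte struct it is a heap address instead.
print(String(inlineWord, radix: 16), String(boxedWord, radix: 16))
print(inlineWord == Double(10.0).bitPattern, boxedWord == Double(10.0).bitPattern)
```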

Continue to analyze

In the following example, the for-in loop can pick the right area for each element mainly because of the protocol's PWT: the witness in the PWT looks the method up with class_method, and the container stores the metadata at run time, so the matching V-Table can be found from the metadata to complete the method call

protocol Shape {
    var area: Double { get }
}
class Circle: Shape {
    var radius: Double
    init(_ radius: Double) {
        self.radius = radius
    }
    var area: Double {
        get { return radius * radius * 3.14 }
    }
}
class Rectangle: Shape {
    var width, height: Double
    init(_ width: Double, _ height: Double) {
        self.width = width
        self.height = height
    }
    var area: Double {
        get { return width * height }
    }
}
var circle: Shape = Circle.init(10.0)
var rectangle: Shape = Rectangle.init(10.0, 20.0)
var shapes: [Shape] = [circle, rectangle]
// Each element is dispatched to the area of its own type
for shape in shapes {
    print(shape.area)
}
  • Back to the struct: if it is assigned to another variable, do the two variables hold the same memory?
protocol Shape {
    var area: Double { get }
}
struct Rectangle: Shape {
    var width, height: Double
    var width1 = 0
    var height1 = 0
    init(_ width: Double, _ height: Double) {
        self.width = width
        self.height = height
    }
    var area: Double {
        get { return width * height }
    }
}
var rectangle: Shape = Rectangle(10.0, 20.0)
// Assign it to another protocol variable
var rectangle1: Shape = rectangle

// View the memory structure
struct HeapObject {
    var type: UnsafeRawPointer
    var refCount1: UInt32
    var refCount2: UInt32
}
// %T4main5ShapeP = type { [24 x i8], %swift.type*, i8** }
struct protocolData {
    // 24 x i8: the value buffer
    var value1: UnsafeRawPointer
    var value2: UnsafeRawPointer
    var value3: UnsafeRawPointer
    // type: stores the metadata, which is used to find the Value Witness Table
    var type: UnsafeRawPointer
    // i8**: stores the PWT
    var pwt: UnsafeRawPointer
}
// Rebind the memory of rectangle to protocolData
withUnsafePointer(to: &rectangle) { ptr in
    ptr.withMemoryRebound(to: protocolData.self, capacity: 1) { pointer in
        print(pointer.pointee)
    }
}

The print result shows that the memory of the two protocol variables holds the same contents

  • Now modify the width property through rectangle1 (width must be declared in the protocol for this). The modified code looks like this
protocol Shape {
    var width: Double { get set }
    var area: Double { get }
}
struct Rectangle: Shape {
    var width: Double
    // var width, height: Double
    var height: Double
    init(_ width: Double, _ height: Double) {
        self.width = width
        self.height = height
    }
    var area: Double {
        get { return width * height }
    }
}
var rectangle: Shape = Rectangle(10.0, 20.0)
// Assign it to another protocol variable
var rectangle1: Shape = rectangle

// View the memory structure
struct HeapObject {
    var type: UnsafeRawPointer
    var refCount1: UInt32
    var refCount2: UInt32
}
// %T4main5ShapeP = type { [24 x i8], %swift.type*, i8** }
struct protocolData {
    // 24 x i8: the value buffer
    var value1: UnsafeRawPointer
    var value2: UnsafeRawPointer
    var value3: UnsafeRawPointer
    // type: stores the metadata, which is used to find the Value Witness Table
    var type: UnsafeRawPointer
    // i8**: stores the PWT
    var pwt: UnsafeRawPointer
}
// Rebind the memory of both variables to protocolData
withUnsafePointer(to: &rectangle) { ptr in
    ptr.withMemoryRebound(to: protocolData.self, capacity: 1) { pointer in
        print(pointer.pointee)
    }
}
withUnsafePointer(to: &rectangle1) { ptr in
    ptr.withMemoryRebound(to: protocolData.self, capacity: 1) { pointer in
        print(pointer.pointee)
    }
}
rectangle1.width = 50.0

LLDB debugging shows that after width is changed through rectangle1, the heap address that stores its data changes. This is copy-on-write: at the time of the copy nothing has been modified yet, so the two variables refer to the same heap memory. When the second variable changes a property value, the contents of the original heap memory are first copied to new heap memory, and the value is changed there
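At the language level, the effect of this copy-on-write is simply value semantics. The sketch below is a hedged illustration: the extra fields depth and weight are hypothetical and exist only to push the struct past 24 bytes so that, under the layout described above, its storage lives on the heap.

```swift
protocol Shape {
    var width: Double { get set }
    var area: Double { get }
}

struct Rectangle: Shape {
    var width: Double
    var height: Double
    var depth = 1.0, weight = 1.0    // hypothetical extra fields: 32 bytes in total
    var area: Double { return width * height }
}

var rectangle: Shape = Rectangle(width: 10.0, height: 20.0)
var rectangle1 = rectangle           // the copy can share storage until one side is written

rectangle1.width = 50.0              // the write affects rectangle1 only

print(rectangle.width, rectangle1.width)
print(rectangle.area, rectangle1.area)
```

Whatever the runtime does with the shared heap storage, the original variable never observes the change made through the copy.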

Question 1: If you change struct to class, is it also copy on write?

In the example above, if the type that conforms to the protocol is a class (that is, struct is changed to class), is it still copy-on-write?

class Rectangle: Shape{
    var width: Double
//    var width, height: Double
    var height: Double
    init(_ width: Double, _ height: Double) {
        self.width = width
        self.height = height
    }

    var area: Double{
        get{
            return width * height
        }
    }
}

The LLDB result shows that the heap address does not change after the property value is modified, which matches the usual understanding of value types and reference types

  • Value types do not share state when they are passed around

  • Reference types share state when they are passed around
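A short sketch of the reference-type case, using a hypothetical class Box: both protocol-typed variables hold a reference to the same instance, so a change made through one is visible through the other.

```swift
protocol Shape {
    var width: Double { get set }
}

// Hypothetical class, for illustration only.
final class Box: Shape {
    var width: Double
    init(width: Double) { self.width = width }
}

let original: Shape = Box(width: 10.0)
var copy = original          // copies the reference stored in the container

copy.width = 50.0            // mutates the single shared instance

print(original.width, copy.width)
```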

Question 2: If the value exceeds 24 bytes, is it first stored into value1 and moved to the heap once the buffer turns out to be too small, or is heap space allocated directly?

As shown below, four properties are defined in the struct

protocol Shape {
    var area: Double { get }
}
struct Rectangle: Shape {
    var width: Double
    var height: Double
    var width1: Double
    var height1: Double
    init(_ width: Double, _ height: Double, _ width1: Double, _ height1: Double) {
        self.width = width
        self.height = height
        self.width1 = width1
        self.height1 = height1
    }
    var area: Double {
        get { return width * height }
    }
}
var rectangle: Shape = Rectangle(2.0, 2.0, 2.0, 2.0)
  • The IR code shows that heap space is allocated first and the property values are then stored directly in that heap space

Question 3: What if the stored value type is String?

As shown below, the stored value type is String. Check the underlying storage

protocol Shape {
    var area: Double { get }
}
struct Rectangle: Shape {
    var height: String
    init(_ height: String) {
        self.height = height
    }
    var area: Double {
        get { return 0 }
    }
}
var rectangle: Shape = Rectangle("CJL")

// View the memory structure
struct HeapObject {
    var type: UnsafeRawPointer
    var refCount1: UInt32
    var refCount2: UInt32
}
// %T4main5ShapeP = type { [24 x i8], %swift.type*, i8** }
struct protocolData {
    // 24 x i8: the value buffer
    var value1: UnsafeRawPointer
    var value2: UnsafeRawPointer
    var value3: UnsafeRawPointer
    // type: stores the metadata, which is used to find the Value Witness Table
    var type: UnsafeRawPointer
    // i8**: stores the PWT
    var pwt: UnsafeRawPointer
}
// Rebind the memory of rectangle to protocolData
withUnsafePointer(to: &rectangle) { ptr in
    ptr.withMemoryRebound(to: protocolData.self, capacity: 1) { pointer in
        print(pointer.pointee)
    }
}
  • Check its IR code

  • LLDB debugging shows that the String is likewise stored by value in the value buffer
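Why a String property still fits inline can be checked with MemoryLayout. This is a minimal sketch assuming a 64-bit platform, where a String value occupies 16 bytes; Label is a hypothetical struct that mirrors the example above.

```swift
protocol Shape {
    var area: Double { get }
}

// Hypothetical struct with a single String property, for illustration only.
struct Label: Shape {
    var height: String
    var area: Double { return 0 }
}

let bufferSize = 3 * MemoryLayout<UnsafeRawPointer>.size   // 24 bytes on 64-bit
let stringSize = MemoryLayout<String>.size
let labelSize = MemoryLayout<Label>.size

// The struct is no larger than the value buffer, so it can be stored inline.
print(stringSize, labelSize, bufferSize, labelSize <= bufferSize)
```

Longer strings keep their characters in separate storage managed by String itself, so the size of the String value, and therefore of the struct, does not grow with the text.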

Summary

The underlying storage structure of the protocol is as follows:

  • The first 24 bytes, officially called the Value Buffer, store the current value

  • If the value exceeds the maximum size of the Value Buffer (24 bytes), heap space is allocated for it and the buffer stores the heap address

    • Value types are copied on write: a copy initially shares the same contents. When a value is modified, the reference count is checked first; if it is greater than 1, new heap space is allocated and the modified value is written there. The purpose is to improve memory utilization and reduce heap consumption, which improves performance

    • Reference types use the same heap address, because the copied variable shares state with the original variable

Conclusion

  • Classes, structs, and enums can all conform to protocols

    • 1. Use commas (,) to separate multiple protocols

    • 2. If a class has a superclass, the superclass is listed before the protocols

  • You can add properties to a protocol, with two rules:

    • 1. A property must be explicitly read-only (get) or read-write (get + set)

    • 2. Properties are declared with var

  • Methods can be declared in a protocol by giving the method name, parameter list, and return value. The implementation is provided either in an extension of the protocol or by the type that conforms to it

  • Initializers can also be declared in a protocol, and a class must use the required keyword when it implements them

  • If the protocol should only be adopted by classes, it must inherit from AnyObject

  • Protocols can also be used as types, in the following scenarios:

    • 1. As a parameter type or return type in a function, method, or initializer

    • 2. As the type of a constant, variable, or property

    • 3. As the element type of an array, dictionary, or other container

  • The underlying storage structure of a protocol-typed value: 24-byte Value Buffer + type metadata used to find the VWT (8 bytes) + PWT (8 bytes)

    • 1. The first 24 bytes, officially called the Value Buffer, store the property values of the class/struct that conforms to the protocol

    • 2. If the value exceeds the maximum size of the Value Buffer

      • (1) Value types are copied on write

      • (2) Reference types use the same heap address

    • 3. The last 16 bytes store the type metadata (for the VWT) and the PWT, 8 bytes each.