Definition of closure

Let’s first look at the definition in the official Swift documentation:

Closures are independent blocks of code that can be passed around and used in your code at will. Closures in Swift are similar to blocks in Objective-C/C, and anonymous functions in other programming languages.

Closures can capture and store references to any constants and variables from the context in which they are defined. This is known as closing over those constants and variables. Swift handles all of the memory management of capturing for you.

So closures are similar to anonymous functions, with the added ability to capture and store variables from their surrounding context. We will explore the underlying implementation by starting from this capture behavior.

Closure capture

Let’s start with two simple pieces of code:

var age = 18
let printAge = {
    print(age)
}
age += 1
printAge() // 19

Clearly the closure captures the age variable: even though age changes after the closure is created, the closure still prints the latest value.

var age = 18
let printAge = {
    [age] in
    print(age)
}
age += 1
printAge() // 18

This code adds [age] in. The bracketed part is called the closure capture list. So why does the printed value of age stay at 18 once the capture list is added?

It looks like a plain closure captures a reference to age, while a closure with a capture list captures a copy of age. Let’s dig a little deeper.
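To see both behaviors side by side, here is a minimal sketch. The function captureDemo and the closure names readLive and readSnapshot are made up for illustration and do not appear in the original code.

```swift
// Hypothetical helper: returns what each kind of closure observes
// after the captured variable has been changed.
func captureDemo() -> (byReference: Int, byCaptureList: Int) {
    var age = 18
    // No capture list: the closure refers to the variable itself.
    let readLive = { age }
    // Capture list: the current value of age is copied when the closure is created.
    let readSnapshot = { [age] in age }
    age += 1
    return (readLive(), readSnapshot())
}

let result = captureDemo()
print(result.byReference, result.byCaptureList) // 19 18
```

Wrapping the code in a function keeps age a true local variable, which is the case the rest of this article analyzes.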

Closure capture list

Let’s start with a simple closure capture list and simplify the code:

var age = 18
let printAge = {
    [age] in
    let temp = age
}

Then look at the SIL file:

We can see that the closure is defined as @closure #1 in our main function. Let’s go to the implementation of @closure #1:

// closure #1 in 
sil private @closure #1 () -> () in main : $@convention(thin) (Int) -> () {
// %0 "age"                                       // users: %2, %1
bb0(%0 : $Int):
  debug_value %0 : $Int, let, name "age", argno 1 // id: %1
  debug_value %0 : $Int, let, name "temp"         // id: %2
  %3 = tuple ()                                   // user: %4
  return %3 : $()                                 // id: %4
} // end sil function 'closure #1 () -> () in main'

Something interesting shows up here. The type of the closure’s underlying function changes from () -> () to (Int) -> (), and age is passed in as the first argument. We can make the example a little more complicated to verify this:

var age = 18
var name = "Tom"
let printAge = {
    [age, name] (weight: Double) in
    var temp = age
    let tempName = name
}

In the SIL file:

// closure #1 in 
sil private @closure #1 (Swift.Double) -> () in main : $@convention(thin) (Double, Int, @guaranteed String) -> () {
// %0 "weight"                                    // user: %3
// %1 "age"                                       // users: %7, %4
// %2 "name"                                      // users: %8, %5
bb0(%0 : $Double, %1 : $Int, %2 : $String):
  debug_value %0 : $Double, let, name "weight", argno 1 // id: %3
  debug_value %1 : $Int, let, name "age", argno 2 // id: %4
  debug_value %2 : $String, let, name "name", argno 3 // id: %5
  %6 = alloc_stack $Int, var, name "temp"         // users: %7, %9
  store %1 to %6 : $*Int                          // id: %7
  debug_value %2 : $String, let, name "tempName"  // id: %8
  dealloc_stack %6 : $*Int                        // id: %9
  %10 = tuple ()                                  // user: %11
  return %10 : $()                                // id: %11
} // end sil function 'closure #1 (Swift.Double) -> () in main'

This time I put two values in the capture list, and the closure itself takes a weight argument. In the SIL file, the closure’s implementation changes from (Double) -> () to (Double, Int, String) -> (). The values in the capture list are copied by being passed in as function parameters.

To summarize how the closure capture list works: the compiler adds parameters to the closure’s underlying function, one for each value in the capture list and of the same type, placed after the closure’s original parameters. The values in the capture list are then passed in as arguments, and passing them as arguments is what copies them.
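The summary above can be imitated by hand in plain Swift. This is only an analogy, not what the compiler emits: the names loweredBody and makeClosure are hypothetical, and the body returns a string so the result can be inspected.

```swift
// Hand-written stand-in for the lowered closure body:
// the original parameter comes first, the captured values follow.
func loweredBody(weight: Double, age: Int, name: String) -> String {
    return "\(name) \(age) \(weight)"
}

// Stand-in for closure creation: the captured values are copied
// at this point by being passed as ordinary arguments.
func makeClosure(age: Int, name: String) -> (Double) -> String {
    return { weight in loweredBody(weight: weight, age: age, name: name) }
}

var age = 18
var name = "Tom"
let describe = makeClosure(age: age, name: name)

// Later changes to the originals are invisible to the closure.
age += 1
name = "Jerry"
print(describe(60.5)) // Tom 18 60.5
```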

When reference types appear in a closure capture list, we often see [weak self] and [unowned self] used to break retain cycles. In the SIL file they are lowered to parameters that carry the corresponding marker, so the weak or unowned reference is handled inside the function body. I won’t walk through that here.

In a retain cycle, self holds the closure and the closure holds self. Their lifetimes are mostly the same, so as long as the closure never outlives self, [unowned self] is the better way to break the cycle:

  • With [weak self], self becomes an optional, so every property or method call on self needs a ?, which doesn’t look as clean. With [unowned self], no ? is needed when calling properties or methods on self.
  • Efficiency: [weak self] adds a weak reference to self, and weak reference counting requires allocating a new piece of memory, the SideTable. The SideTable stores the weak reference count along with the other reference counts; see Swift’s principle of reference counting for details. Allocating that memory costs more than ordinary operations.
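The syntactic difference between the two capture modes can be sketched as follows. The Counter class and its members are hypothetical names chosen for illustration.

```swift
final class Counter {
    var count = 0
    var onTick: (() -> Void)?

    func installWeak() {
        // self is optional inside the closure, so optional chaining is required.
        onTick = { [weak self] in
            self?.count += 1
        }
    }

    func installUnowned() {
        // self is non-optional; using it after deallocation would trap.
        onTick = { [unowned self] in
            self.count += 1
        }
    }
}

let counter = Counter()
counter.installWeak()
counter.onTick?()
counter.installUnowned()
counter.onTick?()
print(counter.count) // 2

// Neither capture mode keeps the object alive, so no retain cycle forms.
weak var probe: Counter?
do {
    let temporary = Counter()
    temporary.installUnowned()
    probe = temporary
}
print(probe == nil) // true
```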

Analyzing the closure capture context from the SIL file

Let’s take an example from the official website to explore:

func makeIncrementer() -> () -> Int {
    var runningTotal = 12
    func incrementer() -> Int {
        runningTotal += 1
        return runningTotal
    }
    return incrementer
}

Generate SIL files:

// makeIncrementer()
sil hidden @main.makeIncrementer() -> () -> Swift.Int : $@convention(thin) () -> @owned @callee_guaranteed () -> Int {
bb0:
  // Allocate a reference-counted @box on the heap to wrap "runningTotal"
  %0 = alloc_box ${ var Int }, var, name "runningTotal" // users: %8, %7, %6, %1
  %1 = project_box %0 : ${ var Int }, 0           // user: %4
  // Initialize with the literal 12
  %2 = integer_literal $Builtin.Int64, 12         // user: %3
  %3 = struct $Int (%2 : $Builtin.Int64)          // user: %4
  store %3 to %1 : $*Int                          // id: %4
  // function_ref incrementer #1 () in makeIncrementer()
  // Reference the closure @incrementer #1 ()
  %5 = function_ref @incrementer #1 () -> Swift.Int in main.makeIncrementer() -> () -> Swift.Int : $@convention(thin) (@guaranteed { var Int }) -> Int // user: %7
  strong_retain %0 : ${ var Int }                 // id: %6
  // Pass the boxed "runningTotal" to the closure
  %7 = partial_apply [callee_guaranteed] %5(%0) : $@convention(thin) (@guaranteed { var Int }) -> Int // user: %9
  strong_release %0 : ${ var Int }                // id: %8
  // Return the closure
  return %7 : $@callee_guaranteed () -> Int       // id: %9
} // end sil function 'main.makeIncrementer() -> () -> Swift.Int'

// incrementer #1 () in makeIncrementer()
sil private @incrementer #1 () -> Swift.Int in main.makeIncrementer() -> () -> Swift.Int : $@convention(thin) (@guaranteed { var Int }) -> Int {
// %0 "runningTotal"                              // user: %1
bb0(%0 : ${ var Int }):
  // Project the boxed "runningTotal" that was passed in to get its address in %1
  %1 = project_box %0 : ${ var Int }, 0           // users: %16, %4, %2
  debug_value_addr %1 : $*Int, var, name "runningTotal", argno 1 // id: %2
  // Create the literal 1
  %3 = integer_literal $Builtin.Int64, 1          // user: %8
  %4 = begin_access [modify] [dynamic] %1 : $*Int // users: %13, %5, %15
  %5 = struct_element_addr %4 : $*Int, #Int._value // user: %6
  // Load the value of "runningTotal"
  %6 = load %5 : $*Builtin.Int64                  // user: %8
  %7 = integer_literal $Builtin.Int1, -1          // user: %8
  // Call add, adding one to "runningTotal"
  %8 = builtin "sadd_with_overflow_Int64"(%6 : $Builtin.Int64, %3 : $Builtin.Int64, %7 : $Builtin.Int1) : $(Builtin.Int64, Builtin.Int1) // users: %10, %9
  %9 = tuple_extract %8 : $(Builtin.Int64, Builtin.Int1), 0 // user: %12
  %10 = tuple_extract %8 : $(Builtin.Int64, Builtin.Int1), 1 // user: %11
  // Check whether overflow occurred
  cond_fail %10 : $Builtin.Int1, "arithmetic overflow" // id: %11
  // Store the computed value back into the boxed "runningTotal"
  %12 = struct $Int (%9 : $Builtin.Int64)         // user: %13
  store %12 to %4 : $*Int                         // id: %13
  %14 = tuple ()
  end_access %4 : $*Int                           // id: %15
  %16 = begin_access [read] [dynamic] %1 : $*Int  // users: %17, %18
  // Read the value out of the box
  %17 = load %16 : $*Int                          // user: %19
  end_access %16 : $*Int                          // id: %18
  // Return the value
  return %17 : $Int                               // id: %19
} // end sil function 'incrementer #1 () -> Swift.Int in main.makeIncrementer() -> () -> Swift.Int'

As you can see from the code, the variable runningTotal is not placed directly on the stack. Instead, alloc_box places it in a reference-counted box on the heap, so the value type is effectively handled like a reference type.

The closure type is changed from () -> Int to (@guaranteed {var Int}) -> Int, and the runningTotal of the reference type is passed in from the parameters.
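The observable effect of the heap box can be checked with a short usage sketch. makeIncrementer is the function from the example above; the constant names are made up for illustration.

```swift
func makeIncrementer() -> () -> Int {
    var runningTotal = 12
    func incrementer() -> Int {
        runningTotal += 1
        return runningTotal
    }
    return incrementer
}

// Each call to makeIncrementer creates its own captured runningTotal.
let first = makeIncrementer()
let second = makeIncrementer()

let a1 = first()   // state lives on after makeIncrementer returned
let a2 = first()   // the same captured variable is updated again
let b1 = second()  // an independent captured variable
print(a1, a2, b1)  // 13 14 13
```

The state survives after makeIncrementer returns, which would not be possible if runningTotal lived only in that function’s stack frame.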

This gives us an idea of how a closure captures a value. But a closure is essentially an anonymous function, and underneath it is a pointer to the implementation of a block of code, so where are the captured values stored? The SIL file doesn’t show this, so we need to go one level lower to explore how closures are implemented.

Generating the LLVM file

Below SIL there are only LLVM IR and assembly. Both can be used to explore how closures are implemented, but the lower the level, the harder it is to read, so I chose LLVM IR. The syntax of LLVM can be seen in my article.

Let’s start with the simplest code:

struct Test {
    var biBao: (() -> ())
}

We define a structure and put a closure inside it. Here is how that looks in LLVM:

%swift.type = type { i64 }
%swift.refcounted = type { %swift.type*, i64 }
%T4main4TestV = type <{ %swift.function }>
%swift.function = type { i8*, %swift.refcounted* }

We see that the closure type in the Test structure is %swift.function

The %swift.function structure holds i8* and %swift.refcounted*. i8* is a pointer that we can think of as void *, and %swift.refcounted* is a pointer to the type %swift.refcounted.

The %swift.refcounted structure holds %swift.type* and i64, where i64 is a 64-bit integer and %swift.type* is a pointer to the %swift.type type.

%swift.type is a structure holding a single 64-bit integer.

Readers of my Metadata articles should quickly recognize that %swift.refcounted is HeapObject and %swift.type is Metadata, which the source code confirms.

We can search the Swift compiler source for keywords such as swift.type and swift.function to see where these IR types are defined:

FunctionPairTy = createStructType(*this, "swift.function", {
    FunctionPtrTy,
    RefCountedPtrTy,
});

RefCountedStructTy =
    llvm::StructType::create(getLLVMContext(), "swift.refcounted");
RefCountedPtrTy = RefCountedStructTy->getPointerTo(/*addrspace*/ 0);

TypeMetadataStructTy = createStructType(*this, "swift.type", {
    MetadataKindTy          // MetadataKind Kind;
});

RefCountedPtrTy may not look obvious, but by analogy with TypeMetadataStructTy we can infer that RefCountedPtrTy is a pointer to a HeapObject.

To summarize, a closure is represented underneath by the FunctionPairTy type, which can be expressed in Swift code like this:

struct HeapObject {
    var Kind: UInt64
    var refcount: UInt64
}

struct FunctionPairTy {
    // Address of the function implemented by the closure code
    var FunctionPtrTy: UnsafeMutableRawPointer
    // The pointer to the capture context variable held in the heap space, null if not captured
    var RefCountedPtrTy: UnsafeMutablePointer<HeapObject>
}
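A quick way to sanity-check the two-pointer layout is to ask MemoryLayout for the size of a closure type. This is a small sketch and the constant names are made up for illustration.

```swift
// A Swift closure value is a pair: function pointer + context pointer.
let closureSize = MemoryLayout<() -> Void>.size
let pointerSize = MemoryLayout<UnsafeRawPointer>.size

// On a 64-bit platform this prints "16 8".
print(closureSize, pointerSize)
```

The check compares against the pointer size rather than a hard-coded 16 so that it also holds on platforms with a different pointer width.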

Analyzing how a closure captures a value from the LLVM file

Let’s use the same demo:

func makeIncrementer() -> (() -> Int) {
    var runningTotal = 12
    func incrementer() -> Int {
        runningTotal += 1
        return runningTotal
    }
    return incrementer
}

Generate LLVM file:

define hidden swiftcc { i8*, %swift.refcounted* } @"main.makeIncrementer() -> () -> Swift.Int"() #0 {
entry:
  %runningTotal.debug = alloca %TSi*, align 8
  %0 = bitcast %TSi** %runningTotal.debug to i8*
  call void @llvm.memset.p0i8.i64(i8* align 8 %0, i8 0, i64 8, i1 false)
  %1 = call noalias %swift.refcounted* @swift_allocObject(%swift.type* getelementptr inbounds (%swift.full_boxmetadata, %swift.full_boxmetadata* @metadata, i32 0, i32 2), i64 24, i64 7) #1
  %2 = bitcast %swift.refcounted* %1 to <{ %swift.refcounted, [8 x i8] }>*
  %3 = getelementptr inbounds <{ %swift.refcounted, [8 x i8] }>, <{ %swift.refcounted, [8 x i8] }>* %2, i32 0, i32 1
  %4 = bitcast [8 x i8]* %3 to %TSi*
  store %TSi* %4, %TSi** %runningTotal.debug, align 8
  %._value = getelementptr inbounds %TSi, %TSi* %4, i32 0, i32 0
  store i64 12, i64* %._value, align 8
  %5 = call %swift.refcounted* @swift_retain(%swift.refcounted* returned %1) #1
  call void @swift_release(%swift.refcounted* %1) #1
  %6 = insertvalue { i8*, %swift.refcounted* } { i8* bitcast (i64 (%swift.refcounted*)* @"partial apply forwarder for incrementer #1 () -> Swift.Int in main.makeIncrementer() -> () -> Swift.Int" to i8*), %swift.refcounted* undef }, %swift.refcounted* %1, 1
  ret { i8*, %swift.refcounted* } %6
}

If we look for the captured value 12, we can quickly find a statement:

store i64 12, i64* %._value, align 8

The value 12 is stored at %._value. So what is %._value?

%._value = getelementptr inbounds %TSi, %TSi* %4, i32 0, i32 0

getelementptr computes a pointer to an element. %TSi is Swift’s Int, a structure wrapping a single i64, so %._value is a pointer to the first element of the structure that %4 points to.

%4 = bitcast [8 x i8]* %3 to %TSi*

%4 is just %3 cast to a different pointer type. Let’s see where %3 comes from:

%3 = getelementptr inbounds <{ %swift.refcounted, [8 x i8] }>, <{ %swift.refcounted, [8 x i8] }>* %2, i32 0, i32 1

%3 takes the second element of the structure <{ %swift.refcounted, [8 x i8] }> that %2 points to. In other words, %3 is a pointer to the [8 x i8] field of that structure, which is where the value 12 is actually stored. Let’s analyze what’s left:
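The pointer arithmetic described above can be imitated with raw memory in Swift. This is only a model of the IR, assuming a 64-bit platform: the 16-byte header size stands in for %swift.refcounted, and the function name storeInBox is hypothetical.

```swift
// Hypothetical model of the box: a 16-byte header followed by 8 bytes of payload.
func storeInBox() -> Int {
    let headerSize = 16 // stand-in for %swift.refcounted on a 64-bit platform
    let box = UnsafeMutableRawPointer.allocate(
        byteCount: headerSize + MemoryLayout<Int>.size, // 24 bytes, like swift_allocObject above
        alignment: 8
    )
    defer { box.deallocate() }

    // Equivalent of computing %3 / %4 / %._value and storing 12 there.
    box.storeBytes(of: 12, toByteOffset: headerSize, as: Int.self)

    // Read the payload back from the same offset.
    return box.load(fromByteOffset: headerSize, as: Int.self)
}

let stored = storeInBox()
print(stored) // 12
```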

  %0 = bitcast %TSi** %runningTotal.debug to i8*
  call void @llvm.memset.p0i8.i64(i8* align 8 %0, i8 0, i64 8, i1 false)
  %1 = call noalias %swift.refcounted* @swift_allocObject(%swift.type* getelementptr inbounds (%swift.full_boxmetadata, %swift.full_boxmetadata* @metadata, i32 0, i32 2), i64 24, i64 7) #1
  %2 = bitcast %swift.refcounted* %1 to <{ %swift.refcounted, [8 x i8] }>*
  ...
  %5 = call %swift.refcounted* @swift_retain(%swift.refcounted* returned %1) #1
  call void @swift_release(%swift.refcounted* %1) #1
  %6 = insertvalue { i8*, %swift.refcounted* } { i8* bitcast (i64 (%swift.refcounted*)* @"partial apply forwarder for incrementer #1 () -> Swift.Int in main.makeIncrementer() -> () -> Swift.Int" to i8*), %swift.refcounted* undef }, %swift.refcounted* %1, 1
  ret { i8*, %swift.refcounted* } %6

%1 is the object that swift_allocObject allocates on the heap; its pointer type is %swift.refcounted*. %2 casts %1 to <{ %swift.refcounted, [8 x i8] }>*, the structure analyzed above. You can think of the relationship between the two as parent class and subclass.

%5 is a reference counting call (swift_retain), which doesn’t help our analysis, so we ignore it.

Finally, %6 is the value that is returned. Its structure { i8*, %swift.refcounted* } matches the underlying closure structure we analyzed earlier. The i8* field holds the address of the partial apply forwarder for incrementer #1, cast to i8*. The %swift.refcounted* field starts as undef and insertvalue fills it with the address in %1. That is, the pointer to the <{ %swift.refcounted, [8 x i8] }> object holding the value 12 is placed in the second field of the closure.

So let’s refine the Swift code we wrote earlier:

struct FunctionPairTy {
    var FunctionPtrTy: UnsafeMutableRawPointer
    var RefCountedPtrTy: UnsafeMutablePointer<Box>
}

struct HeapObject {
    var Kind: UInt64
    var refcount: UInt64
}

struct Box {
    var refCounted: HeapObject
    var value: Int
}

There is now a Box type that wraps the captured value in a reference-counted structure. The value does not have to be an Int, so you can write Box as a generic type.
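A generic version might look like the sketch below. The type and property names are illustrative, the structs only model the layout and are not the runtime’s real definitions, and the printed numbers assume a 64-bit platform.

```swift
// Model of the object header (16 bytes).
struct HeapObject {
    var kind: UInt64
    var refcount: UInt64
}

// Generic box: header followed by the captured value.
struct Box<T> {
    var refCounted: HeapObject
    var value: T
}

// Model of the closure value: function pointer plus pointer to the box.
struct FunctionPair<T> {
    var functionPtr: UnsafeMutableRawPointer
    var boxPtr: UnsafeMutablePointer<Box<T>>?
}

let boxSize = MemoryLayout<Box<Int>>.size
let valueOffset = MemoryLayout<Box<Int>>.offset(of: \Box<Int>.value)
let pairSize = MemoryLayout<FunctionPair<Int>>.size

// 24 matches the i64 24 passed to swift_allocObject in the IR above,
// and 16 is the offset at which the value 12 was stored.
print(boxSize, valueOffset ?? -1, pairSize) // 24 16 16
```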

I printed the memory addresses, successfully found the value 12 in heap space, and also confirmed that the first address is the pointer to the function implementation.

Multiple capture value analysis

Let’s modify the above demo:

func makeIncrementer() -> () -> Int {
    var runningTotal = 12
    var temp1 = 1
    let temp2 = 2
    var temp3 = "a"
    let temp4 = "b"
    func incrementer() -> Int {
        runningTotal += 1
        temp1 += temp2
        temp3 += temp4
        return runningTotal
    }
    return incrementer
}

Let’s go straight to the LLVM file and look at what is stored in the value field of the Box type from before:

After the %swift.refcounted header, the fields are, in order, %swift.refcounted*, %swift.refcounted*, an Int, %swift.refcounted*, and a String. In terms of the types above, that is [Box*, Box*, Int, Box*, String].

Let’s verify in memory:

This matches the types from the analysis above, but one thing is curious: why are some values wrapped in a Box and others not? Looking closely at the source code shows that a captured value that is modified inside the closure is wrapped in a Box, while one that is not modified is stored directly.
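To make the [Box*, Box*, Int, Box*, String] layout concrete, it can be modelled as a plain Swift struct. This is purely an illustrative model under the assumptions of this article and a 64-bit platform; CaptureContext and its field names are hypothetical and are not a runtime type.

```swift
// Model of the object header (16 bytes).
struct HeapObject {
    var kind: UInt64
    var refcount: UInt64
}

// Hypothetical model of the capture context for the five-variable example.
struct CaptureContext {
    var refCounted: HeapObject
    var runningTotal: UnsafeMutableRawPointer // modified in the closure: Box*
    var temp1: UnsafeMutableRawPointer        // modified in the closure: Box*
    var temp2: Int                            // only read: stored inline
    var temp3: UnsafeMutableRawPointer        // modified in the closure: Box*
    var temp4: String                         // only read: stored inline
}

let fieldOffsets: [Int?] = [
    MemoryLayout<CaptureContext>.offset(of: \CaptureContext.runningTotal),
    MemoryLayout<CaptureContext>.offset(of: \CaptureContext.temp1),
    MemoryLayout<CaptureContext>.offset(of: \CaptureContext.temp2),
    MemoryLayout<CaptureContext>.offset(of: \CaptureContext.temp3),
    MemoryLayout<CaptureContext>.offset(of: \CaptureContext.temp4),
]
let contextSize = MemoryLayout<CaptureContext>.size

// Prints the byte offset of each field in the model, then the total size.
print(fieldOffsets.map { $0 ?? -1 }, contextSize)
```

The offsets give a rough idea of where to look for each captured value when inspecting memory by hand.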

Whether a captured value is wrapped can also be seen in the SIL file: look at the parameters of the function that the closure is implicitly converted to. If a parameter’s type is written with {}, as in { var Int }, that value is wrapped in a Box.

Conclusion

On a 64-bit platform, a closure occupies 16 bytes. The first 8 bytes store a pointer to the implementation of the function code, which generally points into the code segment. The last 8 bytes store a pointer to the captured values, which generally points into the heap.

The captured values are stored in the Value field.

  • If no value is captured, BoxPtr is simply nil and there is no Value. I won’t show the printed BoxPtr here. A function is a closure with no captured values, so you can try this out for yourself.
  • If there is only one captured value, it is stored directly in Value, regardless of whether it is modified inside the closure.
  • If there are multiple captured values, they are placed next to each other in Value. However, if a captured value is modified inside the closure, it is wrapped in another Box, and the address of that Box is placed in the corresponding position in Value. A diagram makes this clearer:

I have always felt that the capture logic must exist somewhere in the source code, but after two full days of digging I could not find it. Maybe my skills are not up to it. I hope an expert can point me to it, or tell me that there really is no such source code.

Finally, the code for inspecting closure memory is on GitHub. I hope it helps some readers.