The type of Array

Create a new project and write the simplest possible Demo:

var num: Array<Int> = [1, 2, 3]

Let’s look at the Array definition:

@frozen public struct Array<Element> {
	...
}

Obviously, Array is, by definition, a struct type, which is a value type.

Since it is a struct, let’s print the address of num and see what is stored there:

var num: Array<Int> = [1, 2, 3]
withUnsafePointer(to: &num) {
    print($0)
}
print("end")

We set a breakpoint at the print call and inspect the memory at the printed address:

There is no trace of the Array values 1, 2, and 3. The memory holds only 0x000000010076f400, which looks like an address on the heap. That raises some questions:

  • What address does Array store?
  • Where did the data of Array go?
  • How is copy-on-write implemented for Array?
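
Before digging in, one quick observation bears on these questions: the Array value itself is only one pointer wide, no matter how many elements it holds. The following is a minimal check, assuming a current Swift toolchain; the variable names are illustrative only.

```swift
// The Array struct occupies a single pointer-sized word; the elements live elsewhere.
let small: [Int] = [1, 2, 3]
let large = [Int](repeating: 0, count: 1_000)

let arraySize = MemoryLayout<[Int]>.size
let pointerSize = MemoryLayout<UnsafeRawPointer>.size

print(arraySize == pointerSize)                   // true
print(MemoryLayout.size(ofValue: small) ==
      MemoryLayout.size(ofValue: large))          // true
```

So whatever withUnsafePointer showed us, it can only be a single word, which is consistent with seeing just one heap-looking address.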

Generating the SIL file for Array

We cut the code down to just the declaration of num (the simpler, the clearer) and generate the SIL file to look at:

sil @main : $@convention(c) (Int32, UnsafeMutablePointer<Optional<UnsafeMutablePointer<Int8>>>) -> Int32 {
bb0(%0 : $Int32, %1 : $UnsafeMutablePointer<Optional<UnsafeMutablePointer<Int8>>>):
  alloc_global @main.num : [Swift.Int]               // id: %2
  %3 = global_addr @main.num : [Swift.Int] : $*Array<Int> // user: %23
  // Array has 3 elements
  %4 = integer_literal $Builtin.Word, 3           // user: %6
  // Array generation method
  // function_ref _allocateUninitializedArray<A>(_:)
  %5 = function_ref @Swift._allocateUninitializedArray<A>(Builtin.Word) -> ([A], Builtin.RawPointer) : $@convention(thin) <τ_0_0> (Builtin.Word) -> (@owned Array<τ_0_0>, Builtin.RawPointer) // user: %6
  %6 = apply %5<Int>(%4) : $@convention(thin) <τ_0_0> (Builtin.Word) -> (@owned Array<τ_0_0>, Builtin.RawPointer) // users: %8, %7
  %7 = tuple_extract %6 : $(Array<Int>, Builtin.RawPointer), 0 // user: %23
  %8 = tuple_extract %6 : $(Array<Int>, Builtin.RawPointer), 1 // user: %9
  %9 = pointer_to_address %8 : $Builtin.RawPointer to [strict] $*Int // users: %12, %19, %14
  // Literal 1
  %10 = integer_literal $Builtin.Int64, 1         // user: %11
  %11 = struct $Int (%10 : $Builtin.Int64)        // user: %12
  // Save 1 to %9
  store %11 to %9 : $*Int                         // id: %12
  %13 = integer_literal $Builtin.Word, 1          // user: %14
  // %9 offset by 1 step
  %14 = index_addr %9 : $*Int, %13 : $Builtin.Word // user: %17
  // Literal 2
  %15 = integer_literal $Builtin.Int64, 2         // user: %16
  %16 = struct $Int (%15 : $Builtin.Int64)        // user: %17
  // Save 2 to %14
  store %16 to %14 : $*Int                        // id: %17
  %18 = integer_literal $Builtin.Word, 2          // user: %19
  // %9 offset by 2 steps
  %19 = index_addr %9 : $*Int, %18 : $Builtin.Word // user: %22
  // Literal 3
  %20 = integer_literal $Builtin.Int64, 3         // user: %21
  %21 = struct $Int (%20 : $Builtin.Int64)        // user: %22
  // Save 3 to %19
  store %21 to %19 : $*Int                        // id: %22
  store %7 to %3 : $*Array<Int>                   // id: %23
  %24 = integer_literal $Builtin.Int32, 0         // user: %25
  %25 = struct $Int32 (%24 : $Builtin.Int32)      // user: %26
  return %25 : $Int32                             // id: %26
} // end sil function 'main'

From the SIL above, num is created by calling _allocateUninitializedArray<A>(_:). The method returns a tuple, %6, and %7 and %8 extract its two members. %7 is stored to %3, which is the location of num, so the value 0x000000010076f400 we saw is the value of %7. %9 is %8 converted to an address, and the three elements are written starting from there. So what exactly are %7 and %8?
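
The same shape can be reproduced with public API. This is only a sketch: Array(unsafeUninitializedCapacity:initializingWith:) is a public standard library initializer used here as a stand-in for the internal _allocateUninitializedArray, and the mapping to the SIL registers in the comments is an analogy, not what the compiler emits for this code.

```swift
// Allocate storage for 3 Ints, then write each element through a base pointer,
// much like the store / index_addr sequence in the SIL.
let num = [Int](unsafeUninitializedCapacity: 3) { buffer, initializedCount in
    let base = buffer.baseAddress!     // plays the role of %9
    base.initialize(to: 1)             // store 1 at the base address
    (base + 1).initialize(to: 2)       // base offset by 1 step
    (base + 2).initialize(to: 3)       // base offset by 2 steps
    initializedCount = 3               // tell the array how many elements are valid
}

print(num)  // [1, 2, 3]
```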

The definition of Array in the source code

The SIL file cannot answer that, so we have to look at the source code.

@frozen
public struct Array<Element>: _DestructorSafeContainer {
  #if _runtime(_ObjC)
  @usableFromInline
  internal typealias _Buffer = _ArrayBuffer<Element>
  #else
  @usableFromInline
  internal typealias _Buffer = _ContiguousArrayBuffer<Element>
  #endif

  @usableFromInline
  internal var _buffer: _Buffer

  /// Initialization from an existing buffer does not have "array.init"
  /// semantics because the caller may retain an alias to buffer.
  @inlinable
  internal init(_buffer: _Buffer) {
    self._buffer = _buffer
  }
}

Array really has only one property, _buffer. Under _runtime(_ObjC) the type of _buffer is _ArrayBuffer; otherwise it is _ContiguousArrayBuffer. Apple platforms are all ObjC compatible, so here it should be _ArrayBuffer.

Next we trace what value _buffer is assigned.

_allocateUninitializedArray

In the source code, we search for _allocateUninitializedArray, the initialization method seen in the SIL file, and find the following definition:

@inlinable // FIXME(inline-always)
@inline(__always)
@_semantics("array.uninitialized_intrinsic")
public // COMPILER_INTRINSIC
func _allocateUninitializedArray<Element>(_ builtinCount: Builtin.Word)
    -> (Array<Element>, Builtin.RawPointer) {
  let count = Int(builtinCount)
  if count > 0 {
    // Doing the actual buffer allocation outside of the array.uninitialized
    // semantics function enables stack propagation of the buffer.
    let bufferObject = Builtin.allocWithTailElems_1(
      _ContiguousArrayStorage<Element>.self, builtinCount, Element.self)

    let (array, ptr) = Array<Element>._adoptStorage(bufferObject, count: count)
    return (array, ptr._rawValue)
  }
  // For an empty array no buffer allocation is needed.
  let (array, ptr) = Array<Element>._allocateUninitialized(count)
  return (array, ptr._rawValue)
}

There are two paths depending on whether count is greater than zero, but the return type is the same. We only want to figure out what the data structure looks like, so we will follow just one of them. count in my example is 3, so we look at the branch inside the if statement.

Builtin.allocWithTailElems_1 is called first. Because it is a Builtin, it is not easy to see how it is implemented, but we can debug with a breakpoint.

While debugging, we step into the swift_allocObject method:

HeapObject *swift::swift_allocObject(HeapMetadata const *metadata,
                                     size_t requiredSize,
                                     size_t requiredAlignmentMask) {
  CALL_IMPL(swift_allocObject, (metadata, requiredSize, requiredAlignmentMask));
}

At the breakpoint, requiredSize is 56, and running po on the metadata pointer prints _TtGCs23_ContiguousArrayStorageSi_$. So allocWithTailElems_1 allocates a block of heap space, and the type of the allocated object is _ContiguousArrayStorage.

After space is allocated, the _adoptStorage method is called:

  /// Returns an Array of `count` uninitialized elements using the
  /// given `storage`, and a pointer to uninitialized memory for the
  /// first element.
  ///
  /// - Precondition: `storage is _ContiguousArrayStorage`.
  @inlinable
  @_semantics("array.uninitialized")
  internal static func _adoptStorage(
    _ storage: __owned _ContiguousArrayStorage<Element>, count: Int
  ) -> (Array, UnsafeMutablePointer<Element>) {

    let innerBuffer = _ContiguousArrayBuffer<Element>(
      count: count,
      storage: storage)

    return (
      Array(
        _buffer: _Buffer(_buffer: innerBuffer, shiftedToStartIndex: 0)),
        innerBuffer.firstElementAddress)
  }

The _adoptStorage method returns a tuple whose two members are both derived from innerBuffer. So what is innerBuffer?

The innerBuffer is generated using the _ContiguousArrayBuffer initialization method, so let’s look at the definition of _ContiguousArrayBuffer:

internal struct _ContiguousArrayBuffer<Element>: _ArrayBufferProtocol {
    @inlinable
    internal init(count: Int, storage: _ContiguousArrayStorage<Element>) {
        _storage = storage

        _initStorageHeader(count: count, capacity: count)
    }

    @inlinable
    internal func _initStorageHeader(count: Int, capacity: Int) {
        #if _runtime(_ObjC)
        let verbatim = _isBridgedVerbatimToObjectiveC(Element.self)
        #else
        let verbatim = false
        #endif
        
        // We can initialize by assignment because _ArrayBody is a trivial type,
        // i.e. contains no references.
        _storage.countAndCapacity = _ArrayBody(
            count: count,
            capacity: capacity,
            elementTypeIsBridgedVerbatim: verbatim)
    }
    
    @usableFromInline
    internal var _storage: __ContiguousArrayStorageBase
}


_ContiguousArrayBuffer has only one property, _storage. The storage passed to the initializer init(count: Int, storage: _ContiguousArrayStorage<Element>) is a _ContiguousArrayStorage, and __ContiguousArrayStorageBase is the parent class of _ContiguousArrayStorage.

_ContiguousArrayStorage

_ContiguousArrayStorage is a class. Walking through its entire inheritance chain, we find that only __ContiguousArrayStorageBase declares a property:

final var countAndCapacity: _ArrayBody

So what is _ArrayBody?

@frozen
@usableFromInline
internal struct _ArrayBody {
  @usableFromInline
  internal var _storage: _SwiftArrayBodyStorage
  
  ...
}

_ArrayBody is a struct with only one property, _storage.

What is _SwiftArrayBodyStorage?

struct _SwiftArrayBodyStorage {
  __swift_intptr_t count;
  __swift_uintptr_t _capacityAndFlags;
};

Both count and _capacityAndFlags are pointer-sized integers, which is 8 bytes on a 64-bit platform.

Let’s put the memory layout of the class _ContiguousArrayStorage together. _ContiguousArrayStorage is a class, so its instance starts with a metadata pointer. Beyond that, _ContiguousArrayStorage has only one property:

final var countAndCapacity: _ArrayBody

And _ArrayBody has only one property:

@usableFromInline
internal var _storage: _SwiftArrayBodyStorage

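Laid out in order, the object consists of the heap object header (the metadata pointer and the reference counts), followed by count, followed by _capacityAndFlags.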
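
The layout can also be written down as a struct. The following is a rough sketch for a 64-bit platform. StorageHeaderSketch and its field names are hypothetical stand-ins, and the single 8-byte reference-count word is an assumption about the runtime's object header rather than something shown in the code above. Under those assumptions, the header plus three Int elements comes to the 56 bytes that swift_allocObject was asked for.

```swift
// Hypothetical mirror of the start of a _ContiguousArrayStorage instance.
struct StorageHeaderSketch {
    var metadata: UnsafeRawPointer?  // pointer to the class metadata
    var refCounts: UInt64            // assumed: one 8-byte word of reference counts
    var count: Int                   // _SwiftArrayBodyStorage.count
    var capacityAndFlags: UInt       // _SwiftArrayBodyStorage._capacityAndFlags
}

let headerSize = MemoryLayout<StorageHeaderSketch>.size
let elementCount = 3
let requiredSize = headerSize + elementCount * MemoryLayout<Int>.stride

print(headerSize)    // 32 on a 64-bit platform
print(requiredSize)  // 56 on a 64-bit platform
```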

_initStorageHeader

Now that we know the structure of _ContiguousArrayStorage, we go back to the initializer of _ContiguousArrayBuffer. After assigning _storage, it calls:

_initStorageHeader(count: count, capacity: count)

Let’s look at what _initStorageHeader does:

_storage.countAndCapacity = _ArrayBody(
            count: count,
            capacity: capacity,
            elementTypeIsBridgedVerbatim: verbatim)

The _initStorageHeader method simply assigns the countAndCapacity property of _storage.

Now let’s see how _ArrayBody is initialized:

  @inlinable
  internal init(
    count: Int, capacity: Int, elementTypeIsBridgedVerbatim: Bool = false
  ) {
    _internalInvariant(count >= 0)
    _internalInvariant(capacity >= 0)

    _storage = _SwiftArrayBodyStorage(
      count: count,
      _capacityAndFlags:
        (UInt(truncatingIfNeeded: capacity) &<< 1) |
        (elementTypeIsBridgedVerbatim ? 1 : 0))
  }

capacity is not stored in memory as is. It is shifted left by 1 bit, and the freed lowest bit records the elementTypeIsBridgedVerbatim flag. Therefore, to read capacity back from memory we have to shift right again, which is exactly what the _ArrayBody source code does:

  /// The number of elements that can be stored in this Array without
  /// reallocation.
  @inlinable
  internal var capacity: Int {
    return Int(_capacityAndFlags &>> 1)
  }
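
The packing is easy to try out in isolation. Below is a small sketch that copies the two expressions from the source into a hypothetical ArrayBodySketch type; the type name and the elementTypeIsBridgedVerbatim accessor are illustrative additions, not part of the standard library.

```swift
// Hypothetical stand-in for _ArrayBody: capacity in the upper bits, flag in bit 0.
struct ArrayBodySketch {
    var count: Int
    var capacityAndFlags: UInt

    init(count: Int, capacity: Int, elementTypeIsBridgedVerbatim: Bool = false) {
        precondition(count >= 0 && capacity >= 0)
        self.count = count
        capacityAndFlags =
            (UInt(truncatingIfNeeded: capacity) &<< 1) |
            (elementTypeIsBridgedVerbatim ? 1 : 0)
    }

    // Shift right to drop the flag bit and recover the capacity.
    var capacity: Int { Int(capacityAndFlags &>> 1) }

    // The lowest bit is the flag.
    var elementTypeIsBridgedVerbatim: Bool { capacityAndFlags & 1 != 0 }
}

let body = ArrayBodySketch(count: 3, capacity: 3, elementTypeIsBridgedVerbatim: true)
print(body.capacityAndFlags)              // 7  (3 << 1 is 6, plus the flag bit)
print(body.capacity)                      // 3
print(body.elementTypeIsBridgedVerbatim)  // true
```

This is why a raw memory dump of an array with capacity 3 shows 6 or 7 in the second word rather than 3.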

_ArrayBuffer

Initialization of the innerBuffer is done, so back to generating the return value:

return (
    Array(
      _buffer: _Buffer(_buffer: innerBuffer, shiftedToStartIndex: 0)),
      innerBuffer.firstElementAddress)

Array(_buffer:) is the plain initializer shown earlier, which only stores its argument in _buffer. The interesting part is the _Buffer initializer, and as established above, _Buffer is _ArrayBuffer here.

I’ll paste the related initialization methods of _ArrayBuffer together for a better look:

@usableFromInline
@frozen
internal struct _ArrayBuffer<Element>: _ArrayBufferProtocol {
  ...
  @usableFromInline
  internal var _storage: _ArrayBridgeStorage
}

extension _ArrayBuffer {
  /// Adopt the storage of `source`.
  @inlinable
  internal init(_buffer source: NativeBuffer, shiftedToStartIndex: Int) {
    _internalInvariant(shiftedToStartIndex == 0, "shiftedToStartIndex must be 0")
    _storage = _ArrayBridgeStorage(native: source._storage)
  }
  ...
}

@usableFromInline
internal typealias _ArrayBridgeStorage
  = _BridgeStorage<__ContiguousArrayStorageBase>

@frozen
@usableFromInline
internal struct _BridgeStorage<NativeClass: AnyObject> {
  @inlinable
  @inline(__always)
  internal init(native: Native) {
    _internalInvariant(_usesNativeSwiftReferenceCounting(NativeClass.self))
    rawValue = Builtin.reinterpretCast(native)
  }
}

In the end there is nothing fancy: each layer is a struct (a value type) with a single property, and each initializer is just an assignment.

The 0 passed as shiftedToStartIndex does not matter for our purposes; it is only checked in an assertion.

To sum up, %7 in the SIL file is an Array wrapping an _ArrayBuffer struct, whose single property stores a reference to the _ContiguousArrayStorage class instance.
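
To see why all this wrapping costs nothing in memory, here is a sketch with hypothetical stand-in types (none of them are the real standard library types): structs that each wrap a single property collapse to the size of the innermost class reference.

```swift
// Each stand-in wraps exactly one property, like the real chain does.
final class ClassStorageSketch {}                                 // stands in for _ContiguousArrayStorage
struct BridgeStorageSketch { var rawValue: ClassStorageSketch }   // stands in for _BridgeStorage
struct BufferSketch { var storage: BridgeStorageSketch }          // stands in for _ArrayBuffer
struct ArraySketch { var buffer: BufferSketch }                   // stands in for Array

let wrappedSize = MemoryLayout<ArraySketch>.size
let referenceSize = MemoryLayout<UnsafeRawPointer>.size

print(wrappedSize == referenceSize)  // true: three layers of struct, one pointer of memory
```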

firstElementAddress

Now for %8 in the SIL file, namely innerBuffer.firstElementAddress:

  /// A pointer to the first element.
  @inlinable
  internal var firstElementAddress: UnsafeMutablePointer<Element> {
    return UnsafeMutablePointer(Builtin.projectTailElems(_storage,
                                                         Element.self))
  }

This is a Builtin again, so we look at how projectTailElems is declared among the Builtin definitions:

/// projectTailElems : <C,E> (C) -> Builtin.RawPointer
///
/// Projects the first tail-allocated element of type E from a class C.
BUILTIN_SIL_OPERATION(ProjectTailElems, "projectTailElems", Special)

So this operation returns the address of the first tail-allocated element in the space allocated for _storage. Presumably, then, the array elements are stored right after the contents of the _ContiguousArrayStorage instance.
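
Tail allocation can be tried with public API as well. The sketch below uses ManagedBuffer, a public standard library class that allocates a header and its elements in one heap object. It is an analogy for _ContiguousArrayStorage, not the type Array actually uses, and CountHeaderSketch is a hypothetical header type.

```swift
// Hypothetical header, playing the role of count and capacity.
struct CountHeaderSketch {
    var count: Int
    var capacity: Int
}

// One heap allocation: object header, then CountHeaderSketch, then room for 3 Ints.
let storage = ManagedBuffer<CountHeaderSketch, Int>.create(minimumCapacity: 3) { _ in
    CountHeaderSketch(count: 0, capacity: 3)
}

// Write the elements into the tail of that same allocation.
storage.withUnsafeMutablePointers { header, elements in
    for i in 0..<3 {
        (elements + i).initialize(to: i + 1)
    }
    header.pointee.count = 3
}

// Distance in bytes from the header to the first tail element.
let gap = storage.withUnsafeMutablePointers { header, elements in
    UnsafeRawPointer(header).distance(to: UnsafeRawPointer(elements))
}

// Read the elements back out of the tail.
let contents = storage.withUnsafeMutablePointers { header, elements in
    Array(UnsafeMutableBufferPointer(start: elements, count: header.pointee.count))
}

print(gap >= MemoryLayout<CountHeaderSketch>.size)  // true: elements come after the header
print(contents)                                     // [1, 2, 3]
```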

Verifying the underlying structure of Array

Same code as above:

var num: Array<Int> = [1, 2, 3]
withUnsafePointer(to: &num) {
    print($0)
}
print("end")

Break at print and dump the memory of num. The value 0x0000000100604b30 should be the address of the _ContiguousArrayStorage, so we go on to dump the memory at 0x0000000100604b30:

A perfect match. Nice.

Copy-on-write in Array

Let’s start with what copy-on-write means: a value is actually copied only when it is about to be modified. As long as nothing changes, all copies share a single block of memory. In the Swift standard library, collection types such as Array, Dictionary, and Set are implemented with copy-on-write.

Let’s look at how the source code does this. Run the following code against a source build of the standard library:

var num: Array<Int> = [1, 2, 3]
var copyNum = num
num.append(4)

Then put a breakpoint on the source’s append method:

@inlinable
  @_semantics("array.append_element")
  public mutating func append(_ newElement: __owned Element) {
    // Separating uniqueness check and capacity check allows hoisting the
    // uniqueness check out of a loop.
    _makeUniqueAndReserveCapacityIfNotUnique()
    let oldCount = _getCount()
    _reserveCapacityAssumingUniqueBuffer(oldCount: oldCount)
    _appendElementAssumeUniqueAndCapacity(oldCount, newElement: newElement)
  }


Three methods are called here. We look at the first one, _makeUniqueAndReserveCapacityIfNotUnique. The name says it all: if the array is not unique, make it unique and reserve capacity. So what does "unique" refer to?

Let’s follow it with breakpoint debugging. The call chain is fairly deep, so I will copy only the most critical line:

return !getUseSlowRC() && !getIsDeiniting() && getStrongExtraRefCount() == 0;

These are all reference-count checks. The most important one is whether getStrongExtraRefCount, the number of strong references beyond the first, is 0. If it is not 0, the reference is not unique, so "unique" here means that there is exactly one reference to this heap space.

What happens if it is not unique?
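
The same kind of check is available to ordinary code through the public function isKnownUniquelyReferenced. A minimal sketch follows; SharedStorageSketch is a hypothetical empty class used only to have something to reference.

```swift
final class SharedStorageSketch {}

var first = SharedStorageSketch()

// Only `first` refers to the object, so the reference is unique.
let uniqueWithOneOwner = isKnownUniquelyReferenced(&first)

// A second strong reference to the same object.
let second = first

// Keep `second` alive across the check so that two references really exist.
let uniqueWithTwoOwners = withExtendedLifetime(second) {
    isKnownUniquelyReferenced(&first)
}

print(uniqueWithOneOwner)   // true
print(uniqueWithTwoOwners)  // false
```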

  @inlinable
  @_semantics("array.make_mutable")
  internal mutating func _makeUniqueAndReserveCapacityIfNotUnique() {
    if _slowPath(!_buffer.isMutableAndUniquelyReferenced()) {
      _createNewBuffer(bufferIsUnique: false,
                       minimumCapacity: count + 1,
                       growForAppend: true)
    }
  }

  @_alwaysEmitIntoClient
  @inline(never)
  internal mutating func _createNewBuffer(
    bufferIsUnique: Bool, minimumCapacity: Int, growForAppend: Bool
  ) {
    let newCapacity = _growArrayCapacity(oldCapacity: _getCapacity(),
                                         minimumCapacity: minimumCapacity,
                                         growForAppend: growForAppend)
    let count = _getCount()
    _internalInvariant(newCapacity >= count)

    let newBuffer = _ContiguousArrayBuffer<Element>(
      _uninitializedCount: count, minimumCapacity: newCapacity)

    if bufferIsUnique {
      _internalInvariant(_buffer.isUniquelyReferenced())

      // As an optimization, if the original buffer is unique, we can just move
      // the elements instead of copying.
      let dest = newBuffer.firstElementAddress
      dest.moveInitialize(from: _buffer.firstElementAddress,
                          count: count)
      _buffer.count = 0
    } else {
      _buffer._copyContents(
        subRange: 0..<count,
        initializing: newBuffer.firstElementAddress)
    }
    _buffer = _Buffer(_buffer: newBuffer, shiftedToStartIndex: 0)
  }

We see that the _createNewBuffer method is called and a new buffer is generated in the _createNewBuffer method:

let newBuffer = _ContiguousArrayBuffer<Element>(
      _uninitializedCount: count, minimumCapacity: newCapacity)

This is similar to the initialization we walked through earlier, so I will not expand on it. In short, it allocates a new space for the array that is being modified.
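
The make-unique-then-mutate pattern can be imitated in a few lines. The following is a deliberately simplified sketch with hypothetical types (IntStorageSketch and CowIntListSketch). It reuses an Array inside the storage class for brevity, so it illustrates the control flow only, not the real buffer management.

```swift
// Reference-type storage, playing the role of _ContiguousArrayStorage.
final class IntStorageSketch {
    var elements: [Int]
    init(_ elements: [Int]) { self.elements = elements }
}

// Value-type wrapper, playing the role of Array.
struct CowIntListSketch {
    private var storage = IntStorageSketch([])

    init() {}

    var elements: [Int] { storage.elements }

    mutating func append(_ value: Int) {
        // Same idea as _makeUniqueAndReserveCapacityIfNotUnique:
        // if another value also references the storage, copy it first.
        if !isKnownUniquelyReferenced(&storage) {
            storage = IntStorageSketch(storage.elements)
        }
        storage.elements.append(value)
    }
}

var original = CowIntListSketch()
original.append(1)
let snapshot = original   // shares the storage, nothing is copied yet
original.append(2)        // storage is shared, so it is copied here

print(original.elements)  // [1, 2]
print(snapshot.elements)  // [1]
```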

So, the essence of the copy-on-write technique is to check the strong reference count of _ContiguousArrayStorage:

  • Create a new array num. The extra strong reference count of its _ContiguousArrayStorage is 0.
  • Add an element to num at this point. The extra strong reference count of _ContiguousArrayStorage is found to be 0, so the reference is unique and the element is added in the existing space.
  • When copyNum is assigned from num, only the reference to num’s _ContiguousArrayStorage is copied to copyNum. copyNum and num now point to the same _ContiguousArrayStorage, whose extra strong reference count becomes 1. No new space is allocated here, which saves a lot of work.
  • Add an element to num again. The extra strong reference count of _ContiguousArrayStorage is found to be 1, which means the reference is not unique. A new space is allocated, a new _ContiguousArrayStorage is created, and the contents of the original array are copied into it.
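
These steps can be observed from ordinary code by comparing where the elements live before and after the mutation. This is a small sketch; storageAddress is a hypothetical helper, and the pointer it returns is used only for comparison, never dereferenced.

```swift
// Hypothetical helper: the address of the array's element storage.
func storageAddress(_ array: [Int]) -> UnsafeRawPointer? {
    array.withUnsafeBufferPointer { UnsafeRawPointer($0.baseAddress) }
}

var num: [Int] = [1, 2, 3]
let copyNum = num

// After the assignment both arrays still share one storage.
let sharedBeforeAppend = storageAddress(num) == storageAddress(copyNum)

// The storage is not uniquely referenced, so append copies it first.
num.append(4)
let sharedAfterAppend = storageAddress(num) == storageAddress(copyNum)

print(sharedBeforeAppend)  // true
print(sharedAfterAppend)   // false
print(copyNum)             // [1, 2, 3]
print(num)                 // [1, 2, 3, 4]
```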

You can verify this yourself by inspecting memory in Xcode.

Conclusion

Swift’s Array is a struct, but internally it is still backed by a reference type, and the contents of the array are stored on the heap.

Swift’s Array implements copy-on-write by using the reference count of its heap storage to decide whether the reference is unique. When the array is modified and the storage turns out not to be uniquely referenced, that is when the real copy happens.

Finally, I also re-implemented the Swift array structure in Swift code; it can be downloaded from GitHub.