
simd

SIMD stands for Single Instruction, Multiple Data: one instruction operates on multiple operands at the same time.

Take an addition instruction as an example. On a SISD CPU, after the instruction is decoded, the execution unit accesses memory to fetch the first operand, accesses memory again to fetch the second, and only then performs the addition. On a SIMD CPU, after the instruction is decoded, several execution units access memory at the same time and fetch all the operands in one go. This makes SIMD especially well suited to data-intensive work such as multimedia applications.

simd in iOS

Starting with iOS 11, simd types were introduced into the system frameworks to speed up graphics and image processing.

SceneKit keeps the old property names and types and, in iOS 11, adds new simd counterparts: the property names start with simd and the types start with simd_. For example, SCNNode has:

// The old transformation matrix
open var transform: SCNMatrix4

// New simd-type matrix
@available(iOS 11.0, *)
open var simdTransform: simd_float4x4


In SceneKit the two properties stay in sync: changing one also changes the other. They do the same job and differ only in type.
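As a rough illustration of the two properties staying in sync, the sketch below writes a translation through simdTransform and reads it back from transform, then goes the other way. It assumes an Apple platform where SceneKit is available; the translation values are arbitrary.

```swift
import SceneKit
import simd

let node = SCNNode()

// Write through the simd property: a translation of (1, 2, 3).
var matrix = matrix_identity_float4x4
matrix.columns.3 = simd_float4(1, 2, 3, 1)
node.simdTransform = matrix

// The legacy SCNMatrix4 property reports the same translation in m41, m42, m43.
let legacy = node.transform
print(legacy.m41, legacy.m42, legacy.m43)

// Writing through the legacy property updates the simd property as well.
node.transform = SCNMatrix4MakeTranslation(4, 5, 6)
let translation = node.simdTransform.columns.3
print(translation)
```

Either property can be used in existing code; the simd one avoids converting when the value is passed on to ARKit or to other simd functions.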

ARKit debuted in iOS 11 and uses the new types directly, so its property names carry no prefix. For example, ARAnchor declares:

/** The transformation matrix that defines the anchor's rotation, translation and scale in world coordinates. */
open var transform: matrix_float4x4 { get }

Here matrix_float4x4 is simply an alias for simd_float4x4:

public typealias matrix_float4x4 = simd_float4x4

/*! @abstract A matrix with 4 rows and 4 columns. */
public struct simd_float4x4 {

    public var columns: (simd_float4, simd_float4, simd_float4, simd_float4)

    public init()

    public init(columns: (simd_float4, simd_float4, simd_float4, simd_float4))
}

The matrix holds four simd_float4 column vectors. simd_float4 is itself only an alias; the real type is float4, which is also a struct:

/*! @abstract A vector of four 32-bit floating-point numbers.
 *  @description In C++ and Metal, this type is also available as
 *  simd::float4. The alignment of this type is greater than the alignment
 *  of float; if you need to operate on data buffers that may not be
 *  suitably aligned, you should access them using simd_packed_float4
 *  instead. */
public typealias simd_float4 = float4


/// A vector of four `Float`. This corresponds to the C and
/// Obj-C type `vector_float4` and the C++ type `simd::float4`.
public struct float4 {

    public var x: Float

    public var y: Float

    public var z: Float

    public var w: Float

    /// Initialize to the zero vector.
    public init()

    /// Initialize a vector with the specified elements.
    public init(_ x: Float, _ y: Float, _ z: Float, _ w: Float)

    /// Initialize a vector with the specified elements.
    public init(x: Float, y: Float, z: Float, w: Float)

    /// Initialize to a vector with all elements equal to `scalar`.
    public init(_ scalar: Float)

    /// Initialize to a vector with elements taken from `array`.
    ///
    /// - Precondition: `array` must have exactly four elements.
    public init(_ array: [Float])

    /// Access individual elements of the vector via subscript.
    public subscript(index: Int) -> Float
}
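To make the two declarations above more concrete, here is a minimal sketch that builds vectors with several of the listed initializers, reads elements by name and by subscript, and uses the columns tuple of a matrix. It assumes a Swift 5 or later toolchain, where the splat initializer is spelled init(repeating:) rather than the init(_ scalar:) shown in the listing; all values are made up for illustration.

```swift
import simd

// Different initializers that produce the same vector.
let a = simd_float4(1, 2, 3, 4)
let b = simd_float4(x: 1, y: 2, z: 3, w: 4)
let c = simd_float4([1, 2, 3, 4])          // from an array of exactly four elements
let twos = simd_float4(repeating: 2)       // every element equal to the scalar

// Elements are reachable by name or by subscript.
let third = a.z
let alsoThird = a[2]

// A simd_float4x4 is four column vectors; the last column holds the translation.
var transform = matrix_identity_float4x4
transform.columns.3 = simd_float4(10, 20, 30, 1)

// Multiplying the matrix by a point (w = 1) applies the translation.
let point = simd_float4(1, 2, 3, 1)
let moved = transform * point
print(moved)
```

Because the matrix is stored column by column, reading columns.3 is the usual way to pull a position out of a transform such as ARAnchor's.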

simd_floatN & floatN & simd_packed_floatN

Taking float4 as an example, three type aliases are associated with it:

public typealias simd_float4 = float4

public typealias vector_float4 = simd_float4

public typealias simd_packed_float4 = float4

There is also the deprecated packed_float4:

public typealias packed_float4 = simd_packed_float4

The Xcode documentation explains this:

These types are based on a Clang feature called “extension vector types” or “OpenCL vector types” (available in C, Objective-C, and C++). They also have some extra features that make them easier to use than traditional SIMD types:

  • The basic operators are overloaded, so vector-vector and vector-scalar operations can be written directly.
  • Elements can be accessed with array-style subscripts, or with the “.” operator and the element names “x”, “y”, “z”, “w”.
  • There are also named subvectors: .lo and .hi are the first and second halves of the vector, and .even and .odd are the even-indexed and odd-indexed elements, respectively.
  • Clang provides some useful built-in operations for these vector types: __builtin_shufflevector and __builtin_convertvector.
  • A large number of vector and matrix operations for these types are defined in the simd header files.
  • You can also use simd types with the intrinsics from the architecture-specific headers.
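The list above describes the C-level features. A rough Swift counterpart is sketched below, assuming a Swift 5 or later toolchain: there the subvectors are spelled lowHalf, highHalf, evenHalf and oddHalf instead of .lo, .hi, .even and .odd. The values are arbitrary.

```swift
import simd

let v = simd_float4(1, 2, 3, 4)
let w = simd_float4(10, 20, 30, 40)

// Overloaded operators: vector-vector and scalar-vector.
let sum = v + w
let doubled = 2 * v

// Element access by subscript and by name.
let first = v[0]
let last = v.w

// Named subvectors (C spelling: v.lo, v.hi, v.even, v.odd).
let lo = v.lowHalf     // elements 0 and 1
let hi = v.highHalf    // elements 2 and 3
let even = v.evenHalf  // elements 0 and 2
let odd = v.oddHalf    // elements 1 and 3

print(sum, doubled, first, last, lo, hi, even, odd)
```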

The type definition of SIMD vector is as follows:

 simd_charN   where N is 1, 2, 3, 4, 8, 16, 32, or 64.
 simd_ucharN  where N is 1, 2, 3, 4, 8, 16, 32, or 64.
 simd_shortN  where N is 1, 2, 3, 4, 8, 16, or 32.
 simd_ushortN where N is 1, 2, 3, 4, 8, 16, or 32.
 simd_intN    where N is 1, 2, 3, 4, 8, or 16.
 simd_uintN   where N is 1, 2, 3, 4, 8, or 16.
 simd_floatN  where N is 1, 2, 3, 4, 8, or 16.
 simd_longN   where N is 1, 2, 3, 4, or 8.
 simd_ulongN  where N is 1, 2, 3, 4, or 8.
 simd_doubleN where N is 1, 2, 3, 4, or 8.

These types have stricter alignment than the underlying scalar types: each is aligned to the smaller of the vector's size [1] and the maximum vector alignment that Clang uses for the target platform [2].

[1] Note that a 3-element vector has the same size as a 4-element vector, because the 3-element vector has a hidden padding lane.

[2] In general, the architectural vector width is 16, 32, or 64 bytes. If you need precise control over alignment, be careful, because this value varies with your compilation settings.
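The size and alignment rules can be checked from Swift with MemoryLayout. This is only a small sketch: the constant names are made up here, and the exact numbers printed depend on the platform and toolchain.

```swift
import simd

// Alignment of the scalar versus the 4-element vector.
let scalarAlignment = MemoryLayout<Float>.alignment
let vectorAlignment = MemoryLayout<simd_float4>.alignment

// A 3-element vector carries a hidden padding lane, so it is as large as a 4-element one.
let sizeOfFloat3 = MemoryLayout<simd_float3>.size
let sizeOfFloat4 = MemoryLayout<simd_float4>.size

print("alignment: Float \(scalarAlignment), simd_float4 \(vectorAlignment)")
print("size: simd_float3 \(sizeOfFloat3), simd_float4 \(sizeOfFloat4)")
```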

For every simd_typeN type where N is not 1 or 3, there is a corresponding simd_packed_typeN type that only requires the alignment of the underlying scalar type. Use simd_packed_typeN when you need to work with pointers to scalars or arrays of scalars:

// pointerToFourFloats points to an array of four floats
void myFunction(float *pointerToFourFloats) {

    // This would be a bug: 'pointerToFourFloats' does not meet the alignment
    // requirements of 'simd_float4', so loading through the cast pointer is
    // likely to crash at run time. (Commented out so the function compiles.)
    // simd_float4 *vecptr = (simd_float4 *)pointerToFourFloats;

    // Convert to 'simd_packed_float4 *' instead:
    simd_packed_float4 *vecptr = (simd_packed_float4 *)pointerToFourFloats;

    // 'simd_packed_float4' has the same alignment as 'float', so the conversion
    // is safe and lets us load a vector.
    // A 'simd_packed_float4' can be assigned to a 'simd_float4' without any
    // conversion; the two types only differ as pointers or arrays.
    simd_float4 vector = vecptr[0];
}

In C++, every type that starts with simd_ is also available in the simd:: namespace; for example, simd_char4 can be written as simd::char4. Most of these types match the vector types of the Metal Shading Language, except those with more than 4 elements, because Metal has no vectors larger than 4 elements.

Conversion of types

Conversion in Swift

In Swift there is no pointer-conversion problem, and simd_floatN, vector_floatN and simd_packed_floatN are all essentially the floatN type, so conversion is simple: just call init() to build a new value:

let floats: [Float] = [1, 2, 3, 4]
// Convert directly to simd_float4. This calls float4.init(), which accepts several parameter types, including [Float]
let testVect4: simd_float4 = simd_float4(floats)
// Convert to simd_packed_float4 and assign to a simd_float4. This also calls float4.init()
let result2Vect4: simd_float4 = simd_packed_float4(floats)

// Convert simd_packed_float4 to simd_float4. The init() method works here too
let packedVect4: simd_packed_float4 = simd_packed_float4(floats)
let resultVect4: simd_float4 = packedVect4 as simd_float4

// Direct casts like the following are not allowed and do not compile
// let result1111Vect4: simd_float4 = (simd_float4)floats
// let result2222Vect4: simd_float4 = floats as! simd_float4

The reverse works the same way, by calling an init() method:

let simdVect4 = simd_float4(1, 2, 3, 4)
// Call Array's init() method
let array: [Float] = Array(simdVect4)

In our tests, we found that [Float] in Swift and simd_float4 (that is, float4) have completely different memory layouts.
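One way to see this difference, offered as a sketch and not as the test the article ran: a Swift Array value is only a small handle to storage kept elsewhere, while a simd_float4 keeps its four elements inline. The constant names below are made up for illustration.

```swift
import simd

let floats: [Float] = [1, 2, 3, 4]
let vector = simd_float4(floats)

// The Array value itself is just a reference to its element storage.
let arrayHandleSize = MemoryLayout<[Float]>.size

// The simd vector stores all four Float elements directly in the value.
let vectorSize = MemoryLayout<simd_float4>.size
let inlineByteCount = withUnsafeBytes(of: vector) { $0.count }

print("Array handle: \(arrayHandleSize) bytes, simd_float4: \(vectorSize) bytes")
```

This is why converting between the two in Swift goes through an initializer that copies the elements, and not through a cast.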

Conversion in OC

Objective-C has pointers, so you need to pay attention to pointer types.

In Objective-C and C, casting a pointer type leaves the underlying data unchanged; assigning the dereferenced struct then copies it, and the conversion is complete:

float floats[4] = {1, 2, 3, 4};
// Cast the pointer
simd_packed_float4 *packedVect4 = (simd_packed_float4 *)floats;
// *packedVect4 reads the data that packedVect4 points to, which is then assigned to resultVect4.
// In C, scalars and structs are assigned by value, that is, copied, and the copy completes the type conversion
simd_float4 resultVect4 = *packedVect4;

// simd_packed_float4 and simd_float4 are both essentially float4, so in theory it does not matter which one is used.
// However, Apple's sample code goes through simd_packed_float4 and specifically notes that the two types
// differ only when used as pointers or arrays, so using simd_packed_float4 is the safer choice in Objective-C.
simd_float4 result2Vect4 = *(simd_packed_float4 *)floats;
simd_float4 result3Vect4 = *(simd_float4 *)floats;

The reverse conversion is less convenient, because a C array can only be initialized with {}:

simd_float4 simdVect4 = simd_make_float4(1, 2, 3, 4);
// The values have to be read out one at a time to initialize the C array
float arr[4] = {simdVect4.x, simdVect4.y, simdVect4.z, simdVect4.w};
float arr2[4] = {simdVect4[0], simdVect4[1], simdVect4[2], simdVect4[3]};

In our tests, we found that, alignment aside, a float[4] array in Objective-C has the same memory layout as simd_float4.

Advantages in use

The simd types were also covered at WWDC 2018. The main points were:

  • It covers 2-, 3-, and 4-dimensional matrices, and vectors with 2, 3, 4, 8, 16, 32, or 64 elements.
  • Vector-vector and vector-scalar arithmetic uses the ordinary operators (+, -, *, /).
  • Common vector and geometry operations are provided (dot, length, clamp).
  • Transcendental functions are supported (e.g. sin, cos).
  • Quaternions are supported.
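A few of the operations from this list can be tried directly. The sketch below assumes a Swift 5 or later toolchain and uses simd_length and simd_dot from the simd module plus the standard library's clamped(lowerBound:upperBound:) for clamping; the input values are chosen only so the results are easy to check by hand.

```swift
import simd

let a = simd_float3(3, 4, 0)
let b = simd_float3(1, 2, 3)
let c = simd_float3(4, 5, 6)

// Geometry helpers: Euclidean length and dot product.
let lengthOfA = simd_length(a)   // sqrt(3*3 + 4*4 + 0*0)
let bDotC = simd_dot(b, c)       // 1*4 + 2*5 + 3*6

// Element-wise clamp of each lane into the range [0, 1].
let clampedVector = simd_float3(-1, 0.5, 2).clamped(
    lowerBound: simd_float3(repeating: 0),
    upperBound: simd_float3(repeating: 1))

// Operators mix scalars and vectors freely.
let midpoint = 0.5 * (b + c)

print(lengthOfA, bDotC, clampedVector, midpoint)
```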

For example, the old approach with ordinary arrays is slow, while the simd version is fast:

// Take the average of two vectors, the way it was done before
var x: [Float] = [1, 2, 3, 4]
var y: [Float] = [3, 3, 3, 3]
var z = [Float](repeating: 0, count: 4)
for i in 0..<4 {
    z[i] = (x[i] + y[i]) / 2.0
}

// The way to do it now
let x = simd_float4(1, 2, 3, 4)
let y = simd_float4(3, 3, 3, 3)
let z = 0.5 * (x + y)

The talk focuses on using quaternions for rotation.

// The vector to be rotated (red dot)
let original = simd_float3(0, 0, 1)
// Rotation axis and angle
let quaternion = simd_quatf(angle: .pi / -3,
                            axis: simd_float3(1, 0, 0))
// Apply the rotation (yellow dot)
let rotatedVector = simd_act(quaternion, original)

In real development a rotation is rarely around a single axis; it may combine two or more rotations.

// The vector to be rotated (red dot)
let original = simd_float3(0, 0, 1)
// Rotation axes and angles
let quaternion = simd_quatf(angle: .pi / -3,
                            axis: simd_float3(1, 0, 0))
let quaternion2 = simd_quatf(angle: .pi / 3,
                             axis: simd_float3(0, 1, 0))
// Combine the two rotations
let quaternion3 = quaternion2 * quaternion
// Apply the rotation (yellow dot)
let rotatedVector = simd_act(quaternion3, original)
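The order of the product matters when rotations are combined. The sketch below uses two 90° rotations, chosen here only because the results are easy to verify by hand, to show that q2 * q1 applies q1 first and q2 second. The names aboutX and aboutY are illustrative.

```swift
import simd

let yAxis = simd_float3(0, 1, 0)

// Quarter turns about the x axis and about the y axis.
let aboutX = simd_quatf(angle: .pi / 2, axis: simd_float3(1, 0, 0))
let aboutY = simd_quatf(angle: .pi / 2, axis: simd_float3(0, 1, 0))

// Rotating +y by 90° about x gives approximately +z.
let afterX = aboutX.act(yAxis)

// aboutY * aboutX rotates about x first, then about y: +y -> +z -> +x.
let combined = aboutY * aboutX
let afterBoth = combined.act(yAxis)

// Applying the two rotations one after the other gives the same result.
let stepwise = aboutY.act(aboutX.act(yAxis))

print(afterX, afterBoth, stepwise)
```

Because of floating-point rounding, the results should be compared with a small tolerance and not with exact equality.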

simd supports quaternion interpolation, both slerp interpolation and spline interpolation.

// Slerp (spherical linear) interpolation
let blue = simd_quatf(...) // blue
let green = simd_quatf(...) // green
let red = simd_quatf(...) // red

for t: Float in stride(from: 0, to: 1, by: 0.001) {
    let q = simd_slerp(blue, green, t) // Interpolate from blue to green (shortest arc on the sphere)
    // Code to Add Line Segment at `q.act(original)`
}

for t: Float in stride(from: 0, to: 1, by: 0.001) {
    let q = simd_slerp_longest(green, red, t) // Interpolate from green to red (longest arc on the sphere)
    // Code to Add Line Segment at `q.act(original)`
}
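For a concrete feel of what simd_slerp returns, the sketch below interpolates halfway between no rotation and a 90° rotation about the x axis, which should give roughly a 45° rotation about the same axis. The start and end quaternions are hypothetical values picked for easy checking; they are not the blue, green and red rotations elided above.

```swift
import simd

// Hypothetical endpoints: no rotation, and a quarter turn about x.
let start = simd_quatf(angle: 0, axis: simd_float3(1, 0, 0))
let end = simd_quatf(angle: .pi / 2, axis: simd_float3(1, 0, 0))

// Halfway along the shortest arc between the two rotations.
let halfway = simd_slerp(start, end, 0.5)

// Applying it to +y tilts the vector about 45° toward +z.
let original = simd_float3(0, 1, 0)
let rotated = halfway.act(original)

print(halfway.angle, rotated)
```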

// Spline interpolation
let original = simd_float3(0, 0, 1)
let rotations: [simd_quatf] = ...
for i in 1 ... rotations.count - 3 {
    for t: Float in stride(from: 0, to: 1, by: 0.001) {
        let q = simd_spline(rotations[i - 1],
                            rotations[i],
                            rotations[i + 1],
                            rotations[i + 2],
                            t)
        // Code to Add Line Segment at `q.act(original)`
    }
}

The two methods produce clearly different rotational motion, as the vertex trajectories show.