Packing Bytes in Swift

Original author: Russ Bishop; originally published: 2016-05-12; Translator: PMST; Proofreading: Walkingway; Finalized: CMB

Today, I want to pack some Float32 data into a SQLite binary large object (BLOB) column. Of course, I could use JSON, Protobuf, or some other encoding. NSNumber, NSArray, NSCoder, and plist files are also good choices.

However, I wanted to do it in a more Swift-like way that is also a bit like C: fast, free of dependencies, and with a decoder simple enough to implement on any platform.

PointEncoder

We will implement the final interface in the PointEncoder structure:

struct PointEncoder {
  // When parsing, if we get a wildly large value we can
  // assume denial of service or corrupt data.
  static let MaxPoints = 1_310_719

  // How big an Int64 is
  private static let _sizeOfCount = sizeof(Int64.self)

  // How big a point (two Float32s) is
  private static let _sizeOfPair = 2 * sizeof(Float32.self)

  static func encodePoints(points: [CGPoint]) -> NSData?
  static func decodePoints(data: NSData) -> [CGPoint]
}

MaxPoints caps the encoded buffer at about 10 MB, which is far more than I need in this example. Picture a mobile device on a cellular network or on Wi-Fi with a poor signal: trying to send that many points would get you disconnected by the server. Of course, you can choose a limit that suits your own situation.

Next, we need the in-memory sizes of the types involved. The arithmetic is very simple, and once the sizes are defined in one place you no longer need to call sizeof() all over the code.
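To make the arithmetic concrete, here is a small sketch in current Swift (the article's listings use Swift 2, where MemoryLayout<T>.size was spelled sizeof(T.self)). The names sizeOfCount, sizeOfPair, maxPoints, largestBuffer, and tenMegabytes are illustrative and not part of the article's PointEncoder.

```swift
// Sizes of the two building blocks of the format.
let sizeOfCount = MemoryLayout<Int64>.size        // 8 bytes: the point-count header
let sizeOfPair = 2 * MemoryLayout<Float32>.size   // 8 bytes: one (x, y) pair

// Same value as the article's MaxPoints.
let maxPoints = 1_310_719

// The encoder accepts counts strictly below maxPoints, so the largest
// buffer it can produce holds maxPoints - 1 points plus the header.
let largestBuffer = sizeOfCount + (maxPoints - 1) * sizeOfPair
let tenMegabytes = 10 * 1024 * 1024

print(largestBuffer)   // bytes in the largest accepted buffer
print(tenMegabytes)    // the 10 MB ceiling
```

With these sizes the largest accepted buffer is 10,485,752 bytes, 8 bytes short of 10 MB (10,485,760 bytes), which is why the cap is described as about 10 MB.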

Encoding

Let’s look at the implementation of encodePoints

guard !points.isEmpty && points.count < MaxPoints else { return nil }

// Total size of the buffer
let bufferLength = _sizeOfCount + (points.count * _sizeOfPair)
precondition(bufferLength >= (_sizeOfCount + _sizeOfPair), "Empty buffer?")
precondition(bufferLength < megabytes(10), "Buffer would exceed 10MB")

The first step is to ensure that the input is not empty and does not exceed the maximum capacity.

The second step is to calculate the size of the buffer and check that it is neither too small nor too large. Note that the isEmpty check in step 1 already rules out an empty buffer in theory, but that may no longer hold if someone refactors the code later. After that, we check that the buffer would not allocate too much memory.

This is one of the extra safety checks I like to add, mainly as a defense against careless programmers. Imagine that someone refactors the code later and accidentally introduces a bug: a decent programmer is still unlikely to delete a precondition. The preconditions sit right before the memory allocation, where they act as a warning sign: “Danger may occur here, be extra careful!”
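The two steps above can be sketched as a single function in current Swift. This is only an illustration under stated assumptions: validatedBufferLength and its maxPoints parameter are hypothetical names, and the 10 MB ceiling is written out inline rather than through the article's megabytes() helper.

```swift
/// Hypothetical helper: returns the byte size of a buffer for `count` points,
/// or nil when the count is outside the accepted range.
func validatedBufferLength(forPointCount count: Int, maxPoints: Int = 1_310_719) -> Int? {
    let sizeOfCount = MemoryLayout<Int64>.size
    let sizeOfPair = 2 * MemoryLayout<Float32>.size

    // Step 1: reject empty input and input at or above the cap.
    guard count > 0, count < maxPoints else { return nil }

    // Step 2: compute the size, then re-check it right before any allocation.
    // These checks are redundant today, but they survive careless refactoring.
    let bufferLength = sizeOfCount + (count * sizeOfPair)
    precondition(bufferLength >= sizeOfCount + sizeOfPair, "Empty buffer?")
    precondition(bufferLength < 10 * 1024 * 1024, "Buffer would exceed 10MB")
    return bufferLength
}

print(validatedBufferLength(forPointCount: 1) as Any)
print(validatedBufferLength(forPointCount: 0) as Any)
```

The guard handles bad input from the caller by returning nil, while the preconditions document invariants that should never fail unless the code itself has been broken.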

let rawMemory = UnsafeMutablePointer<Void>.alloc(bufferLength)

// Failed to allocate memory
guard rawMemory != nil else { return nil }

The next step is to actually allocate the buffer and bail out if that fails.

It is very difficult for a program to handle out-of-memory conditions. If creating a class instance fails because memory is exhausted, calling abort() is the sensible response, because even a simple log or print statement is likely to allocate memory, so the failure cannot be reported reliably (and reporting it would require making every initializer failable).

A large allocation is a different case: it may fail even though a fragmented heap still has memory available for smaller requests. Handling that failure gracefully is therefore good practice (especially in a constrained environment like iOS).
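In current Swift, UnsafeMutableRawPointer.allocate returns a non-optional pointer, so the nil check from the Swift 2 listing has no direct equivalent there. As a rough sketch of the same idea, the C allocator can be called directly; allocateBuffer is a hypothetical helper name, not something from the article.

```swift
import Foundation

/// Hypothetical helper: asks the C allocator for `byteCount` bytes and
/// reports failure as nil instead of crashing.
func allocateBuffer(byteCount: Int) -> UnsafeMutableRawPointer? {
    guard byteCount > 0 else { return nil }
    // malloc returns NULL (nil in Swift) when it cannot satisfy the request.
    return malloc(byteCount)
}

if let memory = allocateBuffer(byteCount: 16) {
    // ... fill the buffer here ...
    free(memory)   // memory obtained from malloc must be released with free
} else {
    // Allocation failed: hand nil back to the caller rather than continuing.
}
```

Memory obtained this way has to be released with free, which ties into the question of which deallocator the resulting data object should use.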

UnsafeMutablePointer<Int64>(rawMemory).memory = Int64(points.count)

let buffer = UnsafeMutablePointer<Float32>(rawMemory + _sizeOfCount)

There’s a caveat here. The right-hand side of the assignment converts points.count to a 64-bit integer, so the stored format does not vary from platform to platform. (Swift’s Int type is platform-dependent: it is a 32-bit integer on 32-bit platforms and a 64-bit integer on 64-bit platforms.) We don’t want users to hit crashes or data corruption after upgrading their devices.

On the left-hand side, we cast rawMemory to a pointer to Int64 and then store Int64(points.count) in the memory it points to. A 64-bit integer occupies eight bytes, so the first eight bytes of the allocation hold the point count.

Finally, we offset the pointer by those 8 bytes to get the start address of the region that holds the points.
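The same header layout can be sketched with current Swift's raw-pointer API, which replaces the Swift 2 casts with storeBytes and bindMemory. The function writeHeader and its return tuple are hypothetical and exist only to make the layout observable.

```swift
/// Hypothetical helper: writes the Int64 count header into a fresh buffer and
/// reports what was stored and where the Float32 payload begins.
func writeHeader(pointCount: Int) -> (storedCount: Int64, payloadOffset: Int) {
    let sizeOfCount = MemoryLayout<Int64>.size
    let floatCount = pointCount * 2
    let byteCount = sizeOfCount + floatCount * MemoryLayout<Float32>.size

    let raw = UnsafeMutableRawPointer.allocate(
        byteCount: byteCount,
        alignment: MemoryLayout<Int64>.alignment)
    defer { raw.deallocate() }

    // Always store the count as Int64 so the format is the same on
    // 32-bit and 64-bit platforms.
    raw.storeBytes(of: Int64(pointCount), as: Int64.self)

    // Raw pointers advance in bytes: skip the 8-byte header, then view
    // the rest of the allocation as Float32 values.
    let payload = raw.advanced(by: sizeOfCount)
        .bindMemory(to: Float32.self, capacity: floatCount)
    payload.initialize(repeating: 0, count: floatCount)

    let storedCount = raw.load(as: Int64.self)
    let payloadOffset = raw.distance(to: UnsafeMutableRawPointer(payload))
    return (storedCount, payloadOffset)
}

let header = writeHeader(pointCount: 3)
print(header.storedCount, header.payloadOffset)
```

The payload offset equals the size of Int64 on every platform, which is the point of converting the count explicitly instead of storing a plain Int.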

for (index, point) in points.enumerate() {
  let ptr = buffer + (index * 2)

  // Store the point values.
  ptr.memory = Float32(point.x)
  ptr.advancedBy(1).memory = Float32(point.y)
}

The next step is to iterate over the points. We do simple pointer arithmetic on the UnsafeMutablePointer to find the right position in the buffer. Note that unsafe pointers in Swift know the size of the type they point to, and all pointer offsets are counted in units of that type, not in bytes! (A Void pointer, however, has no known element size, so its offsets are counted in bytes.)

Therefore, adding index * 2 to the base address gives the address of the next pair of values (the x and y coordinates of a point). We then store the values into the memory the pointer refers to.

For the y value I used ptr.advancedBy(1), so there is no need to keep a second pointer around or to declare ptr as mutable. It’s just a personal preference. You can use either + or advancedBy(); they work the same way.
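A short sketch in current Swift shows both the packing loop and the difference between element offsets and byte offsets. The helpers packPairs and byteDistance are hypothetical, and plain Float32 tuples stand in for CGPoint to keep the example free of imports.

```swift
/// Hypothetical helper: packs (x, y) pairs into a flat Float32 buffer using
/// typed-pointer arithmetic and returns a copy of the result as an array.
func packPairs(_ pairs: [(x: Float32, y: Float32)]) -> [Float32] {
    let floatCount = pairs.count * 2
    let buffer = UnsafeMutablePointer<Float32>.allocate(capacity: floatCount)
    defer { buffer.deallocate() }   // Float32 is trivial, so no deinitialize is needed

    for (index, pair) in pairs.enumerated() {
        // Typed pointer: + counts Float32 elements, not bytes.
        let ptr = buffer + (index * 2)
        ptr.initialize(to: pair.x)
        ptr.advanced(by: 1).initialize(to: pair.y)
    }
    return Array(UnsafeBufferPointer(start: buffer, count: floatCount))
}

/// Hypothetical helper: measures how many bytes a Float32 pointer moves
/// when it is advanced by `elements`.
func byteDistance(forElements elements: Int) -> Int {
    let base = UnsafeMutablePointer<Float32>.allocate(capacity: elements + 1)
    defer { base.deallocate() }
    let start = UnsafeMutableRawPointer(base)
    let end = UnsafeMutableRawPointer(base + elements)
    return start.distance(to: end)   // raw pointers measure distance in bytes
}

let packed = packPairs([(x: 1, y: 2), (x: 3, y: 4)])
print(packed)
print(byteDistance(forElements: 2))
```

Advancing the typed pointer by two elements moves it eight bytes, which is why index * 2 lands on the start of each pair.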

return NSData(
  bytesNoCopy: rawMemory, 
  length: bufferLength, 
  deallocator: { (ptr, length) in
    ptr.destroy(length)
    ptr.dealloc(length)
})

Finally, we return the data to the caller. The buffer has been allocated and filled, so we initialize an NSData with bytesNoCopy, passing the length and a deallocator closure.

Why pass a deallocator closure? Technically, you could use NSData(bytesNoCopy:length:freeWhenDone:), but there is no guarantee that this is safe. If the Swift runtime does not use the default malloc/free but some other allocator, freeing the memory that way would go wrong.

If our buffer stored more complex Swift types, proper cleanup would be essential: for reference types, recursive enums, and so on, you must call ptr.destroy(count), otherwise memory will leak. In this case we know that Float32 and Int64 are plain values that need no cleanup, but calling destroy anyway is the technically correct thing to do.
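In current Swift the same no-copy hand-off is available on Foundation's Data through the .custom deallocator case. The sketch below is an illustration, not the article's code; makeZeroedData is a hypothetical helper.

```swift
import Foundation

/// Hypothetical helper: wraps freshly allocated, zero-filled memory in a Data
/// value without copying it.
func makeZeroedData(byteCount: Int) -> Data {
    let raw = UnsafeMutableRawPointer.allocate(
        byteCount: byteCount,
        alignment: MemoryLayout<Int64>.alignment)
    raw.initializeMemory(as: UInt8.self, repeating: 0, count: byteCount)

    // The custom deallocator releases the memory with the same API that
    // allocated it, instead of assuming it came from malloc.
    return Data(bytesNoCopy: raw, count: byteCount, deallocator: .custom { pointer, _ in
        pointer.deallocate()
    })
}

let data = makeZeroedData(byteCount: 16)
print(data.count)
```

Pairing allocate with deallocate inside the closure keeps allocation and release on the same allocator, which is the concern the article raises about freeWhenDone.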

Decoding

guard data.bytes != nil && data.length >= (_sizeOfCount + _sizeOfPair) else { return [] }

First, we make sure that the pointer in the NSData is not nil and that the data is big enough to hold the Int64 point count plus at least one point. This lets the code that follows proceed without additional safety checks.

let rawMemory = data.bytes
let buffer = rawMemory + _sizeOfCount

// Extract the point count as an Int64
let pointCount64 = UnsafePointer<Int64>(rawMemory).memory

precondition(
  Int64(MaxPoints) < Int64(Int32.max),
  "MaxPoints would overflow on 32-bit platforms")
precondition(
  pointCount64 > 0 && pointCount64 < Int64(MaxPoints),
  "Invalid pointCount = \(pointCount64)")

let pointCount = Int(pointCount64)

Next, we set up our pointers. We again cast the raw pointer to a pointer to Int64, this time using a non-mutable pointer because we only read from it.

Notice that in the previous code I do the comparisons as 64-bit integers, which ensures that the check against Int32.max cannot itself overflow or underflow. In C, if (value + x > INT_MAX) is often used to check for overflow, but signed overflow is itself undefined behavior, so the check is unreliable. Now stop and think for a minute: how does a computer handle value + x exceeding the maximum value of an integer? The answer is that, in practice, the result wraps around and becomes a negative value. So what happens when we call something like malloc or is_admin() with a very large negative value? I leave that as a small exercise for the reader.

The last line of code converts the point count to Int. If the value exceeded Int32.max on a 32-bit platform, the conversion would trap. Swift is safer than C here: you cannot accidentally overflow or underflow without triggering a trap, whereas in C you must always watch for it yourself. When it does happen, the program crashes at runtime, and thankfully it gives a clear error message before it dies. I still check explicitly so that the error messages are better.
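The overflow reasoning can be tried out in current Swift with the standard library's checked conversions. In this sketch checkedPointCount is a hypothetical helper, and unlike the article's code it reports bad input by returning nil rather than trapping with a precondition.

```swift
/// Hypothetical helper: validates a decoded 64-bit count and narrows it to Int.
func checkedPointCount(_ pointCount64: Int64, maxPoints: Int = 1_310_719) -> Int? {
    // Compare in 64 bits so the comparison itself cannot overflow.
    guard Int64(maxPoints) < Int64(Int32.max) else { return nil }
    guard pointCount64 > 0, pointCount64 < Int64(maxPoints) else { return nil }
    // Int(exactly:) returns nil when the value does not fit in Int,
    // which matters on 32-bit platforms.
    return Int(exactly: pointCount64)
}

// Swift's + traps on overflow; the reporting variant exposes the
// wrap-around that C code would silently run into.
let (wrapped, overflowed) = Int32.max.addingReportingOverflow(1)
print(wrapped, overflowed)

print(checkedPointCount(5) as Any)
print(checkedPointCount(-1) as Any)
```

Whether to trap or to return nil is a design choice: trapping suits invariants that should never be violated, while nil suits data that comes from outside the process.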

On 64-bit platforms it would certainly be possible to support point sets beyond the 32-bit range (values over approximately 4.2 billion), but the code would need further refactoring. That doesn’t matter for my needs, so the capacity limit is hard-coded. The limit also rules out data that was created on a 64-bit system but cannot be loaded on a 32-bit system (this is the theoretical maximum; the capacity I actually use is much smaller).

var points: [CGPoint] = []
points.reserveCapacity(pointCount)

for ptr in (0..<pointCount).map({
  UnsafePointer<Float32>(buffer) + (2 * $0)
}) {
  points.append(
    CGPoint(
      x: CGFloat(ptr.memory),
      y: CGFloat(ptr.advancedBy(1).memory))
  )
}

return points

The code is simple. We reserve the array’s capacity up front to avoid reallocations. It is only a small optimization, but since we already know the number of points there is no reason not to do it.

In addition, the pointer’s element type is Float32, so Swift knows how much memory each element occupies. We simply multiply the index by 2 (2 * $0) to get a pointer to the next pair of coordinates, and then read the values from the memory the pointer refers to.
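For comparison, here is a sketch of the decoding loop in current Swift, reading from a Foundation Data value. It is an assumption-laden illustration rather than the article's implementation: readTrivial and decodePoints(from:maxPoints:) are hypothetical, the function returns nil on bad input instead of trapping, and values are copied out byte by byte so that no alignment is assumed.

```swift
import Foundation

/// Hypothetical helper: copies MemoryLayout<T>.size bytes starting at `offset`
/// into a value of trivial type T. The caller must check the bounds first.
func readTrivial<T>(_ initial: T, from bytes: UnsafeRawBufferPointer, at offset: Int) -> T {
    var value = initial
    withUnsafeMutableBytes(of: &value) { destination in
        destination.copyBytes(from: bytes[offset ..< offset + MemoryLayout<T>.size])
    }
    return value
}

/// Hypothetical decoder for the layout described above:
/// an Int64 count followed by count pairs of Float32 values.
func decodePoints(from data: Data, maxPoints: Int = 1_310_719) -> [CGPoint]? {
    let sizeOfCount = MemoryLayout<Int64>.size
    let sizeOfFloat = MemoryLayout<Float32>.size
    let sizeOfPair = 2 * sizeOfFloat
    guard data.count >= sizeOfCount + sizeOfPair else { return nil }

    return data.withUnsafeBytes { (bytes: UnsafeRawBufferPointer) -> [CGPoint]? in
        let count64 = readTrivial(Int64(0), from: bytes, at: 0)
        guard count64 > 0, count64 < Int64(maxPoints) else { return nil }
        let count = Int(count64)
        // The header must not claim more points than the buffer holds.
        guard sizeOfCount + count * sizeOfPair <= bytes.count else { return nil }

        var points: [CGPoint] = []
        points.reserveCapacity(count)   // the final size is already known
        for index in 0..<count {
            let offset = sizeOfCount + index * sizeOfPair
            let x = readTrivial(Float32(0), from: bytes, at: offset)
            let y = readTrivial(Float32(0), from: bytes, at: offset + sizeOfFloat)
            points.append(CGPoint(x: CGFloat(x), y: CGFloat(y)))
        }
        return points
    }
}

// Build a two-point sample by hand: count header, then x0, y0, x1, y1.
var sample = Data()
withUnsafeBytes(of: Int64(2)) { sample.append(contentsOf: $0) }
for value in [Float32(1), 2, 3, 4] {
    withUnsafeBytes(of: value) { sample.append(contentsOf: $0) }
}
print(decodePoints(from: sample) as Any)
```

Here the offsets are computed in bytes because the buffer is raw; with a typed Float32 pointer, as in the article, the same positions are reached by multiplying the index by 2.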

Testing

Of course, code of this kind should be run under Address Sanitizer to help catch memory misuse, and it should go through extensive code review before a product ships (fuzzing with AFL can also help uncover problems).

I can never be 100% sure that any threading or memory code I write is free of bugs. I’m not even 100% sure this example is bug-free. Address Sanitizer found no problems, but I firmly believe a good programmer should stay humble and always be on the lookout for errors and missteps (if you find anything wrong with this article, please let me know!).

No one, including you, is good enough to write code that can completely avoid buffer overflows.

Conclusion

The Swift compiler has always taken safety seriously, but it is willing to relax when asked. If you promise not to do anything naughty, it will trust you completely. If you need to manipulate bytes or void pointers, create a new .swift file and do the work there.

Final implementation

I’ve included the gist of the final implementation below, with detailed comments. Feel free to use it if it helps you.

// Written by Russ Bishop
// MIT licensed, use freely.
// No warranty, not suitable for any purpose. Use at your own risk!

import Foundation
import CoreGraphics

struct PointEncoder {
  // When parsing, if we get a wildly large value we can
  // assume denial of service or corrupt data.
  static let MaxPoints = 1_310_719

  // How big an Int64 is
  private static let _sizeOfCount = sizeof(Int64.self)

  // How big a point (two Float32s) is
  private static let _sizeOfPair = 2 * sizeof(Float32.self)

  static func encodePoints(points: [CGPoint]) -> NSData? {
    guard !points.isEmpty && points.count < MaxPoints else { return nil }

    // Total size of the buffer
    let bufferLength = _sizeOfCount + (points.count * _sizeOfPair)
    precondition(bufferLength >= (_sizeOfCount + _sizeOfPair), "Empty buffer?")
    precondition(bufferLength < megabytes(10), "Buffer would exceed 10MB")

    let rawMemory = UnsafeMutablePointer<Void>.alloc(bufferLength)

    // Failed to allocate memory
    guard rawMemory != nil else { return nil }

    // Store the point count in the first portion of the buffer
    UnsafeMutablePointer<Int64>(rawMemory).memory = Int64(points.count)

    // The remaining bytes are for the Float32 pairs
    let buffer = UnsafeMutablePointer<Float32>(rawMemory + _sizeOfCount)

    // Store the points
    for (index, point) in points.enumerate() {
      // Since buffer is UnsafeMutablePointer<Float32>, addition counts
      // the number of Float32s, *not* the number of bytes!
      let ptr = buffer + (index * 2)

      // Store the point values.
      ptr.memory = Float32(point.x)
      ptr.advancedBy(1).memory = Float32(point.y)
    }

    // We can tell NSData not to bother copying memory.
    // For consistency and since we can't guarantee the memory allocated
    // by UnsafeMutablePointer can just be freed, we provide a deallocator
    // block.
    return NSData(
      bytesNoCopy: rawMemory,
      length: bufferLength,
      deallocator: { (ptr, length) in
        // If ptr held more complex types, failing to call
        // destroy will cause lots of leakage.
        // No one wants leakage.
        ptr.destroy(length)
        ptr.dealloc(length)
    })
  }

  static func decodePoints(data: NSData) -> [CGPoint] {
    // If we don't have at least one point pair
    // and a size byte, bail.
    guard data.bytes != nil &&
      data.length >= (_sizeOfCount + _sizeOfPair) else { return [] }

    let rawMemory = data.bytes
    let buffer = rawMemory + _sizeOfCount

    // Extract the point count as an Int64
    let pointCount64 = UnsafePointer<Int64>(rawMemory).memory

    // Swift is safer than C here; you can't
    // accidentally overflow/underflow and not
    // trigger a trap, but I am still checking
    // to provide better error messages.
    // In all cases, better to kill the process
    // than corrupt memory.
    precondition(
      Int64(MaxPoints) < Int64(Int32.max),
      "MaxPoints would overflow on 32-bit platforms")
    precondition(
      pointCount64 > 0 && pointCount64 < Int64(MaxPoints),
      "Invalid pointCount = \(pointCount64)")

    // On 32-bit systems this would trap if
    // MaxPoints were too big and we didn't
    // check above.
    let pointCount = Int(pointCount64)
    precondition(
      _sizeOfCount + (_sizeOfPair * pointCount) <= data.length,
      "Size lied or buffer truncated")

    var points: [CGPoint] = []

    // Small optimization since
    // we know the array size
    points.reserveCapacity(pointCount)

    for ptr in (0..<pointCount).map({
      UnsafePointer<Float32>(buffer) + (2 * $0)
    }) {
      points.append(
        CGPoint(
          x: CGFloat(ptr.memory),
          y: CGFloat(ptr.advancedBy(1).memory))
      )
    }

    return points
  }
}

func kilobytes(value: Int) -> Int {
  return value * 1024
}

func megabytes(value: Int) -> Int {
  return kilobytes(value * 1024)
}

func gigabytes(value: Int) -> Int {
  return megabytes(value * 1024)
}

This article was translated by the SwiftGG translation team with the author’s authorization. Please visit swift.gg for the latest articles.
