I. Description of main functions

  • Create a session: use VTCompressionSessionCreate() to create an encoding session.
OSStatus VTCompressionSessionCreate(
    CFAllocatorRef _Nullable allocator,                       // allocator; pass NULL to use the default
    int32_t width,                                            // video image width
    int32_t height,                                           // video image height
    CMVideoCodecType codecType,                               // codec type
    CFDictionaryRef _Nullable encoderSpecification,           // encoder specification
    CFDictionaryRef _Nullable sourceImageBufferAttributes,    // source pixel buffer attributes
    CFAllocatorRef _Nullable compressedDataAllocator,         // compressed data allocator; pass NULL to use the default
    VTCompressionOutputCallback _Nullable outputCallback,     // callback function pointer
    void * _Nullable outputCallbackRefCon,                    // reference value passed to the callback
    VTCompressionSessionRef _Nullable * _Nonnull compressionSessionOut); // receives the created session

This function has a return value of type OSStatus. If noErr is returned, the creation is successful.

  • Set encoding session properties: use VTSessionSetProperty() to configure the encoding properties.
OSStatus VTSessionSetProperty(
    VTSessionRef _Nonnull session,       // the encoding session to configure
    CFStringRef _Nonnull propertyKey,    // property key
    CFTypeRef _Nullable propertyValue);  // property value

This function also has a return value of type OSStatus. If noErr is returned, the property is set successfully.

  • Prepare to encode: use VTCompressionSessionPrepareToEncodeFrames().
OSStatus VTCompressionSessionPrepareToEncodeFrames(
    VTCompressionSessionRef _Nonnull session);  // the session to prepare for encoding

Similarly, if this function returns noErr, the execution is successful.

  • Encode

Use the VTCompressionSessionEncodeFrame() function to encode a frame.

OSStatus VTCompressionSessionEncodeFrame(
    VTCompressionSessionRef _Nonnull session,
    CVImageBufferRef _Nonnull imageBuffer,          // the image frame to encode
    CMTime presentationTimeStamp,                   // presentation timestamp of the frame; each timestamp must be greater than the previous one
    CMTime duration,                                // duration of the frame; pass kCMTimeInvalid if there is no duration
    CFDictionaryRef _Nullable frameProperties,      // per-frame properties
    void * _Nullable sourceFrameRefcon,             // reference value for this frame
    VTEncodeInfoFlags * _Nullable infoFlagsOut);    // receives information about the encode operation

If this function returns noErr, the encoding is successful.
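
The frameProperties parameter can carry per-frame options. As an illustration (an assumption for demonstration, not part of the example later in this article, which passes NULL), kVTEncodeFrameOptionKey_ForceKeyFrame can be used to ask the encoder to produce a key frame for a specific frame:

// Sketch: force this frame to be encoded as a key frame. The session, pixelBuffer,
// timeStamp and flag variables are assumed to exist as in step 5 below.
NSDictionary *frameProperties = @{(__bridge NSString *)kVTEncodeFrameOptionKey_ForceKeyFrame: @YES};
OSStatus status = VTCompressionSessionEncodeFrame(session,
                                                  pixelBuffer,
                                                  timeStamp,
                                                  kCMTimeInvalid,
                                                  (__bridge CFDictionaryRef)frameProperties,
                                                  NULL,
                                                  &flag);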

  • Finish encoding
  1. Use the VTCompressionSessionCompleteFrames() function to force the completion of all pending frames.

    OSStatus VTCompressionSessionCompleteFrames(
        VTCompressionSessionRef _Nonnull session,           // the session to complete
        CMTime completeUntilPresentationTimeStamp);         // complete frames up to this timestamp; pass kCMTimeInvalid to complete all pending frames

    The return value is the same as above.

  2. Use the VTCompressionSessionInvalidate() function to invalidate the encoding session.

    void VTCompressionSessionInvalidate(
        VTCompressionSessionRef _Nonnull session);  // the session to invalidate

    This function returns void, so there is no status to check.

  • The encoding callback function

VideoToolbox defines a function pointer type, VTCompressionOutputCallback. We need to define a function matching this prototype in order to receive the encoded output.

The prototype is as follows:

typedef void (*VTCompressionOutputCallback)(
    void * CM_NULLABLE outputCallbackRefCon,     // the reference value passed in when the session was created
    void * CM_NULLABLE sourceFrameRefCon,        // the reference value passed in for this frame
    OSStatus status,                             // encoding status
    VTEncodeInfoFlags infoFlags,                 // information about the encode operation
    CM_NULLABLE CMSampleBufferRef sampleBuffer); // the encoded sample buffer
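
A minimal sketch of a matching callback definition (the name VideoEncodeCallback is the one passed to VTCompressionSessionCreate() in section III; the body is left empty here and is filled in under step 6):

// Callback matching the VTCompressionOutputCallback prototype.
static void VideoEncodeCallback(void * CM_NULLABLE outputCallbackRefCon,
                                void * CM_NULLABLE sourceFrameRefCon,
                                OSStatus status,
                                VTEncodeInfoFlags infoFlags,
                                CM_NULLABLE CMSampleBufferRef sampleBuffer) {
    // Check the status, detect key frames, and extract SPS/PPS and NALU data here
    // (see section III, step 6).
}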

II. Encoding process
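
In outline, the flow is:

  1. Create the compression session with VTCompressionSessionCreate().
  2. Configure the session with VTSessionSetProperty().
  3. Prepare to encode with VTCompressionSessionPrepareToEncodeFrames().
  4. Feed each frame to VTCompressionSessionEncodeFrame().
  5. Receive the encoded CMSampleBuffer in the output callback and extract the SPS/PPS and NALU data.
  6. Finish with VTCompressionSessionCompleteFrames() and VTCompressionSessionInvalidate().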

III. Concrete implementation

1. Create an encoding session

int32_t width = 480;   // video image width
int32_t height = 640;  // video image height
VTCompressionSessionRef encodeSesion;

OSStatus status = VTCompressionSessionCreate(kCFAllocatorDefault,               // use the default allocator
                                             width,
                                             height,
                                             kCMVideoCodecType_H264,            // H.264 encoding
                                             NULL,                              // let the system choose the encoder
                                             NULL,                              // default source pixel buffer attributes
                                             NULL,                              // use the default compressed data allocator
                                             VideoEncodeCallback,               // our callback function
                                             (__bridge void * _Nullable)(self), // pass self as the reference value
                                             &encodeSesion);
if (status != noErr) {
    NSLog(@"Session create failed. status=%d", (int)status);
}

2. Set encoding properties

Common attributes

  • kVTCompressionPropertyKey_RealTime

    This property indicates whether to encode in real time; its value is a CFBoolean.

  • kVTCompressionPropertyKey_ProfileLevel

    This property specifies the profile and level used for encoding. Generally, pass kVTProfileLevel_H264_Baseline_AutoLevel.

  • kVTCompressionPropertyKey_AllowFrameReordering

    This property indicates whether frame reordering is allowed. If B frames are encoded, the encoder must reorder the frames, so this property can be understood indirectly as controlling whether B frames are produced. The value of this property is a CFBoolean.

  • kVTCompressionPropertyKey_MaxKeyFrameInterval

    This property represents the maximum interval between key frames (I frames), i.e., the GOP size. If this value is set too large, the image may become blurry. The value of this property is of type CFNumberRef.

  • kVTCompressionPropertyKey_ExpectedFrameRate

    This property represents the expected frame rate (FPS). It does not control the frame rate; the actual frame rate depends on the duration of each frame and may vary. The value of this property is of type CFNumberRef.

  • kVTCompressionPropertyKey_AverageBitRate

    This property represents the average bit rate in bps. With a high bit rate the picture will be very clear, but the output will also be large; with a low bit rate the image can become blurry. The value of this property is of type CFNumberRef.

  • kVTCompressionPropertyKey_DataRateLimits

    This property sets a hard limit on the data rate, expressed in bytes. The value is of type CFNumberRef or CFArrayRef.

// Enable real-time encoding output
VTSessionSetProperty(_encoderSession, kVTCompressionPropertyKey_RealTime, kCFBooleanTrue);

// Set the profile and level
VTSessionSetProperty(_encoderSession, kVTCompressionPropertyKey_ProfileLevel, kVTProfileLevel_H264_Baseline_AutoLevel);

// Disallow frame reordering (no B frames)
VTSessionSetProperty(_encoderSession, kVTCompressionPropertyKey_AllowFrameReordering, kCFBooleanFalse);

// Set the key frame (GOP) interval
int frameInterval = 10;
CFNumberRef frameIntervalRaf = CFNumberCreate(kCFAllocatorDefault, kCFNumberIntType, &frameInterval);
VTSessionSetProperty(_encoderSession, kVTCompressionPropertyKey_MaxKeyFrameInterval, frameIntervalRaf);

// Set the expected frame rate (FPS)
int fps = 10;
CFNumberRef fpsRef = CFNumberCreate(kCFAllocatorDefault, kCFNumberIntType, &fps);
VTSessionSetProperty(_encoderSession, kVTCompressionPropertyKey_ExpectedFrameRate, fpsRef);

// Set the average bit rate (bps)
int bitRate = self.width * self.height * 3 * 4 * 8;
CFNumberRef bitRateRef = CFNumberCreate(kCFAllocatorDefault, kCFNumberSInt32Type, &bitRate);
VTSessionSetProperty(_encoderSession, kVTCompressionPropertyKey_AverageBitRate, bitRateRef);

// Set the hard data rate limit (bytes)
int bigRateLimit = self.width * self.height * 3 * 4;
CFNumberRef bitRateLimitRef = CFNumberCreate(kCFAllocatorDefault, kCFNumberSInt32Type, &bigRateLimit);
VTSessionSetProperty(_encoderSession, kVTCompressionPropertyKey_DataRateLimits, bitRateLimitRef);
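
One housekeeping detail not shown in the snippet above: the CFNumberRef values come from CFNumberCreate(), so under Core Foundation's Create rule the caller owns them and may release them once the properties have been set. A minimal sketch:

// Optional cleanup (assumption, not in the original code): release the CFNumbers created above
CFRelease(frameIntervalRaf);
CFRelease(fpsRef);
CFRelease(bitRateRef);
CFRelease(bitRateLimitRef);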

3. Prepare to encode

OSStatus status = VTCompressionSessionPrepareToEncodeFrames(encodeSesion);
if (status != noErr) {
    NSLog(@"prepare to encode error! [status : %d]", (int)status);
}

5. Encode

Once we have the CMSampleBuffer data, we can use VTCompressionSessionEncodeFrame() to encode it.

// Get the pixel buffer from the captured sample buffer
CVPixelBufferRef pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);

// Build the presentation timestamp from an increasing frame ID
self.frameID++;
CMTime timeStamp = CMTimeMake(self.frameID, 1000);
CMTime duration = kCMTimeInvalid;
VTEncodeInfoFlags flag;

OSStatus status = VTCompressionSessionEncodeFrame(self.encoderSession, pixelBuffer, timeStamp, duration, NULL, NULL, &flag);
if (status != noErr) {
    NSLog(@"encode sample buffer error [status : %d]", (int)status);
}

6. Post-encoding processing

This part is performed in the callback function.

(1) Check the encoding status

if (status != noErr) {
    return;
}

(2) Check whether the data is ready

Boolean isDataReady = CMSampleBufferDataIsReady(sampleBuffer);
if (!isDataReady) {
    return;
}

(3) Get the current object

This step relies on the reference value we passed in when we created the session.

KKKVideoCoder *coder = (__bridge KKKVideoCoder *)(outputCallbackRefCon);

(4) Determine whether the frame is a key frame (I frame)

CFArrayRef attachmentsArray = CMSampleBufferGetSampleAttachmentsArray(sampleBuffer, true);
if (!attachmentsArray) {
    return;
}
CFDictionaryRef dict = (CFDictionaryRef)CFArrayGetValueAtIndex(attachmentsArray, 0);
if (!dict) {
    return;
}
// A frame without the NotSync attachment is a sync sample, i.e. a key frame (I frame)
Boolean isIFrame = !CFDictionaryContainsKey(dict, kCMSampleAttachmentKey_NotSync);
  • The official documentation has the following to say about kCMSampleAttachmentKey_NotSync:

A sync sample, also known as a key frame or IDR (Instantaneous Decoding Refresh), can be decoded without requiring any previous samples to have been decoded. Samples following a sync sample also do not require samples prior to the sync sample to have been decoded. Samples are assumed to be sync samples by default — set the value for this key to kCFBooleanTrue for samples which should not be treated as sync samples. This attachment is read from and written to media files.

In short: samples are treated as sync samples (key frames) by default, and the kCMSampleAttachmentKey_NotSync key is set to kCFBooleanTrue only for samples that should not be treated as sync samples. That is why the code above checks for the absence of this key to detect an I frame.

(5) Obtain the SPS and PPS from the key frame (I frame)

If we get a key frame (I frame), we need to prepend the corresponding SPS and PPS to the data.
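
The snippets below append a startCode buffer before each parameter set. Its definition is not shown in this excerpt, but from the comment in the PPS snippet it is the 4-byte Annex-B start code, e.g. (an assumed definition):

// Assumed definition (not shown in the original excerpt): 4-byte Annex-B start code
const char startCode[] = "\x00\x00\x00\x01";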

  • Get the format description
CMFormatDescriptionRef formatDesc = CMSampleBufferGetFormatDescription(sampleBuffer);
  • Get the SPS
size_t spsSize, spsCount;
const uint8_t *spsData;
OSStatus spsStatus = CMVideoFormatDescriptionGetH264ParameterSetAtIndex(formatDesc, 0, &spsData, &spsSize, &spsCount, 0);
if (spsStatus == noErr) {
 
	coder.hasSPS = YES;
	NSMutableData *sps = [NSMutableData dataWithCapacity:4 + spsSize];
	[sps appendBytes:startCode length:4];
	[sps appendBytes:spsData length:spsSize];
	 
	dispatch_async(coder.callBackQueue, ^{
		[coder.delegate encoderGetSPSData:sps];
	});
} else {
	NSLog(@"get SPS error! [status : %d]", spsStatus);
}
  • Get the PPS
size_t ppsSize, ppsCount;
const uint8_t *ppsData;
OSStatus ppsStatus = CMVideoFormatDescriptionGetH264ParameterSetAtIndex(formatDesc, 1, &ppsData, &ppsSize, &ppsCount, 0);
if (ppsStatus == noErr) {
    coder.hasPPS = YES;
    NSMutableData *pps = [NSMutableData dataWithCapacity:4 + ppsSize];
    [pps appendBytes:startCode length:4]; // startCode is "\x00\x00\x00\x01"
    [pps appendBytes:ppsData length:ppsSize];
      
    dispatch_async(coder.callBackQueue, ^{
        [coder.delegate encoderGetPPSData:pps];
    });
} else {
    NSLog(@"get PPS error! [status : %d]", ppsStatus);
}

(6) Process the encoded data

  • Get the encoded data
CMBlockBufferRef blockBuffer = CMSampleBufferGetDataBuffer(sampleBuffer);
    
size_t lengthAtOffsetOut, totalLengthOut;
char * dataPointOut;
OSStatus error = CMBlockBufferGetDataPointer(blockBuffer, 0, &lengthAtOffsetOut, &totalLengthOut, &dataPointOut);
if (error != kCMBlockBufferNoErr) {
    NSLog(@"get block buffer data pointer failed! [status : %d]", (int)error);
}
  • Loop through the data buffer to extract each NALU

Note: the first 4 bytes of each returned NALU are not a start code but the NALU length, stored in big-endian byte order.

size_t offset = 0;
const int startCodeLength = 4;
while (offset < totalLengthOut - startCodeLength) {
    char *src = dataPointOut + offset;

    // Read the 4-byte big-endian NALU length
    uint32_t naluBigLength = 0;
    memcpy(&naluBigLength, src, startCodeLength);
    uint32_t naluHostLength = CFSwapInt32BigToHost(naluBigLength);
    uint32_t naluLength = startCodeLength + naluHostLength;

    // Replace the length field with a start code and copy the NALU payload
    NSMutableData *data = [NSMutableData dataWithCapacity:naluLength];
    [data appendBytes:startCode length:4];
    [data appendBytes:src + startCodeLength length:naluHostLength];

    dispatch_async(coder.callBackQueue, ^{
        [coder.delegate encoderGetData:data];
    });

    offset += naluLength;
}

7. Finish encoding

We can finish the encoding work in the dealloc method:

- (void)dealloc {
    
    if (self.encoderSession) {
        VTCompressionSessionCompleteFrames(self.encoderSession, kCMTimeInvalid);
        VTCompressionSessionInvalidate(self.encoderSession);
        CFRelease(self.encoderSession);
        self.encoderSession = NULL;
    }
}