With the basic concepts of H264 structure and bitstream parsing covered in the previous articles, this one moves on to actual code. The AVFoundation capture workflow described earlier is not repeated here; we start directly from the capture delegate method **captureOutput:didOutputSampleBuffer:fromConnection:** and encode the video frames it delivers. The overall flow:

  1. Prepare the encoder: create a session with VTCompressionSessionCreate and set the encoder properties;
  2. Start encoding: VTCompressionSessionEncodeFrame;
  3. Process the data in the encoding callback: add the start code **"\x00\x00\x00\x01"**, prepend the SPS and PPS, etc.;
  4. End encoding, clear data, and release resources. (The snippets below share a few instance variables; see the sketch after this list.)
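The code in this article references several instance variables without declaring them. A minimal sketch of the assumed declarations (the names are taken from the snippets below; the original project's setup may differ):

```objc
#import <AVFoundation/AVFoundation.h>
#import <VideoToolbox/VideoToolbox.h>

// Forward declaration of the encoding callback defined further down
void didCompressH264(void *outputCallbackRefCon, void *sourceFrameRefCon,
                     OSStatus status, VTEncodeInfoFlags infoFlags,
                     CMSampleBufferRef sampleBuffer);

@interface ViewController () <AVCaptureVideoDataOutputSampleBufferDelegate>
{
    int frameID;                                // monotonically increasing frame counter used as the timestamp
    dispatch_queue_t cEncodeQueue;              // serial queue all encoding work runs on
    VTCompressionSessionRef cEncodeingSession;  // the VideoToolbox compression session
    NSFileHandle *fileHandele;                  // file handle the raw H264 stream is written to
}
@end
```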

Prepare the encoder

  1. Create a session: VTCompressionSessionCreate
  2. Set properties with VTSessionSetProperty: real-time encoded output, whether to allow frame reordering (B-frames), the key-frame interval, the expected frame rate, the average bit rate and the maximum data rate, etc.
  3. Prepare to start encoding: VTCompressionSessionPrepareToEncodeFrames
```objc
- (void)initVideoToolBox
{
    dispatch_sync(cEncodeQueue, ^{
        frameID = 0;
        int width = 480, height = 640;

        // Create the compression session
        OSStatus status = VTCompressionSessionCreate(NULL, width, height, kCMVideoCodecType_H264, NULL, NULL, NULL, didCompressH264, (__bridge void *)(self), &cEncodeingSession);
        NSLog(@"H264:VTCompressionSessionCreate:%d", (int)status);

        if (status != 0) {
            NSLog(@"H264:Unable to create a H264 session");
            return;
        }

        // Set real-time encoding output (avoids delay)
        VTSessionSetProperty(cEncodeingSession, kVTCompressionPropertyKey_RealTime, kCFBooleanTrue);
        VTSessionSetProperty(cEncodeingSession, kVTCompressionPropertyKey_ProfileLevel, kVTProfileLevel_H264_Baseline_AutoLevel);

        // Baseline profile has no B-frames; disallow frame reordering
        VTSessionSetProperty(cEncodeingSession, kVTCompressionPropertyKey_AllowFrameReordering, kCFBooleanFalse);

        // Set the GOP size (key-frame interval)
        int frameInterval = 10;
        CFNumberRef frameIntervalRef = CFNumberCreate(kCFAllocatorDefault, kCFNumberIntType, &frameInterval);
        VTSessionSetProperty(cEncodeingSession, kVTCompressionPropertyKey_MaxKeyFrameInterval, frameIntervalRef);

        // Set the expected frame rate (a hint, not the actual frame rate)
        int fps = 10;
        CFNumberRef fpsRef = CFNumberCreate(kCFAllocatorDefault, kCFNumberIntType, &fps);
        VTSessionSetProperty(cEncodeingSession, kVTCompressionPropertyKey_ExpectedFrameRate, fpsRef);

        // Average bit rate in bits per second: a higher bit rate gives a clearer picture but a larger file
        int bitRate = width * height * 3 * 4 * 8;
        CFNumberRef bitRateRef = CFNumberCreate(kCFAllocatorDefault, kCFNumberSInt32Type, &bitRate);
        VTSessionSetProperty(cEncodeingSession, kVTCompressionPropertyKey_AverageBitRate, bitRateRef);

        // Hard rate limit, in bytes
        int bitRateLimit = width * height * 3 * 4;
        CFNumberRef bitRateLimitRef = CFNumberCreate(kCFAllocatorDefault, kCFNumberSInt32Type, &bitRateLimit);
        VTSessionSetProperty(cEncodeingSession, kVTCompressionPropertyKey_DataRateLimits, bitRateLimitRef);

        // Ready to start encoding
        VTCompressionSessionPrepareToEncodeFrames(cEncodeingSession);
    });
}
```
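One caveat on the last property: per Apple's VideoToolbox documentation, kVTCompressionPropertyKey_DataRateLimits expects a CFArray of alternating byte-count and duration-in-seconds numbers rather than a single CFNumber. A sketch of the documented form:

```objc
// Limit the stream to (width * height * 3 * 4) bytes per 1 second
int bytesLimit = width * height * 3 * 4;
int oneSecond = 1;
CFNumberRef bytesRef = CFNumberCreate(kCFAllocatorDefault, kCFNumberSInt32Type, &bytesLimit);
CFNumberRef secondsRef = CFNumberCreate(kCFAllocatorDefault, kCFNumberSInt32Type, &oneSecond);
const void *limits[] = { bytesRef, secondsRef };
CFArrayRef limitsArray = CFArrayCreate(kCFAllocatorDefault, limits, 2, &kCFTypeArrayCallBacks);
VTSessionSetProperty(cEncodeingSession, kVTCompressionPropertyKey_DataRateLimits, limitsArray);
```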

Parameters of VTCompressionSessionCreate:

  • allocator: the session allocator; pass NULL for the default allocator
  • width: the frame width in pixels
  • height: the frame height in pixels
  • codecType: the codec type, e.g. kCMVideoCodecType_H264
  • encoderSpecification: restricts the choice of encoder; pass NULL to let VideoToolbox choose one itself
  • sourceImageBufferAttributes: required attributes for source pixel buffers; pass NULL if you create the pixel buffers yourself rather than letting VideoToolbox create a pixel buffer pool
  • compressedDataAllocator: allocator for the compressed data; pass NULL for the default allocator
  • outputCallback: the encoding callback, invoked (asynchronously) after VTCompressionSessionEncodeFrame finishes compressing a frame. The function passed here is didCompressH264
  • outputCallbackRefCon: a client-defined reference value handed to the callback. We pass self because the C callback function cannot call self directly (see the callback prototype after this list)
  • compressionSessionOut: receives the newly created compression session
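For reference, the function passed as outputCallback must be a plain C function matching VideoToolbox's VTCompressionOutputCallback type, which is why self has to be smuggled through outputCallbackRefCon:

```objc
// The VTCompressionOutputCallback prototype from <VideoToolbox/VTCompressionSession.h>
typedef void (*VTCompressionOutputCallback)(void *outputCallbackRefCon,
                                            void *sourceFrameRefCon,
                                            OSStatus status,
                                            VTEncodeInfoFlags infoFlags,
                                            CMSampleBufferRef sampleBuffer);
```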

Start encoding

  1. Get the unencoded video frame: CVImageBufferRef imageBuffer = (CVImageBufferRef)CMSampleBufferGetImageBuffer(sampleBuffer);
  2. Build the presentation timestamp from the frame counter: CMTime presentationTimeStamp = CMTimeMake(frameID++, 1000);
  3. Start encoding: call VTCompressionSessionEncodeFrame
```objc
- (void)captureOutput:(AVCaptureOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection
{
    // Video capture is running; hand each camera frame to the encode queue
    dispatch_sync(cEncodeQueue, ^{
        [self encode:sampleBuffer];
    });
}
```
```objc
- (void)encode:(CMSampleBufferRef)sampleBuffer
{
    CVImageBufferRef imageBuffer = (CVImageBufferRef)CMSampleBufferGetImageBuffer(sampleBuffer);

    // Use the frame counter as the presentation timestamp; each timestamp
    // must be greater than the previous one
    CMTime presentationTimeStamp = CMTimeMake(frameID++, 1000);
    VTEncodeInfoFlags flags;

    // Start encoding
    OSStatus statusCode = VTCompressionSessionEncodeFrame(cEncodeingSession, imageBuffer, presentationTimeStamp, kCMTimeInvalid, NULL, NULL, &flags);

    if (statusCode != noErr) {
        // Encoding failed
        NSLog(@"H.264:VTCompressionSessionEncodeFrame faild with %d", (int)statusCode);

        // Release resources
        VTCompressionSessionInvalidate(cEncodeingSession);
        CFRelease(cEncodeingSession);
        cEncodeingSession = NULL;
        return;
    }
}
```
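The timestamp here is synthetic: CMTimeMake(frameID++, 1000) only guarantees monotonically increasing values. An alternative sketch (not the article's code) that reuses the capture buffer's own timing instead:

```objc
// Use the sample buffer's real presentation timestamp and duration
CMTime pts = CMSampleBufferGetPresentationTimeStamp(sampleBuffer);
CMTime duration = CMSampleBufferGetDuration(sampleBuffer);

VTEncodeInfoFlags flags;
OSStatus statusCode = VTCompressionSessionEncodeFrame(cEncodeingSession, imageBuffer, pts, duration, NULL, NULL, &flags);
```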

Parameters of VTCompressionSessionEncodeFrame:

  • session: the compression session
  • imageBuffer: the unencoded frame data
  • presentationTimeStamp: the presentation timestamp of this sample; every timestamp passed to the session must be greater than the previous one
  • duration: the display duration of the frame; pass kCMTimeInvalid if there is no duration information
  • frameProperties: properties of this frame; changes here can affect subsequently encoded frames. Pass NULL for none
  • sourceFrameRefcon: a reference value for this frame that is passed back to the callback
  • infoFlagsOut: points to a VTEncodeInfoFlags that receives information about the encode operation: kVTEncodeInfo_Asynchronous is set if the encode ran asynchronously, and kVTEncodeInfo_FrameDropped is set if the frame was dropped. Pass NULL if you do not want this information. (A usage sketch follows this list.)
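A small sketch of checking those flags after the encode call (assuming the session and buffers from above):

```objc
VTEncodeInfoFlags flags;
OSStatus statusCode = VTCompressionSessionEncodeFrame(cEncodeingSession, imageBuffer, presentationTimeStamp, kCMTimeInvalid, NULL, NULL, &flags);

if (flags & kVTEncodeInfo_FrameDropped) {
    // The encoder decided to drop this frame, e.g. under real-time pressure
    NSLog(@"H264: frame %d was dropped", frameID);
}
```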

Processing the data after encoding

  1. Identify key frames: on a key frame, use CMVideoFormatDescriptionGetH264ParameterSetAtIndex to obtain the SPS and PPS, then write them to the file or upload them
  2. Assemble the NALU data: get the encoded H264 stream with CMBlockBufferRef dataBuffer = CMSampleBufferGetDataBuffer(sampleBuffer), fetch the base address, single length and total length with OSStatus statusCodeRet = CMBlockBufferGetDataPointer(dataBuffer, 0, &length, &totalLength, &dataPointer); and then traverse by offsetting dataPointer. Note that the length prefixes must be read in big-endian mode: network byte order is generally big-endian. (A byte-level example follows this list.)
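To make the length-code-to-start-code conversion concrete, here is what the same NALUs look like in the encoder's AVCC output versus the Annex B stream we write to the file (byte values are illustrative):

```objc
// AVCC layout emitted by VideoToolbox (4-byte big-endian length prefix per NALU):
//   00 00 00 19 | 25 bytes of NALU payload | 00 00 02 4C | 588 bytes of payload | ...
// Annex B layout expected by a raw .h264 file or stream decoder (start code per NALU):
//   00 00 00 01 | 25 bytes of NALU payload | 00 00 00 01 | 588 bytes of payload | ...
```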
```objc
/*
 1. After H264 hard encoding completes, the VTCompressionOutputCallback is invoked.
 2. The hard-encoded CMSampleBuffer is converted into an H264 bitstream to be sent over the network.
 3. The parameter sets SPS & PPS are extracted and prefixed with the start code to assemble NALUs.
 4. The frame data is extracted, its length code replaced with the start code to form a NALU, and the NALU is sent.
 */
void didCompressH264(void *outputCallbackRefCon, void *sourceFrameRefCon, OSStatus status, VTEncodeInfoFlags infoFlags, CMSampleBufferRef sampleBuffer)
{
    NSLog(@"didCompressH264 called with status %d infoFlags %d", (int)status, (int)infoFlags);

    // Status error
    if (status != 0) {
        return;
    }

    // Data not ready yet
    if (!CMSampleBufferDataIsReady(sampleBuffer)) {
        NSLog(@"didCompressH264 data is not ready");
        return;
    }

    ViewController *encoder = (__bridge ViewController *)outputCallbackRefCon;

    // Determine whether the current frame is a key frame
    CFArrayRef array = CMSampleBufferGetSampleAttachmentsArray(sampleBuffer, true);
    CFDictionaryRef dic = CFArrayGetValueAtIndex(array, 0);
    bool keyFrame = !CFDictionaryContainsKey(dic, kCMSampleAttachmentKey_NotSync);

    // SPS (Sequence Parameter Set) & PPS (Picture Parameter Set) are obtained from the first key frame
    if (keyFrame) {
        // The format description carries the encoder's description of how the image is stored
        CMFormatDescriptionRef format = CMSampleBufferGetFormatDescription(sampleBuffer);

        // Get the SPS
        size_t sparameterSetSize, sparameterSetCount;
        const uint8_t *sparameterSet;
        OSStatus statusCode = CMVideoFormatDescriptionGetH264ParameterSetAtIndex(format, 0, &sparameterSet, &sparameterSetSize, &sparameterSetCount, 0);
        if (statusCode == noErr) {
            // Get the PPS
            size_t pparameterSetSize, pparameterSetCount;
            const uint8_t *pparameterSet;
            OSStatus statusCode = CMVideoFormatDescriptionGetH264ParameterSetAtIndex(format, 1, &pparameterSet, &pparameterSetSize, &pparameterSetCount, 0);
            if (statusCode == noErr)
            {
                NSData *sps = [NSData dataWithBytes:sparameterSet length:sparameterSetSize];
                NSData *pps = [NSData dataWithBytes:pparameterSet length:pparameterSetSize];

                if (encoder)
                {
                    [encoder gotSpsPps:sps pps:pps];
                }
            }
        }
    }

    CMBlockBufferRef dataBuffer = CMSampleBufferGetDataBuffer(sampleBuffer);
    size_t length, totalLength;
    char *dataPointer;
    OSStatus statusCodeRet = CMBlockBufferGetDataPointer(dataBuffer, 0, &length, &totalLength, &dataPointer);
    if (statusCodeRet == noErr) {
        size_t bufferOffset = 0;
        // The first 4 bytes of each NALU are not the 00 00 00 01 start code but a big-endian length prefix
        static const int AVCCHeaderLength = 4;

        while (bufferOffset < totalLength - AVCCHeaderLength) {
            uint32_t NALUnitLength = 0;

            // Read the NALU length prefix and convert it from big-endian to host byte order
            memcpy(&NALUnitLength, dataPointer + bufferOffset, AVCCHeaderLength);
            NALUnitLength = CFSwapInt32BigToHost(NALUnitLength);

            NSData *data = [[NSData alloc] initWithBytes:(dataPointer + bufferOffset + AVCCHeaderLength) length:NALUnitLength];

            // Write the NALU data to the file
            [encoder gotEncodedData:data isKeyFrame:keyFrame];

            // Move to the next NAL unit in the block buffer
            bufferOffset += AVCCHeaderLength + NALUnitLength;
        }
    }
}

- (void)gotSpsPps:(NSData *)sps pps:(NSData *)pps
{
    const char bytes[] = "\x00\x00\x00\x01";
    size_t length = (sizeof bytes) - 1; // the last byte is the '\0' string terminator, which is not written

    NSData *ByteHeader = [NSData dataWithBytes:bytes length:length];
    [fileHandele writeData:ByteHeader];
    [fileHandele writeData:sps];
    [fileHandele writeData:ByteHeader];
    [fileHandele writeData:pps];
}

- (void)gotEncodedData:(NSData *)data isKeyFrame:(BOOL)isKeyFrame
{
    if (fileHandele != NULL) {
        // Prefix each NALU with the start code so the decoder can tell where the current NAL ends
        const char bytes[] = "\x00\x00\x00\x01";
        size_t length = (sizeof bytes) - 1;
        NSData *ByteHeader = [NSData dataWithBytes:bytes length:length];

        // Write the start code
        [fileHandele writeData:ByteHeader];
        // Write the H264 data
        [fileHandele writeData:data];
    }
}
```
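The callback writes through fileHandele, which this article never shows being opened. A minimal sketch of that setup, with a hypothetical output path (the method and file names are illustrative, not the original project's):

```objc
- (void)setupFileHandle
{
    // Hypothetical output path; the original project's path may differ
    NSString *filePath = [NSTemporaryDirectory() stringByAppendingPathComponent:@"video.h264"];
    [[NSFileManager defaultManager] removeItemAtPath:filePath error:nil];
    [[NSFileManager defaultManager] createFileAtPath:filePath contents:nil attributes:nil];
    fileHandele = [NSFileHandle fileHandleForWritingAtPath:filePath];
}
```

Because the file is a raw Annex B stream, it can be checked quickly with ffplay (assuming ffmpeg is installed): `ffplay video.h264`.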

End encoding

```objc
- (void)endVideoToolBox
{
    VTCompressionSessionCompleteFrames(cEncodeingSession, kCMTimeInvalid);
    VTCompressionSessionInvalidate(cEncodeingSession);
    CFRelease(cEncodeingSession);
    cEncodeingSession = NULL;
}
```
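If you also opened a file handle as sketched earlier, close it here as well; a one-line addition under that assumption:

```objc
[fileHandele closeFile];
fileHandele = NULL;
```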