The simplest iOS live-streaming push code: video capture, soft coding (faac, x264), hard coding (AAC, H264), beauty filters, FLV encoding, the RTMP protocol, with continuously updated code analysis. If you want to learn live-streaming technology, come and take a look!

Source: https://github.com/hardman/AWLive

The previous section described how to obtain audio and video data in real time through the camera.

The next thing we need to know is what the data actually looks like.

The audio and video data obtained using the interface provided by the system are stored in CMSampleBufferRef.

The audio data obtained with GPUImage is also a CMSampleBufferRef, while the video data obtained is binary data in BGRA format.

Introduction to CMSampleBufferRef

This structure represents a frame of audio/video data in iOS.

It contains the content and format of this frame of data.

We can take its contents out and extract/convert them into the data we want.

The data stored in a CMSampleBufferRef that represents video is a video frame in YUV420 format (because in the video output settings we set the output format to kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange).

The data stored in the CMSampleBufferRef that represents audio is an audio frame in PCM format.

What is YUV? What is NV12?

Video is made up of frames of data, and a frame of video data is actually a picture.

YUV is an image storage format, similar to RGB.

RGB images are easy to understand. Most of the images in a computer are stored in RGB.

In YUV, Y stands for luminance (brightness); Y data alone can form a picture, but it will be grayscale. U and V represent color differences (U is also known as Cb, the blue difference; V as Cr, the red difference).
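For reference, one common set of RGB-to-YUV conversion formulas (the classic BT.601 analog form; other standards use slightly different coefficients):

Y = 0.299 R + 0.587 G + 0.114 B
U = 0.492 (B - Y)
V = 0.877 (R - Y)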

Why YUV?

For historical reasons, the earliest TV signals used the YUV format in order to stay compatible with black-and-white TVs.

If you take a YUV picture, remove the U and V and keep only the Y, you get a black-and-white image.

Moreover, YUV can save bandwidth by discarding part of the color-difference data.

For example, a YUV420 image takes only half as many bytes as the same image in RGB; the color differences it discards between adjacent pixels make little visible difference to the human eye.

A YUV420 image occupies (width * height) + (width * height) / 4 + (width * height) / 4 = (width * height) * 3/2 bytes, while an RGB24 image occupies (width * height) * 3 bytes.
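To make the math concrete (the frame size here is purely illustrative): a 1280 * 720 frame has 921,600 pixels, so YUV420 needs 921,600 * 3 / 2 = 1,382,400 bytes, while RGB24 needs 921,600 * 3 = 2,764,800 bytes, exactly twice as much.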

YUV video is also more flexible in transmission (the three Y, U, and V components can be transmitted separately).

Many video encoders initially did not support RGB, but all video encoders support the YUV format.

Generally speaking, we choose YUV, so we first convert the video data to YUV format before encoding.

The video we’re using here is in YUV420 format.

YUV420 itself comes in different data arrangements: I420, NV12, and NV21.

The layouts are as follows:

I420: Y, U, and V are stored as three separate sections: Y0, Y1 … Yn, U0, U1 … Un/4, V0, V1 … Vn/4
NV12: Y is stored in one section, with U and V interleaved in a second: Y0, Y1 … Yn, U0, V0, U1, V1 … Un/4, Vn/4
NV21: the same as NV12, except that U and V are in reverse order.
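To make the difference concrete, here is a minimal C sketch (the function names are mine, and it assumes even width/height and a buffer holding one complete frame) that looks up the U and V values for pixel (x, y) in each layout:

#include <stdint.h>

// Chroma in YUV420 is subsampled 2x2, so pixel (x, y) uses chroma sample (x/2, y/2).
static void i420_uv(const uint8_t *buf, int width, int height,
                    int x, int y, uint8_t *u, uint8_t *v) {
    int y_size = width * height;
    int cx = x / 2, cy = y / 2;
    *u = buf[y_size + cy * (width / 2) + cx];              // U plane follows the Y plane
    *v = buf[y_size + y_size / 4 + cy * (width / 2) + cx]; // V plane follows the U plane
}

static void nv12_uv(const uint8_t *buf, int width, int height,
                    int x, int y, uint8_t *u, uint8_t *v) {
    int y_size = width * height;
    int cx = x / 2, cy = y / 2;
    *u = buf[y_size + cy * width + cx * 2];     // U and V interleave: U0,V0,U1,V1...
    *v = buf[y_size + cy * width + cx * 2 + 1]; // in NV21 these two indices swap
}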

Taken together, these layouts display identically; they differ only in the order in which the data is stored.

Which layout you get depends on the video output format you set when initializing the camera. When it is set to kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange, the output video format is NV12; if kCVPixelFormatType_420YpCbCr8Planar is selected, it is I420.

GPUImage likewise uses NV12 when it sets the camera's output data format.

For consistency, we also choose NV12 format for video output here.
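For reference, this is roughly what that choice looks like when configuring the system capture output (a minimal sketch; the variable name is illustrative, and the rest of the session setup from the earlier capture article is omitted):

#import <AVFoundation/AVFoundation.h>

// Ask the capture output for NV12 frames: each delivered CMSampleBufferRef
// will then wrap a bi-planar pixel buffer (a Y plane plus an interleaved UV plane).
AVCaptureVideoDataOutput *videoOutput = [[AVCaptureVideoDataOutput alloc] init];
videoOutput.videoSettings = @{
    (__bridge NSString *)kCVPixelBufferPixelFormatTypeKey :
        @(kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange)
};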

What is PCM?

Pulse-code modulation (PCM), in effect, converts an irregular analog signal into a digital one, which can then be stored on physical media.

Sound is also an analog signal, within a specific frequency range (20 Hz to 20,000 Hz), and can likewise be converted into a digital signal by this technique and thus preserved.

The PCM format is the original sound data format saved when recording sound.

You’ve probably heard of audio in WAV format; a WAV file is essentially a PCM data stream with a file header added.

WAV is sometimes called lossless because it stores raw PCM data (whether it sounds lossless also depends on the sample rate and bit depth).

Audio formats like MP3 and AAC use lossy compression: to save space, they compress as much as possible while losing as little audible quality as possible.
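A quick back-of-the-envelope calculation (using CD-quality parameters purely as an illustration) shows why compression matters: 44,100 samples/s * 16 bits/sample * 2 channels = 1,411,200 bits/s, which is about 176 KB per second, or over 10 MB per minute of raw PCM. A typical AAC stream at 128 kbps is roughly a tenth of that.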

All audio encoders support PCM input, and recorded sound is in PCM format by default, so the next step is to obtain the recorded PCM data.

Extract YUV data from CMSampleBufferRef

When capturing video in the previous article (using the system interface), we initialized the output device with the output data format set to kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange, so the data stored in the CMSampleBufferRef is in YUV420 (NV12) format. A CMSampleBufferRef can be converted to YUV420 (NV12) data using the following method.

//AWVideoEncoder.m file
- (NSData *)convertVideoSmapleBufferToYuvData:(CMSampleBufferRef)videoSample{
    // Get the CVImageBufferRef via CMSampleBufferGetImageBuffer;
    // it contains the pointers to the yuv420 (NV12) data.
    CVImageBufferRef pixelBuffer = CMSampleBufferGetImageBuffer(videoSample);

    // Lock the buffer before touching its memory.
    CVPixelBufferLockBaseAddress(pixelBuffer, 0);

    // Image width and height in pixels.
    size_t pixelWidth = CVPixelBufferGetWidth(pixelBuffer);
    size_t pixelHeight = CVPixelBufferGetHeight(pixelBuffer);
    // Bytes occupied by Y, and by U and V together (half of Y in yuv420).
    size_t y_size = pixelWidth * pixelHeight;
    size_t uv_size = y_size / 2;

    uint8_t *yuv_frame = aw_alloc(uv_size + y_size);

    // Copy the Y data (plane 0) out of the CVImageBufferRef.
    uint8_t *y_frame = CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 0);
    memcpy(yuv_frame, y_frame, y_size);

    // Copy the interleaved UV data (plane 1).
    uint8_t *uv_frame = CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 1);
    memcpy(yuv_frame + y_size, uv_frame, uv_size);

    CVPixelBufferUnlockBaseAddress(pixelBuffer, 0);

    // Return the data without an extra copy.
    return [NSData dataWithBytesNoCopy:yuv_frame length:y_size + uv_size];
}
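One caveat worth adding (this is not in the repo's code): the memcpy calls above assume each plane's bytes-per-row equals its pixel width. CoreVideo may pad rows for alignment, in which case the planes should be copied row by row, for example:

// Row-by-row copy of the Y plane that tolerates padded rows
// (a sketch built on the same variables as the method above).
size_t y_stride = CVPixelBufferGetBytesPerRowOfPlane(pixelBuffer, 0);
uint8_t *y_src = CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 0);
for (size_t row = 0; row < pixelHeight; row++) {
    memcpy(yuv_frame + row * pixelWidth, y_src + row * y_stride, pixelWidth);
}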

Convert BGRA images obtained by GPUImage to YUV (NV12) format

//AWGPUImageAVCapture.m file
- (void)newFrameReadyAtTime:(CMTime)frameTime atIndex:(NSInteger)textureIndex{
    [super newFrameReadyAtTime:frameTime atIndex:textureIndex];
    if (!self.capture || !self.capture.isCapturing) {
        return;
    }

    // Image width and height in pixels.
    int width = imageSize.width;
    int height = imageSize.height;
    int w_x_h = width * height;
    // yuv420 data length = width * height * 3 / 2.
    int yuv_len = w_x_h * 3 / 2;

    // Allocate space for the yuv data.
    uint8_t *yuv_bytes = malloc(yuv_len);

    // ARGBToNV12 is a function provided by libyuv that converts BGRA images to
    // yuv420 (NV12). libyuv is a high-performance image transcoding library from
    // Google that supports a large number of efficient image operations;
    // it is an indispensable component for video streaming. You deserve it.
    [self lockFramebufferForReading];
    ARGBToNV12(self.rawBytesForImage, width * 4, yuv_bytes, width, yuv_bytes + w_x_h, width, width, height);
    [self unlockFramebufferAfterReading];

    NSData *yuvData = [NSData dataWithBytesNoCopy:yuv_bytes length:yuv_len];
    [self.capture sendVideoYuvData:yuvData];
}
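A note on the naming: libyuv labels pixel formats by their little-endian packed word, so its "ARGB" functions actually read bytes in B, G, R, A order in memory, which is exactly the BGRA layout GPUImage outputs. That is why ARGBToNV12, rather than some "BGRA" variant, is the right call here.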

Extract PCM data from CMSampleBufferRef

//AWAudioEncoder.m file
- (NSData *)convertAudioSmapleBufferToPcmData:(CMSampleBufferRef)audioSample{
    // Get the size of the PCM data.
    NSInteger audioDataSize = CMSampleBufferGetTotalSampleSize(audioSample);

    int8_t *audio_data = aw_alloc((int32_t)audioDataSize);

    // Get the CMBlockBufferRef; this structure holds the PCM data.
    CMBlockBufferRef dataBuffer = CMSampleBufferGetDataBuffer(audioSample);
    // Copy the data directly into the memory we allocated ourselves.
    CMBlockBufferCopyDataBytes(dataBuffer, 0, audioDataSize, audio_data);

    // Return the data without an extra copy.
    return [NSData dataWithBytesNoCopy:audio_data length:audioDataSize];
}
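If you need the exact PCM parameters (sample rate, channel count, bit depth) before encoding, the sample buffer's format description carries them. A minimal sketch (not part of the repo's code):

// Inspect the AudioStreamBasicDescription attached to the audio sample buffer.
CMAudioFormatDescriptionRef fmt =
    (CMAudioFormatDescriptionRef)CMSampleBufferGetFormatDescription(audioSample);
const AudioStreamBasicDescription *asbd =
    CMAudioFormatDescriptionGetStreamBasicDescription(fmt);
NSLog(@"sample rate: %.0f, channels: %u, bits per channel: %u",
      asbd->mSampleRate,
      (unsigned)asbd->mChannelsPerFrame,
      (unsigned)asbd->mBitsPerChannel);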

So far we have converted the captured video data to YUV420 format and the audio data to PCM format.

This data can then be encoded in various ways. Once the encoding is complete, the data can be sent to the server.

Article list

  1. Learn in 1 hour: the simplest iOS live push stream (1) Project introduction
  2. Learn in 1 hour: the simplest iOS live push stream (2) Code architecture overview
  3. Learn in 1 hour: the simplest iOS live push stream (3) Capturing audio and video using the system interface
  4. Learn in 1 hour: the simplest iOS live push stream (4) How to use GPUImage and how to add beauty effects
  5. Learn in 1 hour: the simplest iOS live push stream (5) Introduction to and acquisition of YUV and PCM data
  6. Learn in 1 hour: the simplest iOS live push stream (6) Introduction to H264, AAC, and FLV
  7. Learn in 1 hour: the simplest iOS live push stream (7) H264 / AAC hard coding
  8. Learn in 1 hour: the simplest iOS live push stream (8) H264 / AAC soft coding
  9. Learn in 1 hour: the simplest iOS live push stream (9) FLV encoding and audio/video timestamp synchronization
  10. Learn in 1 hour: the simplest iOS live push stream (10) Introduction to librtmp usage
  11. Learn in 1 hour: the simplest iOS live push stream (11) Introduction to SPS & PPS and AudioSpecificConfig (End)