Android 4.1 introduced the MediaCodec interface for accessing the device's codecs. Unlike FFmpeg's software codecs, MediaCodec uses hardware codecs, so it has a clear speed advantage over software solutions. However, because of Android's fragmentation across many models and system versions, MediaCodec requires considerable effort to handle device compatibility. In addition, the codec process is not controllable: the encoding and decoding are implemented by the manufacturer's underlying hardware, so the final video quality may not be ideal.

Despite these disadvantages, MediaCodec is still a good choice for implementing codec requirements quickly.

The use of MediaCodec is illustrated by encoding YUV data from a camera preview into an H264 video stream.

Usage analysis

MediaCodec working model

The following figure shows how MediaCodec works: a typical producer-consumer model, with clients on the input and output sides. The input side sends data to MediaCodec for encoding or decoding, and the output side retrieves the encoded or decoded content.

The input and output sides exchange data with MediaCodec through an input buffer queue and an output buffer queue.

A usable buffer is dequeued from the input queue, filled with data, and queued back to MediaCodec for processing.

After MediaCodec has finished processing, a usable buffer is dequeued from the output queue; the data in that buffer is the encoded or decoded result. After processing the data, the buffer must be released so it returns to the queue for reuse.

MediaCodec life cycle

MediaCodec also has a life cycle, as shown in the following figure:

MediaCodec is in the Uninitialized state when it is created, enters the Configured state when configure is called, and enters the Executing state when start is called.

The Executing state itself has three substates:

  • Flushed
  • Running
  • End of Stream

Immediately after the start method is called, MediaCodec is in the Flushed substate. Once a buffer is dequeued from the input buffer queue, it enters the Running substate. When a queued input buffer carries the EOS flag, it switches to the End of Stream substate: MediaCodec no longer accepts new input buffers, but it still processes the buffers already queued and produces output, until an output buffer carries the EOS flag indicating that the codec operation is complete.

While in the Executing state, you can call the flush method to return MediaCodec to the Flushed substate.

You can call the stop method to switch MediaCodec back to the Uninitialized state, then call configure again to re-enter the Configured state. Calling the reset method also returns it to the Uninitialized state.

When MediaCodec is no longer needed, call the release method to release it; it then enters the Released state.

When MediaCodec encounters an error, it enters the Error state. From there you can call the reset method to recover, which returns it to the Uninitialized state.
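As a quick summary, the lifecycle transitions described above map onto plain method calls; a sketch with the resulting state noted in comments:

    codec.flush();   // Executing -> back to the Flushed substate
    codec.stop();    // Executing -> Uninitialized; configure() may be called again
    codec.reset();   // any state, including Error -> Uninitialized
    codec.release(); // -> Released; the instance can no longer be used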

MediaCodec call flow

Once you understand MediaCodec’s life cycle and workflow, you’re ready to start coding.

Using MediaCodec synchronous calls as an example, the procedure is as follows:

 // Create MediaCodec; it starts in the Uninitialized state
 MediaCodec codec = MediaCodec.createByCodecName(name);
 // Call configure to enter the Configured state
 codec.configure(format, ...);
 MediaFormat outputFormat = codec.getOutputFormat(); // option B
 // Call start to begin codec work; MediaCodec enters the Executing state
 codec.start();
 for (;;) {
   // Dequeue an available buffer from the input buffer queue and fill it with data
   int inputBufferId = codec.dequeueInputBuffer(timeoutUs);
   if (inputBufferId >= 0) {
     ByteBuffer inputBuffer = codec.getInputBuffer(inputBufferId);
     // fill inputBuffer with valid data
     ...
     codec.queueInputBuffer(inputBufferId, ...);
   }
   // Dequeue the codec output from the output buffer queue, process it,
   // then release the buffer for reuse
   int outputBufferId = codec.dequeueOutputBuffer(...);
   if (outputBufferId >= 0) {
     ByteBuffer outputBuffer = codec.getOutputBuffer(outputBufferId);
     MediaFormat bufferFormat = codec.getOutputFormat(outputBufferId); // option A
     // bufferFormat is identical to outputFormat
     // outputBuffer is ready to be processed or rendered
     ...
     codec.releaseOutputBuffer(outputBufferId, ...);
   } else if (outputBufferId == MediaCodec.INFO_OUTPUT_FORMAT_CHANGED) {
     // Subsequent data will conform to the new format.
     // Can be ignored when using getOutputFormat(outputBufferId)
     outputFormat = codec.getOutputFormat(); // option B
   }
 }
 // Call stop to return to the Uninitialized state
 codec.stop();
 // Call release to finish
 codec.release();

Code parsing

MediaFormat settings

First you need to create and configure a MediaFormat object, which describes the format of the media data. For video, the main properties to set are:

  • Color format
  • Bit rate
  • Bit rate control mode
  • Frame rate
  • I-frame interval

Here, bit rate is the number of bits transmitted per unit of time, usually expressed in kbps (kilobits per second). Frame rate is the number of frames per second. For example, at a 1280 x 720 resolution, the width * height * 5 formula used below yields roughly 4.6 Mbps.

There are actually three modes for controlling the bit rate (a capability check is sketched after the list):

  • BITRATE_MODE_CQ
    • Does not control the bit rate, and ensures the best possible image quality
  • BITRATE_MODE_VBR
    • MediaCodec dynamically adjusts the output bit rate based on the complexity of the image content: a higher bit rate for complex images and a lower one for simple images
  • BITRATE_MODE_CBR
    • MediaCodec keeps the output bit rate constant at the configured value
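Not every encoder supports every mode. As a hedged sketch (codecInfo here is assumed to be a MediaCodecInfo for the chosen encoder), you can query support before setting KEY_BITRATE_MODE:

    MediaCodecInfo.EncoderCapabilities caps = codecInfo
            .getCapabilitiesForType(MediaFormat.MIMETYPE_VIDEO_AVC)
            .getEncoderCapabilities();
    // Fall back to another mode if VBR is not supported on this device
    boolean vbrSupported = caps.isBitrateModeSupported(
            MediaCodecInfo.EncoderCapabilities.BITRATE_MODE_VBR);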

As for the color format: we are encoding YUV data to H264, and there are many YUV layouts, which raises device-compatibility issues. When encoding camera frames you must handle the format carefully. For example, the camera delivers NV21, while MediaCodec's COLOR_FormatYUV420SemiPlanar corresponds to NV12, so a conversion from NV21 to NV12 is required, as sketched below.
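A minimal sketch of such a conversion, matching the signature of the NV21ToNV12 helper called in the encoding loop later (the body is an assumption, not the original code). Both layouts share the same Y plane; only the order of the interleaved chroma bytes differs, with NV21 storing V before U and NV12 storing U before V:

    private static void NV21ToNV12(byte[] nv21, byte[] nv12, int width, int height) {
        int frameSize = width * height;
        // The Y plane is identical in both layouts
        System.arraycopy(nv21, 0, nv12, 0, frameSize);
        // Swap every V/U byte pair in the interleaved chroma plane
        for (int i = frameSize; i < nv21.length; i += 2) {
            nv12[i] = nv21[i + 1];     // U
            nv12[i + 1] = nv21[i];     // V
        }
    }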

The I-frame interval specifies how often an H264 I-frame (key frame) is generated.

A complete example of the MediaFormat settings:

        MediaFormat mediaFormat = MediaFormat.createVideoFormat(MediaFormat.MIMETYPE_VIDEO_AVC, width, height);
        mediaFormat.setInteger(MediaFormat.KEY_COLOR_FORMAT, MediaCodecInfo.CodecCapabilities.COLOR_FormatYUV420SemiPlanar);
        // Set the bit rate
        mediaFormat.setInteger(MediaFormat.KEY_BIT_RATE, width * height * 5);
        // Adjust the bit rate flow control mode
        mediaFormat.setInteger(MediaFormat.KEY_BITRATE_MODE, MediaCodecInfo.EncoderCapabilities.BITRATE_MODE_VBR);
        // Set the frame rate
        mediaFormat.setInteger(MediaFormat.KEY_FRAME_RATE, 30);
        // Set the I frame interval
        mediaFormat.setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, 1);
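As a short follow-up sketch, creating and configuring an AVC encoder with this format; CONFIGURE_FLAG_ENCODE tells MediaCodec to act as an encoder:

    // createEncoderByType throws IOException, so wrap it in a try block in real code
    MediaCodec mMediaCodec = MediaCodec.createEncoderByType(MediaFormat.MIMETYPE_VIDEO_AVC);
    mMediaCodec.configure(mediaFormat, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);
    mMediaCodec.start();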

When encoding starts, an encoding thread is spun up to process the YUV data delivered by the camera preview.

A wrapper library for the camera is used here:

Github.com/glumes/EzCa…
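Before the loop below, the preview frames have to land in a queue. A hedged sketch of that plumbing, using the classic android.hardware.Camera callback (mEncodeDataQueue matches the queue drained in the encoding loop; the exact wiring in the wrapper library may differ):

    // Uses java.util.Queue, java.util.concurrent.ConcurrentLinkedQueue,
    // and android.hardware.Camera
    private final Queue<byte[]> mEncodeDataQueue = new ConcurrentLinkedQueue<>();

    private final Camera.PreviewCallback mPreviewCallback = new Camera.PreviewCallback() {
        @Override
        public void onPreviewFrame(byte[] data, Camera camera) {
            // Camera preview data is NV21 by default
            mEncodeDataQueue.offer(data);
        }
    };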

Codec operation

The code for the encoding loop is as follows:

while (isEncoding) {
    // YUV color format conversion
    if (!mEncodeDataQueue.isEmpty()) {
        input = mEncodeDataQueue.poll();
        byte[] yuv420sp = new byte[mWidth * mHeight * 3 / 2];
        NV21ToNV12(input, yuv420sp, mWidth, mHeight);
        input = yuv420sp;
    }
    if (input != null) {
        try {
            // Dequeue an available buffer from the input queue, fill it with data, and queue it back
            ByteBuffer[] inputBuffers = mMediaCodec.getInputBuffers();
            ByteBuffer[] outputBuffers = mMediaCodec.getOutputBuffers();
            int inputBufferIndex = mMediaCodec.dequeueInputBuffer(-1);
            if (inputBufferIndex >= 0) {
                // Compute the timestamp
                pts = computePresentationTime(generateIndex);
                ByteBuffer inputBuffer = inputBuffers[inputBufferIndex];
                inputBuffer.clear();
                inputBuffer.put(input);
                mMediaCodec.queueInputBuffer(inputBufferIndex, 0, input.length, pts, 0);
                generateIndex += 1;
            }
            MediaCodec.BufferInfo bufferInfo = new MediaCodec.BufferInfo();
            int outputBufferIndex = mMediaCodec.dequeueOutputBuffer(bufferInfo, TIMEOUT_USEC);
            // Get the encoded content from the output buffer queue,
            // process it, then release the buffer for reuse
            while (outputBufferIndex >= 0) {
                ByteBuffer outputBuffer = outputBuffers[outputBufferIndex];
                byte[] outData = new byte[bufferInfo.size];
                outputBuffer.get(outData);
                // flags is a bit field, so each flag is tested with a bitwise AND
                if ((bufferInfo.flags & MediaCodec.BUFFER_FLAG_CODEC_CONFIG) != 0) {
                    // Configuration data, i.e. SPS and PPS
                    mOutputStream.write(outData, 0, outData.length);
                } else if ((bufferInfo.flags & MediaCodec.BUFFER_FLAG_KEY_FRAME) != 0) {
                    // Key frame
                    mOutputStream.write(outData, 0, outData.length);
                } else {
                    // Neither key frame nor SPS/PPS: write directly; may be a B frame or P frame
                    mOutputStream.write(outData, 0, outData.length);
                }
                mMediaCodec.releaseOutputBuffer(outputBufferIndex, false);
                outputBufferIndex = mMediaCodec.dequeueOutputBuffer(bufferInfo, TIMEOUT_USEC);
            }
        } catch (IOException e) {
            Log.e(TAG, e.getMessage());
        }
    } else {
        try {
            Thread.sleep(500);
        } catch (InterruptedException e) {
            Log.e(TAG, e.getMessage());
        }
    }
}

First, the camera's NV21 data is converted to NV12. Then a buffer is dequeued from the available input buffer queue with the dequeueInputBuffer method, filled with the data, and queued back with the queueInputBuffer method.

dequeueInputBuffer returns a buffer index; an index less than 0 means no buffer is currently available. The timeoutUs parameter is the timeout: in MediaCodec's synchronous mode, the call blocks for up to that many microseconds when no buffer is available, and blocks indefinitely if the value is negative.

The queueInputBuffer method enqueues the data. Besides the buffer index, it also takes a timestamp, presentationTimeUs, and a flags identifier for the current buffer.

The timestamp is normally the time at which the buffer should be presented (rendered), and the flags field can carry several identifiers describing the type of the current buffer:

  • BUFFER_FLAG_CODEC_CONFIG
    • Indicates that the current buffer carries codec initialization data rather than media data
  • BUFFER_FLAG_END_OF_STREAM
    • End-of-stream flag: the current buffer is the last one
  • BUFFER_FLAG_KEY_FRAME
    • Indicates that the current buffer is a key frame, i.e. an I-frame

You can compute the timestamp for each buffer during encoding (as sketched below), or simply pass 0 for the flags argument.
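A minimal sketch of what the computePresentationTime helper used in the loop might look like, assuming a fixed frame rate (this body is an assumption, not taken from the original code; the small constant offset simply keeps the first timestamp non-zero):

    private long computePresentationTime(long frameIndex) {
        // Timestamps are in microseconds: frameIndex / FRAME_RATE seconds per frame
        return 132 + frameIndex * 1_000_000L / FRAME_RATE;
    }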

Once the data has been passed to MediaCodec, the dequeueOutputBuffer method retrieves the encoded output. Besides the timeout, it takes a MediaCodec.BufferInfo object, which carries the length, offset, and flags of the encoded data.

After extracting the data, act on the flags in MediaCodec.BufferInfo:

  • BUFFER_FLAG_CODEC_CONFIG
    • The current data is configuration data; for H264 encoding this is the SPS and PPS data, i.e. the data beginning with 00 00 00 01 67 and 00 00 00 01 68. It is required, as it carries the width and height of the video, among other things.
  • BUFFER_FLAG_KEY_FRAME
    • Key frame data, i.e. I-frame data, which begins with 00 00 00 01 65.
  • BUFFER_FLAG_END_OF_STREAM
    • Indicates that MediaCodec has reached the end of the stream

Returned flags that match none of the predefined ones can be written out directly; in H264 they may represent either P-frames or B-frames.

After processing the encoded data, release the corresponding buffer with the releaseOutputBuffer method. Its second parameter, render, indicates whether to render the buffer to a surface; that is not needed here, so pass false.

Stop encoding

When you want to stop encoding, switch MediaCodec to the Uninitialized state with the stop method, then call release to free it.

Here encoding is stopped by switching states rather than by submitting a buffer with the BUFFER_FLAG_END_OF_STREAM flag; that flag is more commonly used when recording from a Surface.
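For reference, a hedged sketch of the flag-based alternative: with ByteBuffer input you queue an empty buffer carrying the EOS flag, while with Surface input you would call signalEndOfInputStream() instead:

    int eosIndex = mMediaCodec.dequeueInputBuffer(-1);
    if (eosIndex >= 0) {
        // An empty buffer whose only payload is the end-of-stream flag
        mMediaCodec.queueInputBuffer(eosIndex, 0, 0, 0,
                MediaCodec.BUFFER_FLAG_END_OF_STREAM);
    }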

That concludes hard-encoding camera preview content into an H264 file with MediaCodec. The focus has been on how to use MediaCodec; once you are familiar with its usage, the rest of the encoding work is straightforward.

