Welcome to follow the official WeChat account: FSA Full Stack Action 👋

The project needs to capture YUV images from the camera on low-end Android devices and, at the same time, record video. I have already covered capturing and processing YUV images before; the general takeaway is that once you understand the camera and YUV fundamentals, there is not much difficulty as long as you combine them with libyuv, an excellent library. Video recording is handled with MediaCodec + MediaMuxer, and this post is my notes on the pitfalls I hit while encoding MP4 files with the native MediaCodec. There were two main problems:

  • The recorded video has wrong colors
  • The video plays too fast (duration shrinks)

Note: How low-end are these Android devices? The CPU performance is so poor that hardware encoding is the only viable option.

I. Video color distortion

Before exploring this problem, let's look at MediaCodec's two encoding modes (a small configuration sketch follows the list):

  • ByteBuffer mode (manual):
    • Format: COLOR_FORMAT is set to MediaCodecInfo.CodecCapabilities.COLOR_FormatYUV420SemiPlanar (image format NV21).
    • Operation: obtain an input buffer via MediaCodec.dequeueInputBuffer(), then manually pass the YUV image to MediaCodec via MediaCodec.queueInputBuffer().
  • Surface mode (automatic):
    • Format: COLOR_FORMAT is set to MediaCodecInfo.CodecCapabilities.COLOR_FormatSurface.
    • Operation: create the encoder's input Surface via MediaCodec.createInputSurface(), then draw the camera preview image onto that Surface through an OpenGL texture.
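
As a concrete reference, here is a minimal configuration sketch; it is not taken from the original project, and the resolution, bit rate and method name are placeholder assumptions:

private MediaCodec createEncoder(boolean useSurfaceMode) throws IOException {
    // Sketch only: 1280x720 / 2 Mbps / 30 fps are placeholder values.
    MediaFormat format = MediaFormat.createVideoFormat("video/avc", 1280, 720);
    format.setInteger(MediaFormat.KEY_BIT_RATE, 2_000_000);
    format.setInteger(MediaFormat.KEY_FRAME_RATE, 30);
    format.setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, 5);
    // The COLOR_FORMAT is what distinguishes the two modes.
    format.setInteger(MediaFormat.KEY_COLOR_FORMAT, useSurfaceMode
            ? MediaCodecInfo.CodecCapabilities.COLOR_FormatSurface            // Surface mode
            : MediaCodecInfo.CodecCapabilities.COLOR_FormatYUV420SemiPlanar); // ByteBuffer mode
    MediaCodec encoder = MediaCodec.createEncoderByType("video/avc");
    encoder.configure(format, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);
    // Surface mode only: the input Surface must be created between configure() and start().
    // Surface inputSurface = encoder.createInputSurface();
    encoder.start();
    return encoder;
}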

1. Phenomenon

The camera preview is normal, but the colors of the resulting MP4 video are badly distorted.

Note: the effect is the same as a YUV image with its U/V planes swapped.

2. Analysis

In ByteBuffer mode, the original NV21 image obtained from the camera is fed to MediaCodec with COLOR_FORMAT set to COLOR_FormatYUV420SemiPlanar. The result varies across Android devices: most are fine, but a few show distorted colors. At first I suspected that those devices simply did not support this COLOR_FORMAT, but that turned out not to be the case. Here is how Stack Overflow explains the problem:

The YUV formats used by Camera output and MediaCodec input have their U/V planes swapped.

If you are able to move the data through a Surface you can avoid this issue; however, you lose the ability to examine the YUV data. An example of recording to a .mp4 file from a camera can be found on bigflake.

Some details about the colorspaces and how to swap them is in this answer.

...

Note: Stack Overflow thread: stackoverflow.com/questions/1…

In other words, this is a quirk of MediaCodec itself: the U/V planes of the input YUV image get swapped. There are two solutions:

  • Use ByteBuffer mode and convert the NV21 image to NV12 before sending it to MediaCodec (the two formats differ only in the U/V order; see the sketch after this list). However, as mentioned above, only a few devices show this problem, so you would have to detect and adapt per device. Not recommended.
  • Use Surface mode, which avoids the problem entirely. You lose direct access to the raw YUV data, but you can still process the image with OpenGL. Recommended.
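
For reference, a minimal sketch of the NV21 to NV12 conversion mentioned in the first option; this helper is illustrative, not from the original project, and assumes the standard NV21 layout (Y plane followed by an interleaved VU plane):

// Convert NV21 (Y + interleaved VU) to NV12 (Y + interleaved UV) by swapping each V/U pair.
public static byte[] nv21ToNv12(byte[] nv21, int width, int height) {
    int ySize = width * height;
    byte[] nv12 = new byte[nv21.length];
    // The Y plane is identical in both formats.
    System.arraycopy(nv21, 0, nv12, 0, ySize);
    // Chroma plane: NV21 stores V,U pairs; NV12 stores U,V pairs.
    for (int i = ySize; i < nv21.length - 1; i += 2) {
        nv12[i] = nv21[i + 1];     // U
        nv12[i + 1] = nv21[i];     // V
    }
    return nv12;
}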

3. Implementation

The general steps are as follows:

  • On one side, create an OpenGL texture and wrap it in a SurfaceTexture, which is handed to the camera as its preview target so that camera frames are rendered onto the texture.
  • On the other side, use mMediaCodec.createInputSurface() as the encoder's input data source.
  • Finally, each time the camera delivers a preview frame, draw the image from the texture onto the inputSurface (see the sketch below).

Camera -> TextureId (OpenGL) -> InputSurface (MediaCodec)
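
A rough sketch of the first two steps, assuming the legacy android.hardware.Camera API; the field and method names here are illustrative and not from the original project:

// Sketch: camera frames -> OES texture (SurfaceTexture) -> MediaCodec input Surface.
private SurfaceTexture mSurfaceTexture;
private Surface mEncoderInputSurface;

private void setUpPipeline(Camera camera, MediaCodec configuredEncoder) throws IOException {
    // 1. Create an external OES texture and wrap it in a SurfaceTexture for the camera preview.
    int[] textures = new int[1];
    GLES20.glGenTextures(1, textures, 0);
    int textureId = textures[0];
    GLES20.glBindTexture(GLES11Ext.GL_TEXTURE_EXTERNAL_OES, textureId);
    GLES20.glTexParameteri(GLES11Ext.GL_TEXTURE_EXTERNAL_OES,
            GLES20.GL_TEXTURE_MIN_FILTER, GLES20.GL_LINEAR);
    GLES20.glTexParameteri(GLES11Ext.GL_TEXTURE_EXTERNAL_OES,
            GLES20.GL_TEXTURE_MAG_FILTER, GLES20.GL_LINEAR);
    mSurfaceTexture = new SurfaceTexture(textureId);
    camera.setPreviewTexture(mSurfaceTexture);

    // 2. Create the encoder's input Surface (must happen between configure() and start()).
    mEncoderInputSurface = configuredEncoder.createInputSurface();

    // 3. On every SurfaceTexture.OnFrameAvailableListener callback: call updateTexImage(),
    //    draw the texture with the EGL surface that wraps mEncoderInputSurface as the
    //    current render target, set the presentation time, then swap buffers.
}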

The full implementation can be extracted from bigflake's demo (CameraToMpegTest): www.bigflake.com/mediacodec/…

II. Video duration shrinkage (frame loss)

There are two keys to solving this problem:

  • Timestamp alignment:
    • ByteBuffer mode: when manually passing the YUV image to MediaCodec via MediaCodec.queueInputBuffer(), you also pass the current timestamp; note that the unit is microseconds (us).
    • Surface mode: the input Surface is created via MediaCodec.createInputSurface() and bound to mEGLDisplay/mEGLSurface; before calling EGL14.eglSwapBuffers(mEGLDisplay, mEGLSurface), call EGLExt.eglPresentationTimeANDROID(mEGLDisplay, mEGLSurface, nsecs) to attach a timestamp to the frame delivered to MediaCodec's input Surface (see the sketch after the note below).
  • MediaFormat configuration:
    • The key frame interval (KEY_I_FRAME_INTERVAL) and frame rate (KEY_FRAME_RATE) in MediaFormat must be configured properly.

Note: this is the core summary; you can skip it for now, read on, and come back to it later, it will be easier to understand then.
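
A minimal sketch of the Surface-mode timestamp call, assuming mEGLDisplay and mEGLSurface wrap MediaCodec's input Surface (the method name is illustrative):

// Attach a presentation timestamp (in nanoseconds) to the frame
// before handing it to MediaCodec's input Surface.
private void drawFrameToEncoder(long timestampNanos) {
    // ... draw the camera texture onto the current EGL surface here ...
    // Set the PTS for this frame; must be called before eglSwapBuffers().
    EGLExt.eglPresentationTimeANDROID(mEGLDisplay, mEGLSurface, timestampNanos);
    // Submit the frame to the encoder.
    EGL14.eglSwapBuffers(mEGLDisplay, mEGLSurface);
}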

1. Phenomenon

Record a 10-second video, pull it off the device, and play it back in a player. Some devices are fine, but on others the recorded video is only about half as long, which is the "plays too fast" problem commonly mentioned online.

Tip: OnlyStopWatch_x64.exe is a small stopwatch tool that is handy when recording video or livestreaming and you need a visible clock; it makes problems such as dropped frames and playing too fast easy to spot.

2. Analysis

In the Stack Overflow answer quoted above, the author also explained why videos recorded with MediaCodec play too fast:

... There is no timestamp information in the raw H.264 elementary stream. You need to pass the timestamp through the decoder to MediaMuxer or whatever you're using to create your final output. If you don't, the player will just pick a rate, or possibly play the frames as fast as it can.

Note: Stack Overflow thread: stackoverflow.com/questions/1…

In other words, the raw H.264 stream carries no timestamp information; you need to pass the timestamps through the encoder (MediaCodec) to the muxer (MediaMuxer), otherwise the player will just pick a rate, or play the frames as fast as it can.
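
To make the data flow concrete, here is a minimal sketch of a standard encoder drain loop that forwards each frame's presentation timestamp to MediaMuxer; mMediaMuxer and mTrackIndex are illustrative field names, and getOutputBuffer() assumes API 21+:

// Drain the encoder and forward each frame's presentationTimeUs to MediaMuxer.
private void drainEncoder() {
    MediaCodec.BufferInfo bufferInfo = new MediaCodec.BufferInfo();
    while (true) {
        int outputIndex = mMediaCodec.dequeueOutputBuffer(bufferInfo, 10_000);
        if (outputIndex == MediaCodec.INFO_TRY_AGAIN_LATER) {
            break; // no output available yet
        } else if (outputIndex == MediaCodec.INFO_OUTPUT_FORMAT_CHANGED) {
            // The muxer must be started with the encoder's actual output format.
            mTrackIndex = mMediaMuxer.addTrack(mMediaCodec.getOutputFormat());
            mMediaMuxer.start();
        } else if (outputIndex >= 0) {
            ByteBuffer encodedData = mMediaCodec.getOutputBuffer(outputIndex);
            if ((bufferInfo.flags & MediaCodec.BUFFER_FLAG_CODEC_CONFIG) != 0) {
                // Codec config data is already in the format passed to addTrack(); skip it.
                bufferInfo.size = 0;
            }
            if (encodedData != null && bufferInfo.size > 0) {
                // bufferInfo.presentationTimeUs carries the timestamp we fed in;
                // writeSampleData() passes it on to the MP4 file.
                mMediaMuxer.writeSampleData(mTrackIndex, encodedData, bufferInfo);
            }
            mMediaCodec.releaseOutputBuffer(outputIndex, false);
            if ((bufferInfo.flags & MediaCodec.BUFFER_FLAG_END_OF_STREAM) != 0) {
                break;
            }
        }
    }
}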

3. Implementation

If ByteBuffer mode is used, the core code implementation is as follows:

private void feedMediaCodecData(byte[] data) {
    if (!isEncoderStart) return;
    int bufferIndex = -1;
    try {
        bufferIndex = mMediaCodec.dequeueInputBuffer(0);
    } catch (IllegalStateException e) {
        e.printStackTrace();
    }
    if (bufferIndex >= 0) {
        ByteBuffer buffer = null;
        if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.LOLLIPOP) {
            try {
                buffer = mMediaCodec.getInputBuffer(bufferIndex);
            } catch (Exception e) {
                e.printStackTrace();
            }
        } else {
            if (inputBuffers != null) {
                buffer = inputBuffers[bufferIndex];
            }
        }
        if (buffer != null) {
            buffer.clear();
            buffer.put(data);
            buffer.clear();
            // The timestamp must be in microseconds (us), hence nanoTime() / 1000
            mMediaCodec.queueInputBuffer(bufferIndex, 0, data.length,
                    System.nanoTime() / 1000, MediaCodec.BUFFER_FLAG_KEY_FRAME);
        }
    }
}

Note that MediaCodec expects the timestamp in microseconds (us); if you use the wrong unit, you may run into problems such as: stackoverflow.com/questions/2…

Reminder: seconds (s), milliseconds (ms), microseconds (us) and nanoseconds (ns) each differ by a factor of 1000.
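
For example, converting common Android clock sources to the microseconds that MediaCodec expects:

// ns -> us: System.nanoTime() is in nanoseconds.
long ptsFromNanos = System.nanoTime() / 1000L;
// ms -> us: System.currentTimeMillis() is in milliseconds.
long ptsFromMillis = System.currentTimeMillis() * 1000L;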

In Surface mode, the core code is as follows:

// Update the texture image.
// Acquire a new frame of input, and render it to the Surface. If we had a GLSurfaceView
// we could switch EGL contexts and call drawImage() a second time to render it on screen.
// The texture can be shared between contexts by passing the GLSurfaceView's EGLContext
// as eglCreateContext()'s share_context argument.
mSurfaceTexture.updateTexImage();
mSurfaceTexture.getTransformMatrix(mSTMatrix);

// Pass in the timestamp information.
// Set the presentation time stamp from the SurfaceTexture's time stamp.
// This will be used by MediaMuxer to set the PTS in the video.
mInputSurface.setPresentationTime(mSurfaceTexture.getTimestamp());

// Submit it to the encoder. The eglSwapBuffers call will block if the input is full,
// which would be bad if it stayed full until we dequeued an output buffer (which we
// can't do, since we're stuck here). So long as we fully drain the encoder before
// supplying additional input, the system guarantees that we can supply another frame
// without blocking.
mInputSurface.swapBuffers();

The full implementation can be extracted from bigflake's demo (CameraToMpegTest): www.bigflake.com/mediacodec/…

4. Fix

Even after passing the timestamps to MediaMuxer correctly as described above, the problem persisted. After comparing the code in my project with bigflake's CameraToMpegTest, I found that the MediaFormat configuration is also critical: if it is not configured properly, the video duration will still shrink. So I modified the original project's code, setting the frame rate to 30 fps and the key frame interval to 5 seconds, which solved the frame-loss problem. The MediaFormat configuration code is as follows:

protected static final String MIME_TYPE = "video/avc";
protected static final int FRAME_INTERVAL = 5; // insert a key frame every 5 seconds
protected static final int FRAME_RATE = 30;
protected static final float BPP = 0.50f;
protected int mColorFormat = MediaCodecInfo.CodecCapabilities.COLOR_FormatSurface;

private void initMediaCodec() {
    final MediaFormat format = MediaFormat.createVideoFormat(MIME_TYPE, mWidth, mHeight);
    format.setInteger(MediaFormat.KEY_COLOR_FORMAT, mColorFormat);
    format.setInteger(MediaFormat.KEY_BIT_RATE, calcBitRate());
    format.setInteger(MediaFormat.KEY_FRAME_RATE, FRAME_RATE);
    format.setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, FRAME_INTERVAL);

    try {
        mMediaCodec = MediaCodec.createEncoderByType(MIME_TYPE);
    } catch (IOException e) {
        e.printStackTrace();
    }
    mMediaCodec.configure(format, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);
    // get Surface for encoder input
    // this method can only be called between #configure and #start
    // API >= 18
    mSurface = mMediaCodec.createInputSurface();
    mMediaCodec.start();
}

protected int calcBitRate() {
    final int bitrate = (int) (BPP * FRAME_RATE * mWidth * mHeight);
    Log.i(TAG, String.format("bitrate=%5.2f (Mbps)", bitrate / 1024f / 1024f));
    return bitrate;
}

Incidentally, as long as the MediaFormat is configured properly, the video seems unaffected even if the timestamps are not passed along... emmm. Still, since the timestamp code is already written and causes no other problems, I'd rather keep passing the timestamps to be safe.

If this article helped you, please follow my WeChat official account: FSA Full Stack Action; that would be the biggest encouragement for me. Besides Android, the account also covers iOS, Python and other topics, and there may be something there you want to learn about ~