This article first appeared on my CSDN blog: blog.csdn.net/sinat_23092…

Preface

This article walks through an example of how to render a YUV video using OpenGL.

This post covers three main points:

1. The concept of YUV

2. Basic C++ programming with the NDK

3. OpenGL texture rendering

This article focuses on points 1 and 3; the NDK development part is not discussed in detail. Because the OpenGL knowledge system is huge, the article also concentrates on the key points, so readers without a foundation in NDK development and OpenGL basics may find it harder to follow.

Talk about YUV

YUV is a color encoding method, often used in image processing components. "Y" refers to luminance (luma), while "U" and "V" refer to chrominance (chroma). The RGB encoding we are familiar with is oriented toward how the human eye perceives color, whereas YUV exploits the fact that our vision is more sensitive to brightness than to color. YUV therefore allows the chroma bandwidth to be reduced: when encoding photos or video, more Y is kept than U and V, and the UV components of the image can be downsampled so that the data takes up less space than RGB (downsampling simply means sampling at a lower rate than the original signal). For details on downsampling, see this question on Zhihu: what are the differences and connections between oversampling, undersampling, downsampling, and upsampling?

Y, U, and V in the image:

It's a bit abstract, so take a look at this well-known article from Microsoft: Video Rendering with 8-bit YUV Formats

Here we mainly cover two aspects of YUV: the sampling format and the storage format. The sampling format simply describes how the Y, U, and V components are sampled from the original image, for example how many pixels share one U (or V) sample. The storage format simply describes how the sampled data is laid out in memory, i.e. which bytes hold Y and which hold U or V.

YUV sampling format:

The "YUV Sampling" section of that article explains in detail how the various YUV formats are sampled. Here is a translated excerpt:

One of the advantages of YUV is that the chroma channels can be sampled at a lower rate than the Y channel without a significant drop in perceived quality. A notation of the form A:B:C (Y:U:V) is generally used to describe how often U and V are sampled relative to Y. To make this easier to follow, the figures below mark Y samples with an X and UV samples with an O:

4:4:4:

It means the chroma channels are not downsampled; all three YUV channels are fully sampled:

4:2:2:

Represents 2:1 horizontal downsampling with no vertical downsampling. Each scan line contains four Y samples for every two U or V samples. In other words, Y:UV is sampled 2:1 in the horizontal direction and fully in the vertical direction:

4:2:0:

Represents 2:1 horizontal downsampling and 2:1 vertical downsampling, i.e. Y:UV is sampled 2:1 both horizontally and vertically:

Note that 4:2:0 does not mean Y:U:V = 4:2:0. It means that each line scans only one chroma component (U or V), sampled 2:1 against Y. For example, the first line samples Y and U at 2:1, the second line samples Y and V at 2:1, and so on; within each line the ratio of Y to the sampled chroma component is 2:1.

4:1:1:

Represents 4:1 horizontal downsampling with no vertical downsampling. Each scan line contains four Y samples for every U or V sample.

4:1:1 sampling is less common than other formats and is not discussed in detail in this article.

YUV storage format:

YUV storage formats fall into two broad categories: packed and planar. Packed: the Y, U, and V components are stored interleaved in a single array, pixel by pixel, similar to how RGB is usually stored. Planar: the Y, U, and V components are stored in three separate arrays (planes).

Each Y, U, and V sample is stored using 8 bits.
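Given 8 bits per sample, the sampling ratios above translate directly into frame sizes. Here is a small sketch of my own (not from the original article) that computes the size of one frame at 640x360, the resolution of the demo video used later, under each sampling scheme:

#include <cstddef>
#include <cstdio>

int main() {
    const int w = 640, h = 360;             // resolution of the demo video used later in this article

    size_t yuv444 = (size_t) w * h * 3;     // full chroma: 3 bytes per pixel
    size_t yuv422 = (size_t) w * h * 2;     // chroma halved horizontally: 2 bytes per pixel
    size_t yuv420 = (size_t) w * h * 3 / 2; // chroma halved in both directions: 1.5 bytes per pixel

    // Prints: 4:4:4 = 691200 bytes, 4:2:2 = 460800 bytes, 4:2:0 = 345600 bytes
    printf("4:4:4 = %zu bytes, 4:2:2 = %zu bytes, 4:2:0 = %zu bytes\n", yuv444, yuv422, yuv420);
    return 0;
}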

Next, let's go through the common YUV storage layouts in detail:

4:2:2 formats:

There are two main specific formats:

YUY2:

YUY2 is a packed format. The data can be treated as an array of unsigned char: the first byte contains the first Y sample, the second byte the first U (Cb) sample, the third byte the second Y sample, the fourth byte the first V (Cr) sample, and so on, as shown in the figure:

UYVY:

Also a packed format, similar to YUY2, but with the byte order reversed:
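(The layout figures are not reproduced here. For reference, per the Microsoft article, the byte order is Y0 U0 Y1 V0 Y2 U1 Y3 V1 ... for YUY2 and U0 Y0 V0 Y1 U1 Y2 V1 Y3 ... for UYVY.)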

4:2:0 formats

This sampling format also has several storage layouts; here we focus on the following:

YUV420P and YUV420SP both store all of the Y samples first, in their own plane. YUV420P then stores all of the U samples followed by all of the V samples (or V first, depending on the variant), while YUV420SP stores the U and V samples interleaved in UV (or VU) order. See the following figure (from: Audio and video basics – Pixel format YUV):

YUV420P:

(Pay close attention here: the video played in this article is in YUV420P format, and knowing its storage layout is what makes the frame-reading code later easy to follow.)

Because YUV420P is sampled 2:1 horizontally and 2:1 vertically, the number of Y samples equals the video width times its height, while the U and V planes each contain width x height / 4 samples.
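To make the layout concrete, here is a small sketch of my own (assuming one 8-bit YUV420P frame; this is not code from the project) showing the size and starting offset of each plane inside a single frame buffer:

#include <cstddef>

// Plane sizes and byte offsets inside one 8-bit YUV420P frame (illustrative sketch)
struct Yuv420pLayout {
    size_t ySize, uSize, vSize;       // bytes in each plane
    size_t yOffset, uOffset, vOffset; // where each plane starts, relative to the start of the frame
};

Yuv420pLayout yuv420pLayout(int width, int height) {
    Yuv420pLayout l{};
    l.ySize = (size_t) width * height;  // full-resolution luma
    l.uSize = l.ySize / 4;              // chroma planes are (width / 2) x (height / 2)
    l.vSize = l.ySize / 4;
    l.yOffset = 0;                      // the Y plane comes first
    l.uOffset = l.ySize;                // the U plane follows the Y plane
    l.vOffset = l.ySize + l.uSize;      // the V plane follows the U plane
    return l;
}
// For a 640x360 frame: Y = 230400 bytes, U = V = 57600 bytes, 345600 bytes per frame in total.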

YUV420SP

YUV to RGB:

Decoded video is generally in a YUV format, but graphics hardware renders RGB, so YUV needs to be converted to RGB.

Here is a formula for converting YUV to RGB:
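(The image containing the formula did not survive in this version of the post. For reference, one commonly used form, and the one matching the conversion matrix used in the fragment shader later in this article, treats Y, U and V as values in [0, 1] with U and V centered on 0.5:)

R = Y + 1.13983 * (V - 0.5)
G = Y - 0.39465 * (U - 0.5) - 0.5806 * (V - 0.5)
B = Y + 2.03211 * (U - 0.5)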

That wraps up the introduction to YUV; being familiar with it is essential for playing YUV video.

Talk about OpenGL

OpenGL is the most widely adopted 2D/3D graphics API in the industry. It is a cross-platform software interface for driving 2D and 3D graphics hardware. Since it is only an interface specification, the underlying implementation is provided by the hardware vendor.

Welcome to the world of OpenGL (the following descriptions are excerpted from the OpenGL tutorial).

Android uses OpenGL ES, a subset of OpenGL with features tailored for embedded devices.

OpenGL graphics rendering pipeline

The first thing to explain is OpenGL's graphics rendering pipeline: the process by which raw graphics data passes through a series of stages and ends up on the screen. It has two main parts: the first converts your 3D coordinates into 2D coordinates, and the second turns those 2D coordinates into actual colored pixels.

The graphics rendering pipeline takes a set of 3D coordinates and turns them into colored 2D pixel output on your screen. The graphics rendering pipeline can be divided into several phases, each of which will take the output of the previous phase as input.

Most graphics cards today have thousands of small processing cores that run small, independent programs in parallel on the GPU for each stage, so your data moves through the rendering pipeline quickly. These small programs are called shaders, and because they run on the GPU, they free up the CPU.

An overview of each stage of the graphics rendering pipeline:

The first part of the graphics rendering pipeline is the Vertex Shader, which takes a single Vertex as input. The main purpose of vertex shaders is to convert 3D coordinates to another 3D coordinate system (transformation of coordinate systems), and vertex shaders allow us to do some basic processing on vertex properties. Vertex shader code is executed once per vertex.

The Primitive Assembly stage takes all the vertices output by the vertex shader as input (one vertex if GL_POINTS) and assembles them into the shape of the specified primitives. For example, assemble vertices as triangles or rectangles.

The output of the geometry shader is passed to the rasterization stage, which maps the primitives to the corresponding pixels on the final screen and generates fragments for the fragment shader to use (a fragment is all the data OpenGL needs to render one pixel).

The main purpose of the fragment shader is to calculate the final color of a pixel, and this is where all of OpenGL’s advanced effects are generated. The fragment shader is executed once per fragment (pixel).

What we mainly need to write is the code for the vertex shader and the fragment shader. Both are written in GLSL, a C-like language with useful built-in support for vector and matrix operations. See the shader tutorial for syntax details.

OpenGL coordinate system

To write the vertex shader code, we need to know the OpenGL vertex coordinate system:

By convention, OpenGL uses a right-handed coordinate system. Roughly speaking, the positive x-axis points to your right, the positive y-axis points up, and the positive z-axis points backwards. Imagine your screen at the center of the three axes, with the positive z-axis pointing through the screen towards you:

(One thing to note: after the vertex shader runs, OpenGL takes coordinates through a 5-step pipeline: local coordinates, world coordinates, view coordinates, clip coordinates, and finally screen coordinates. We don't need to worry about these for now because this example is 2D.)

Remember that OpenGL only handles 3D coordinates in the range -1.0 to 1.0 on all three axes (x, y, and z). All coordinates within this so-called Normalized Device Coordinates range will eventually appear on the screen; coordinates outside this range will not be displayed.

In the 2D case, since the z-axis can be ignored, the vertex coordinate system generally looks like this:
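(The figure is not reproduced here; in normalized device coordinates the visible 2D area is the square from (-1, -1) at the bottom-left corner to (1, 1) at the top-right corner, with (0, 0) at the center of the screen.)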

OpenGL texture drawing

With vertex shaders and fragment shaders, we can specify the shape, size and color of the object to draw, but what if we want to do something like draw an image on it?

OpenGL provides the concept of a texture, which lets you "paste" an image wherever you want it. (See the texture tutorial for details.)

So how does a texture "stick" to a shape? Essentially, the image is sampled, and the sampled color data is drawn at the corresponding positions on the shape.

To map a texture onto our shape, we need to specify which part of the texture corresponds to each vertex of the shape. Each vertex is therefore associated with a texture coordinate, which indicates which part of the texture image to sample from.

If you want to "paste" an image (texture) onto a rectangle, you need to supply texture coordinates that tell OpenGL which pixel of the image corresponds to each fragment of the rasterized rectangle. Texture coordinates are simply a coordinate system whose origin is at a corner of the texture image.

As shown below:

This coordinate system starts from the upper-left corner of the picture.

So once we provide both vertex coordinates and texture coordinates, OpenGL knows how to sample color data from the texture and draw it at the corresponding positions on the shape defined by the vertex coordinates.

That's all for textures here; there are many more sampling details worth knowing, see the detailed tutorial on textures.

Program Instance analysis

As the saying goes, to do a good job you must first sharpen your tools. With the basics covered, it's time for the most important part: the code. The YUV format used here is YUV420P.

The project is built with CMake, and native-lib is the name of the shared library defined by the project. The other libraries that need to be linked are configured as follows:

find_library( # Sets the name of the path variable.
              log-lib

              # Specifies the name of the NDK library that
              # you want CMake to locate.
              log )

target_link_libraries( # Specifies the target library.
                       native-lib
                       GLESv2
                       EGL
                       android
                       # Links the target library to the log library
                       # included in the NDK.
                       ${log-lib} )

The Java layer first creates a class that extends GLSurfaceView:

public class YuvPlayer extends GLSurfaceView implements Runnable, SurfaceHolder.Callback, GLSurfaceView.Renderer {
	// Place the yuV video file in the sdcard directory
    private final static String PATH = "/sdcard/sintel_640_360.yuv";

    public YuvPlayer(Context context, AttributeSet attrs) {
        super(context, attrs);
        setRenderer(this);
    }

    @Override
    public void surfaceCreated(SurfaceHolder holder) {
        new Thread(this).start();
    }

    @Override
    public void surfaceDestroyed(SurfaceHolder holder) {}

    @Override
    public void surfaceChanged(SurfaceHolder holder, int format, int w, int h) {}

    @Override
    public void run() {
        loadYuv(PATH,getHolder().getSurface());
    }
	// Define a native method to load yuV video files
    public native void loadYuv(String url, Object surface);

    @Override
    public void onSurfaceCreated(GL10 gl, EGLConfig config) {}

    @Override
    public void onSurfaceChanged(GL10 gl, int width, int height) {}

    @Override
    public void onDrawFrame(GL10 gl) {}
}
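One thing the excerpt above does not show: before loadYuv can be called, the native library has to be loaded, typically with a static block such as System.loadLibrary("native-lib") (the library name here is assumed from the CMake configuration above).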

The loadYuv method enters the native layer:

extern "C" JNIEXPORT void JNICALL
Java_com_example_yuvopengldemo_YuvPlayer_loadYuv(JNIEnv *env, jobject thiz, jstring jUrl,
                                                 jobject surface) {
    const char *url = env->GetStringUTFChars(jUrl, 0);
    // Open the YUV video file
    FILE *fp = fopen(url, "rb");
    if (!fp) {
        // LOGD is a logging macro defined in the project
        LOGD("open file %s fail", url);
        return;
    }
    LOGD("open url is %s", url);

We first convert the jstring passed in from the Java layer to a char*, then open the YUV video file.
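One detail worth noting (not shown in the excerpt; the full project may already handle it): a string obtained with GetStringUTFChars should be released once it is no longer needed, for example before the function returns:

    // Sketch: release the UTF-8 buffer obtained from GetStringUTFChars once the path string is no longer used
    env->ReleaseStringUTFChars(jUrl, url);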

Next, initialize EGL:

So here’s a quick explanation of what EGL is.

EGL™ is the interface between Khronos rendering APIs, such as OpenGL ES or OpenVG, and the underlying native platform window system. It handles graphics context management, surface/buffer binding, and rendering synchronization, and enables high-performance, accelerated, mixed-mode 2D and 3D rendering using other Khronos APIs. EGL also provides interoperability between Khronos APIs to enable efficient transfer of data between them, for example between a video subsystem running OpenMAX AL and a GPU running OpenGL ES.

In layman's terms, EGL is the interface between rendering APIs (such as OpenGL, OpenGL ES, OpenVG) and the local window system. EGL can be understood as the bridge between OpenGL ES and the device: it provides OpenGL with a surface to draw on. Because OpenGL is cross-platform, it needs EGL as an intermediate adapter to access the display device on different platforms.

    //1. Get the original window
    ANativeWindow *nwin = ANativeWindow_fromSurface(env, surface);
    // Get the render target of OpenGl ES. Display(EGLDisplay) is an abstraction of the actual Display device.
    EGLDisplay display = eglGetDisplay(EGL_DEFAULT_DISPLAY);
    if (display == EGL_NO_DISPLAY) {
        LOGD("egl display failed");
        return;
    }
    //2. Initialize the connection between EGL and the EGLDisplay. The last two parameters receive the major and minor EGL versions (pass 0/NULL if they are not needed)
    if (EGL_TRUE != eglInitialize(display, 0, 0)) {
        LOGD("eglInitialize failed");
        return;
    }
    // Create the surface used for rendering
    // 2.1 Surface configuration
    EGLConfig eglConfig;
    EGLint configNum;
    EGLint configSpec[] = {
            EGL_RED_SIZE, 8,
            EGL_GREEN_SIZE, 8,
            EGL_BLUE_SIZE, 8,
            EGL_SURFACE_TYPE, EGL_WINDOW_BIT,
            EGL_NONE
    };

    if (EGL_TRUE != eglChooseConfig(display, configSpec, &eglConfig, 1, &configNum)) {
        LOGD("eglChooseConfig failed");
        return;
    }

    // 2.2 Create the surface (associate EGL with the NativeWindow, i.e. connect EGL to the device screen).
    // The last parameter is the attribute list; 0 means no extra attributes.
    // Surface (EGLSurface) is an abstraction of the FrameBuffer memory area used to store the image. This is the surface we render to.
    EGLSurface winSurface = eglCreateWindowSurface(display, eglConfig, nwin, 0);
    if (winSurface == EGL_NO_SURFACE) {
        LOGD("eglCreateWindowSurface failed");
        return;
    }

    //3 Create an association context
    const EGLint ctxAttr[] = {
            EGL_CONTEXT_CLIENT_VERSION, 2, EGL_NONE
    };
    // Create the EGLContext that associates EGL with OpenGL. EGL_NO_CONTEXT means the context is not shared with any other context.
    // Context (EGLContext) stores state information for OpenGL ES drawing. The code above only associated EGL with the device window; this is where OpenGL itself comes in.
    EGLContext context = eglCreateContext(display, eglConfig, EGL_NO_CONTEXT, ctxAttr);
    if (context == EGL_NO_CONTEXT) {
        LOGD("eglCreateContext failed");
        return;
    }
    // Actually associate the EGLContext with OpenGL: bind the display, the context,
    // and the draw and read surfaces to the current thread
    if (EGL_TRUE != eglMakeCurrent(display, winSurface, winSurface, context)) {
        LOGD("eglMakeCurrent failed");
        return;
    }

With EGL created and initialized, next comes the actual OpenGL drawing code.

Take a look at the shader code. Before that, here are some GLSL basics.

Common variable qualifiers:
attribute: per-vertex variables, such as vertex positions, texture coordinates, normal vectors, colors.
uniform: values that are the same for all vertices or all fragments of an object, such as the light position, a transformation matrix, a color.
varying: variables passed from the vertex shader to the fragment shader.

Common types:
vec2: a vector of two floats
vec3: a vector of three floats
vec4: a vector of four floats
sampler1D: a 1D texture sampler
sampler2D: a 2D texture sampler
sampler3D: a 3D texture sampler

Write the vertex shader code first:

// The vertex shader executes once per vertex and can be executed in parallel
#define GET_STR(x) #x
static const char *vertexShader = GET_STR(
        attribute vec4 aPosition;  // Input vertex coordinate; the program feeds data into this attribute
        attribute vec2 aTextCoord; // Input texture coordinate; the program feeds data into this attribute
        varying vec2 vTextCoord;   // Texture coordinate passed on to the fragment shader
        void main() {
            // Flip the texture coordinate vertically (otherwise the image appears upside down on Android;
            // the flip could also be done on the vertex coordinates instead)
            vTextCoord = vec2(aTextCoord.x, 1.0 - aTextCoord.y);
            // Pass the vertex coordinate straight into the rendering pipeline. gl_Position is built into OpenGL
            gl_Position = aPosition;
        }
);

The logic here is simple: there are two attribute variables, one receiving the vertex coordinates and the other receiving the texture coordinates. Standard OpenGL texture coordinates are used here, which are vertically flipped relative to the Android image orientation (as described in the OpenGL texture drawing section above), so the incoming texture coordinate's y value, which lies between 0.0 and 1.0, is flipped and the result is assigned to vTextCoord, which is passed to the fragment shader through the rendering pipeline. Finally, the incoming vertex coordinate is assigned to gl_Position, OpenGL's built-in variable for the vertex position. Once gl_Position is assigned, it is passed down the pipeline to the later stages: the vertices are connected during primitive assembly, and when the primitives are rasterized the area between vertices is broken into many small fragments. Varying values are interpolated during this process, recorded in each fragment, and then passed to the fragment shader.

Then write the fragment shader code:

// The fragment shader is executed once for each rasterized fragment (pixel)
static const char *fragYUV420P = GET_STR(
        precision mediump float;
        // Receive texture coordinate data from vertex shaders and rasterizers
   	    varying vec2 vTextCoord;
        // Input three yuV textures
        uniform sampler2D yTexture;// Y component texture
        uniform sampler2D uTexture;// U component texture
        uniform sampler2D vTexture;// V component texture
        void main() {
        	// Store the sampled YUV data
            vec3 yuv;
            // Store RGB data after yuV data conversion
            vec3 rgb;
            // Sample pixels of vTextCoord corresponding to each component of YUV. The result of texture2D here is a VEC4 variable whose r, G, B, and A values are the values of the sampled component
            // The sampled y, U and V components are stored in the R, G, b (or x, y, z) components of VEC3 YUV respectively
            yuv.r = texture2D(yTexture, vTextCoord).g;
            yuv.g = texture2D(uTexture, vTextCoord).g - 0.5;
            yuv.b = texture2D(vTexture, vTextCoord).g - 0.5;
            // Convert YUV to RGB
            rgb = mat3(
                    1.0, 1.0, 1.0,
                    0.0, -0.39465, 2.03211,
                    1.13983, -0.5806, 0.0
            ) * yuv;
            // gl_FragColor is built into OpenGL; assigning the RGB data to it passes the color to the next stage of the rendering pipeline. gl_FragColor holds the r, g, b and a values of the pixel being rendered.
            gl_FragColor = vec4(rgb, 1.0);
        }
);

Here the YUV components are rendered as three texture layers that are blended together for display. The three sampler2D variables in the code are the textures that need to be passed in from the host program. texture2D then samples the color at the given texture coordinate, and the sampled Y, U and V values are stored in the three components of the vec3 variable yuv. Because OpenGL only renders RGB, the vec3 yuv has to be converted into an RGB vec3 using the conversion matrix. Finally a vec4 is built from the rgb variable and assigned to gl_FragColor as the final color.

With the shader code defined, the rendering logic is next.

The first step is to load and compile the shader defined above and create, link, and activate the shader program:

    GLint vsh = initShader(vertexShader, GL_VERTEX_SHADER);
    GLint fsh = initShader(fragYUV420P, GL_FRAGMENT_SHADER);

    // Create the renderer
    GLint program = glCreateProgram();
    if (program == 0) {
        LOGD("glCreateProgram failed");
        return;
    }

    // Add a shader to the renderer
    glAttachShader(program, vsh);
    glAttachShader(program, fsh);

    // Link program
    glLinkProgram(program);
    GLint status = 0;
    glGetProgramiv(program, GL_LINK_STATUS, &status);
    if (status == 0) {
        LOGD("glLinkProgram failed");
        return;
    }
    LOGD("glLinkProgram success");
    // Activate the renderer
    glUseProgram(program);

The initShader function:

GLint initShader(const char *source, GLint type) {
    // Create a shader object
    GLint sh = glCreateShader(type);
    if (sh == 0) {
        LOGD("glCreateShader %d failed", type);
        return 0;
    }
    // Load the shader source
    glShaderSource(sh,
                   1,        // number of source strings
                   &source,  // the shader source code
                   0);       // string lengths; pass 0 (NULL) when the strings are null-terminated

    // Compile the shader
    glCompileShader(sh);

    GLint status;
    glGetShaderiv(sh, GL_COMPILE_STATUS, &status);
    if (status == 0) {
        LOGD("glCompileShader %d failed", type);
        LOGD("source %s", source);
        return 0;
    }

    LOGD("glCompileShader %d success", type);
    return sh;
}
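When compilation fails, it also helps to print the compiler's error message. The original code only logs that it failed; a minimal sketch using glGetShaderInfoLog (glGetProgramInfoLog is the analogous call for link errors) could replace the failure branch like this:

    // Sketch (not part of the original demo): dump the shader compile log on failure
    if (status == 0) {
        GLchar infoLog[1024] = {0};
        GLsizei len = 0;
        glGetShaderInfoLog(sh, sizeof(infoLog), &len, infoLog);
        LOGD("glCompileShader %d failed: %s", type, infoLog);
        return 0;
    }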

Pass the vertex coordinates array to the vertex shader:

    // 3D vertex data; the four vertices cover the whole screen as a rectangle
    static float ver[] = {
            1.0f, -1.0f, 0.0f,
            -1.0f, -1.0f, 0.0f,
            1.0f, 1.0f, 0.0f,
            -1.0f, 1.0f, 0.0f
    };
	// Get a reference to the aPosition property of the vertex shader
    GLuint apos = static_cast<GLuint>(glGetAttribLocation(program, "aPosition"));
    glEnableVertexAttribArray(apos);
    // Pass the vertex coordinates to the aPosition property of the vertex shader
    // apos: the handle of the aPosition attribute in the vertex shader. 3 means three floats per vertex. GL_FLOAT means the data type is float.
    // GL_FALSE means no normalization. 0 is the stride, used when one array holds several attributes; since there is only one attribute here it is 0. ver is the address of the vertex array being passed in.
    glVertexAttribPointer(apos, 3, GL_FLOAT, GL_FALSE, 0, ver);

(Students used to Java development might find this style of code a little hard to get used to??)

Pass the texture coordinate array to the vertex shader:

// Add texture coordinate data. Here is the entire texture.
    static float fragment[] = {
            1.0f, 0.0f,
            0.0f, 0.0f,
            1.0f, 1.0f,
            0.0f, 1.0f
    };
    // Pass the texture coordinate array to the aTextCoord attribute of the vertex shader
    GLuint aTex = static_cast<GLuint>(glGetAttribLocation(program, "aTextCoord"));
    glEnableVertexAttribArray(aTex);
    // aTex: the handle of the aTextCoord attribute in the vertex shader. 2 means two floats per texture coordinate. GL_FLOAT means the data type is float.
    // GL_FALSE means no normalization. 0 is the stride, used when one array holds several attributes; since there is only one attribute here it is 0. fragment is the address of the texture coordinate array being passed in.
    glVertexAttribPointer(aTex, 2, GL_FLOAT, GL_FALSE, 0, fragment);

If the section on passing the vertex coordinate array to the vertex shader made sense, this part is not difficult.

Next comes the handling of the texture object:

1. A texture object is a piece of GPU memory created to hold a texture. In practice it is referred to by the texture ID returned when it is created.

2. A texture target can simply be understood as the texture type, for example whether a 2D or a 3D texture is being rendered.

3. A texture unit is a container for texture operations: GL_TEXTURE0, GL_TEXTURE1, GL_TEXTURE2, and so on. The number of texture units is limited (up to 16), so at most 16 textures can be operated on at the same time. Texture units can loosely be thought of as texture layers.

Create a texture object:

    // Tell each sampler uniform which texture unit to sample from (0, 1 and 2 correspond to GL_TEXTURE0/1/2)
    glUniform1i(glGetUniformLocation(program, "yTexture"), 0);
    glUniform1i(glGetUniformLocation(program, "uTexture"), 1);
    glUniform1i(glGetUniformLocation(program, "vTexture"), 2);
    // Texture IDs
    GLuint texts[3] = {0};
    // Create 3 texture objects and get their texture IDs. Subsequent operations on a texture use its ID.
    glGenTextures(3, texts);

Bind the texture object to the corresponding texture target:

    // YUV video width and height
    int width = 640;
    int height = 360;
    // glBindTexture binds the texture target to the texture object whose ID is texts[0];
    // everything done to the target from now on affects that texture object
    glBindTexture(GL_TEXTURE_2D, texts[0]);
    // Minification filter (see [texture](https://learnopengl-cn.github.io/01%20Getting%20started/06%20Textures/))
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
    // Magnification filter
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
    // Set the format and size of the texture
    // (this allocates storage for the currently bound texture object)
    glTexImage2D(GL_TEXTURE_2D,
                 0,                // mipmap level
                 GL_LUMINANCE,     // GPU internal format: luminance (grayscale), because each plane here carries only one YUV component
                 width,            // width of the texture; ideally a power of two
                 height,           // height of the texture; ideally a power of two
                 0,                // texture border
                 GL_LUMINANCE,     // pixel format of the data: luminance (grayscale)
                 GL_UNSIGNED_BYTE, // data type of each sample
                 NULL              // texture data (not uploaded yet; it is uploaded for every frame later)
    );

Note that the width and height of the video must be set correctly, otherwise the rendered data will be all wrong.

The third parameter to glTexImage2D tells OpenGL how to store and use the texture data internally (how many color components a pixel contains, whether to compress or not). Common constants are as follows:
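(The table of constants is not reproduced here. For reference, the internal formats accepted by glTexImage2D in OpenGL ES 2.0 are GL_ALPHA, GL_LUMINANCE, GL_LUMINANCE_ALPHA, GL_RGB and GL_RGBA; GL_LUMINANCE is used in this demo because each plane carries a single 8-bit channel.)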

The code is the same for all three YUV components; only the width and height passed in differ. For U and V, the width and height are each half of the video's width and height:

// Set the format and size of the texture
    glTexImage2D(GL_TEXTURE_2D,
                 0,                // mipmap level
                 GL_LUMINANCE,     // GPU internal format: luminance (grayscale)
                 width / 2,
                 height / 2,       // the U (or V) plane holds a quarter as many samples as the Y plane
                 0,                // texture border
                 GL_LUMINANCE,     // pixel format of the data: luminance (grayscale)
                 GL_UNSIGNED_BYTE, // data type of each sample
                 NULL              // texture data (not uploaded yet)
    );

Why width / 2 and height / 2? Remember the YUV420P sampling and storage format described above: YUV420P is sampled 2:1 horizontally and 2:1 vertically, so there are width x height Y samples, while the U and V planes are each (width / 2) x (height / 2).

Read YUV data from a video file into memory:

	unsigned char *buf[3] = {0};
    buf[0] = new unsigned char[width * height];//y
    buf[1] = new unsigned char[width * height / 4];//u
    buf[2] = new unsigned char[width * height / 4];//v
	// Loop out each frame
    for (int i = 0; i < 10000; ++i) {
        // Read a frame of yuv420p
        if (feof(fp) == 0) {
        	// Read y data
            fread(buf[0], 1, width * height, fp);
            // Read u data
            fread(buf[1], 1, width * height / 4, fp);
            // Read v data
            fread(buf[2], 1, width * height / 4, fp);
        }

In YUV420P, each frame stores width x height bytes of Y, then width x height / 4 bytes of U, then width x height / 4 bytes of V, so inside the for loop one frame is read into the in-memory buffers in exactly that order and in those amounts.

// Activate the first texture and bind it to the created texture
      
        glActiveTexture(GL_TEXTURE0);
        // bind the texture corresponding to y
        glBindTexture(GL_TEXTURE_2D, texts[0]);
        // Update the texture contents; this performs much better than calling glTexImage2D again
        glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, // mipmap level and x/y offsets into the texture
                        width, height,          // width and height of the region being uploaded
                        GL_LUMINANCE, GL_UNSIGNED_BYTE,
                        buf[0]);

U and V are handled the same way, except that width / 2 and height / 2 are used.
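For completeness, here is a sketch of what the U and V uploads look like; this part is not shown in the original excerpt, and it assumes texts[1]/texts[2] and buf[1]/buf[2] from the setup above:

        // Sketch: upload the U plane to texture unit 1 and the V plane to texture unit 2
        glActiveTexture(GL_TEXTURE1);
        glBindTexture(GL_TEXTURE_2D, texts[1]);
        glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0,
                        width / 2, height / 2,
                        GL_LUMINANCE, GL_UNSIGNED_BYTE,
                        buf[1]);

        glActiveTexture(GL_TEXTURE2);
        glBindTexture(GL_TEXTURE_2D, texts[2]);
        glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0,
                        width / 2, height / 2,
                        GL_LUMINANCE, GL_UNSIGNED_BYTE,
                        buf[2]);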

Finally, the screen is displayed:

glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);
// The window shows that double buffers are swapped
eglSwapBuffers(display, winSurface);

This loop renders each frame and plays the YUV video:

Here I used an ffmpeg command to convert 10 seconds of video from My Neighbor Totoro into YUV. The screen-recording GIF somehow failed to upload, so there is only a screenshot here = =.

Although the video is only 10 seconds long, it already exceeds GitHub's upload size limit, so it is not included in the repository. If you need one, you can use ffmpeg to convert any supported video file to YUV420P format and run it.

I haven't been working in the audio and video field for long, so if there are any mistakes, please point them out ~

Project address: YuvVideoPlayerDemo

Also introducing a newly released open-source audio and video playback and recording project.

References:

learnopengl

Video Rendering with 8-Bit YUV Formats

Audio and video basics – Pixel format YUV

OpenGL SuperBible, 5th Edition

Android OpenGL ES Video application development tutorial directory

Android custom camera development (3) — Learn about EGL