[Statement]

First of all, this series of articles is based on my own understanding and practice. There may be mistakes; corrections are welcome.

Secondly, this is an introductory series that covers only the essentials; there are plenty of blog posts online for more in-depth knowledge. Finally, while writing these articles I refer to articles shared by others and list them at the end, with thanks to those authors for sharing.

Writing is not easy; please credit the source when reposting!

Tutorial code: [GitHub portal]

Contents

First, Android audio and video hard decoding:
  • 1. Basic knowledge of audio and video
  • 2. Audio and video hard decoding process: packaging a basic decoding framework
  • 3. Audio and video playback: audio and video synchronization
  • 4. Audio and video demuxing and remuxing: generating an MP4
Second, use OpenGL to render video frames:
  • 1. Preliminary understanding of OpenGL ES
  • 2. Use OpenGL to render video images
  • 3. OpenGL rendering of multiple videos, picture-in-picture
  • 4. Learn more about OpenGL's EGL
  • 5. OpenGL FBO data buffer
  • 6. Android audio and video hard encoding: generating an MP4
Third, Android FFmpeg audio and video decoding:
  • 1. FFmpeg SO library compilation
  • 2. Android introduces FFmpeg
  • 3. Android FFmpeg video decoding and playback
  • 4. Android FFmpeg + OpenSL ES audio decoding and playback
  • 5. Android FFmpeg + OpenGL ES video playback
  • 6. Android FFmpeg simple MP4 synthesis: video demuxing and remuxing
  • 7. Android FFmpeg video encoding

What you can learn from this article:

The previous article introduced the simple application of OpenGL ES on Android. Building on that basic knowledge, this article uses OpenGL to render the video picture and explains the knowledge related to picture projection, in order to solve the problem of the picture being stretched and deformed.

First, render the video picture

The first article, [Basic knowledge of audio and video], explained that a video is actually composed of one picture after another, and the previous article, [Preliminary understanding of OpenGL ES], explained how to render a picture with OpenGL. So you can guess that rendering a video should be much the same as rendering a picture. Without further ado, let's take a look.

1. Define the video renderer

In the previous article, we defined the renderer interface:

interface IDrawer {
    fun draw()
    fun setTextureID(id: Int)
    fun release()
}

To implement the above interface, define a video renderer

class VideoDrawer : IDrawer {

    // Vertex coordinates
    private val mVertexCoors = floatArrayOf(
        -1f, -1f, 1f, -1f,
        -1f, 1f, 1f, 1f
    )

    // Texture coordinates
    private val mTextureCoors = floatArrayOf(
        0f, 1f, 1f, 1f, 0f, 0f, 1f, 0f
    )

    private var mTextureId: Int = -1

    // OpenGL program ID
    private var mProgram: Int = -1
    // Vertex coordinates receiver
    private var mVertexPosHandler: Int = -1
    // Texture coordinates receiver
    private var mTexturePosHandler: Int = -1
    // Texture receiver
    private var mTextureHandler: Int = -1

    private lateinit var mVertexBuffer: FloatBuffer
    private lateinit var mTextureBuffer: FloatBuffer

    init {
        // step 1: Initialize vertex coordinates
        initPos()
    }

    private fun initPos() {
        val bb = ByteBuffer.allocateDirect(mVertexCoors.size * 4)
        bb.order(ByteOrder.nativeOrder())
        // Convert coordinate data to FloatBuffer, which is passed to OpenGL ES program
        mVertexBuffer = bb.asFloatBuffer()
        mVertexBuffer.put(mVertexCoors)
        mVertexBuffer.position(0)

        val cc = ByteBuffer.allocateDirect(mTextureCoors.size * 4)
        cc.order(ByteOrder.nativeOrder())
        mTextureBuffer = cc.asFloatBuffer()
        mTextureBuffer.put(mTextureCoors)
        mTextureBuffer.position(0)
    }

    override fun setTextureID(id: Int) {
        mTextureId = id
    }

    override fun draw() {
        if (mTextureId != -1) {
            // step 2: Create, compile, and start the OpenGL shader
            createGLPrg()
            // step 3: Activate and bind the texture unit
            activateTexture()
            // [Step 4: Bind image to texture unit]
            updateTexture()
            // [Step 5: Start rendering]
            doDraw()
        }
    }

    private fun createGLPrg() {
        if (mProgram == -1) {
            val vertexShader = loadShader(GLES20.GL_VERTEX_SHADER, getVertexShader())
            val fragmentShader = loadShader(GLES20.GL_FRAGMENT_SHADER, getFragmentShader())

            // Create OpenGL ES program, note: need to create in OpenGL render thread, otherwise cannot render
            mProgram = GLES20.glCreateProgram()
            // Add a vertex shader to the program
            GLES20.glAttachShader(mProgram, vertexShader)
            // Add a fragment shader to the program
            GLES20.glAttachShader(mProgram, fragmentShader)
            // Connect to the shader program
            GLES20.glLinkProgram(mProgram)

            mVertexPosHandler = GLES20.glGetAttribLocation(mProgram, "aPosition")
            mTextureHandler = GLES20.glGetUniformLocation(mProgram, "uTexture")
            mTexturePosHandler = GLES20.glGetAttribLocation(mProgram, "aCoordinate")
        }
        // Use the OpenGL program
        GLES20.glUseProgram(mProgram)
    }

    private fun activateTexture() {
        // Activate the specified texture unit
        GLES20.glActiveTexture(GLES20.GL_TEXTURE0)
        // Bind the texture ID to the texture unit
        GLES20.glBindTexture(GLES11Ext.GL_TEXTURE_EXTERNAL_OES, mTextureId)
        // Pass the active texture unit to the shader
        GLES20.glUniform1i(mTextureHandler, 0)
        // Set edge transition parameters
        GLES20.glTexParameterf(GLES11Ext.GL_TEXTURE_EXTERNAL_OES, GLES20.GL_TEXTURE_MIN_FILTER, GLES20.GL_LINEAR.toFloat())
        GLES20.glTexParameterf(GLES11Ext.GL_TEXTURE_EXTERNAL_OES, GLES20.GL_TEXTURE_MAG_FILTER, GLES20.GL_LINEAR.toFloat())
        GLES20.glTexParameteri(GLES11Ext.GL_TEXTURE_EXTERNAL_OES, GLES20.GL_TEXTURE_WRAP_S, GLES20.GL_CLAMP_TO_EDGE)
        GLES20.glTexParameteri(GLES11Ext.GL_TEXTURE_EXTERNAL_OES, GLES20.GL_TEXTURE_WRAP_T, GLES20.GL_CLAMP_TO_EDGE)
    }

    private fun updateTexture() {}

    private fun doDraw() {
        // Enable vertex handles
        GLES20.glEnableVertexAttribArray(mVertexPosHandler)
        GLES20.glEnableVertexAttribArray(mTexturePosHandler)
        // Set the shader parameter. The second parameter represents the amount of data a vertex contains, which is xy, so 2
        GLES20.glVertexAttribPointer(mVertexPosHandler, 2, GLES20.GL_FLOAT, false, 0, mVertexBuffer)
        GLES20.glVertexAttribPointer(mTexturePosHandler, 2, GLES20.GL_FLOAT, false, 0, mTextureBuffer)
        // Start drawing
        GLES20.glDrawArrays(GLES20.GL_TRIANGLE_STRIP, 0, 4)
    }

    override fun release() {
        GLES20.glDisableVertexAttribArray(mVertexPosHandler)
        GLES20.glDisableVertexAttribArray(mTexturePosHandler)
        GLES20.glBindTexture(GLES20.GL_TEXTURE_2D, 0)
        GLES20.glDeleteTextures(1, intArrayOf(mTextureId), 0)
        GLES20.glDeleteProgram(mProgram)
    }

    private fun getVertexShader(): String {
        return "attribute vec4 aPosition;" +
                "attribute vec2 aCoordinate;" +
                "varying vec2 vCoordinate;" +
                "void main() {" +
                " gl_Position = aPosition;" +
                " vCoordinate = aCoordinate;" +
                "}"
    }

    private fun getFragmentShader(): String {
        // Be sure to add a newline "\n", otherwise it will be mixed with the precision on the next line and cause compilation errors
        return "#extension GL_OES_EGL_image_external : require\n" +
                "precision mediump float;" +
                "varying vec2 vCoordinate;" +
                "uniform samplerExternalOES uTexture;" +
                "void main() {" +
                " gl_FragColor=texture2D(uTexture, vCoordinate);" +
                "}"
    }

    private fun loadShader(type: Int, shaderCode: String): Int {
        // Create a vertex shader or fragment shader based on type
        val shader = GLES20.glCreateShader(type)
        // Add the resource to the shader and compile it
        GLES20.glShaderSource(shader, shaderCode)
        GLES20.glCompileShader(shader)

        return shader
    }
}

At first glance, it looks exactly the same as rendering an image. Don't worry, let me go through it bit by bit. First, look at the draw flow:


init {
    // step 1: Initialize vertex coordinates
    initPos()
}

override fun draw() {
    if (mTextureId != -1) {
        // step 2: Create, compile, and start the OpenGL shader
        createGLPrg()
        // step 3: Activate and bind the texture unit
        activateTexture()
        // [Step 4: Bind image to texture unit]
        updateTexture()
        // [Step 5: Start rendering]
        doDraw()
    }
}
  • What is the same:
  1. Vertex coordinates and texture coordinates are set
  2. The OpenGL program is created and the GLSL shaders are loaded (the process is the same, but the details differ)
  3. The draw flow
  • What is different:
  1. The fragment shader
// Video fragment shader

private fun getFragmentShader(): String {
    // Be sure to add a newline "\n", otherwise it will be mixed with the precision on the next line and cause compilation errors
    return "#extension GL_OES_EGL_image_external : require\n" +
            "precision mediump float;" +
            "varying vec2 vCoordinate;" +
            "uniform samplerExternalOES uTexture;" +
            "void main() {" +
            " gl_FragColor=texture2D(uTexture, vCoordinate);" +
            "}"
}

Compare it with the image's fragment shader:

private fun getFragmentShader(): String {
    return "precision mediump float;" +
            "uniform sampler2D uTexture;" +
            "varying vec2 vCoordinate;" +
            "void main() {" +
            " vec4 color = texture2D(uTexture, vCoordinate);" +
            " gl_FragColor = color;" +
            "}"
}

You can see that the first line was added:

#extension GL_OES_EGL_image_external : require

The video is rendered using Android's extension texture.

What is an extension texture?

We already know that the color space of a video is YUV, while the picture displayed on screen is RGB, so to render the video on screen, YUV must be converted to RGB. The extension texture takes care of this conversion.

The sampler declared in the fourth line has also been changed to the extension texture sampler.

uniform samplerExternalOES uTexture;
  2. Activate the texture unit
private fun activateTexture() {
    // Activate the specified texture unit
    GLES20.glActiveTexture(GLES20.GL_TEXTURE0)
    // Bind the texture ID to the texture unit
    GLES20.glBindTexture(GLES11Ext.GL_TEXTURE_EXTERNAL_OES, mTextureId)
    // Pass the active texture unit to the shader
    GLES20.glUniform1i(mTextureHandler, 0)
    // Set edge transition parameters
    GLES20.glTexParameterf(GLES11Ext.GL_TEXTURE_EXTERNAL_OES, GLES20.GL_TEXTURE_MIN_FILTER, GLES20.GL_LINEAR.toFloat())
    GLES20.glTexParameterf(GLES11Ext.GL_TEXTURE_EXTERNAL_OES, GLES20.GL_TEXTURE_MAG_FILTER, GLES20.GL_LINEAR.toFloat())
    GLES20.glTexParameteri(GLES11Ext.GL_TEXTURE_EXTERNAL_OES, GLES20.GL_TEXTURE_WRAP_S, GLES20.GL_CLAMP_TO_EDGE)
    GLES20.glTexParameteri(GLES11Ext.GL_TEXTURE_EXTERNAL_OES, GLES20.GL_TEXTURE_WRAP_T, GLES20.GL_CLAMP_TO_EDGE)
}

Here, too, the normal texture target is replaced with the extension texture target GLES11Ext.GL_TEXTURE_EXTERNAL_OES.

  3. Update the texture unit
private fun updateTexture() {

}

When rendering an image, this is where the image data (such as a Bitmap) gets bound to the texture unit.

But why is it empty now? Because with just the process above, the video cannot be displayed.

Video rendering requires a SurfaceTexture to update the picture. Let's see how it is created.

class VideoDrawer: IDrawer {
    // ...
    
    private var mSurfaceTexture: SurfaceTexture? = null
    
    
    override fun setTextureID(id: Int) {
        mTextureId = id
        mSurfaceTexture = SurfaceTexture(id)
    }
    // ...
}

A SurfaceTexture was added to the VideoDrawer, and in setTextureID, the SurfaceTexture was initialized with the texture ID.

In the updateTexture method

private fun updateTexture() {
    mSurfaceTexture?.updateTexImage()
}

At this point, you might think it is finally done, but there is still one step left.

Remember that in the second hard-decoding article, [Audio and video hard decoding process: packaging basic decoding framework], it was mentioned that MediaCodec can be given a Surface as its rendering surface, and that Surface needs a SurfaceTexture.

Therefore, we need to expose this SurfaceTexture for external use. First, add a method to IDrawer:


interface IDrawer {
    fun draw()
    fun setTextureID(id: Int)
    fun release()
    // Add a new method to provide the SurfaceTexture
    fun getSurfaceTexture(cb: (st: SurfaceTexture) -> Unit) {}
}

The SurfaceTexture is passed back by a higher-order function argument. Details are as follows:

class VideoDrawer: IDrawer {

    // ...
    
    private var mSftCb: ((SurfaceTexture) -> Unit)? = null
    
    override fun setTextureID(id: Int) {
        mTextureId = id
        mSurfaceTexture = SurfaceTexture(id)
        mSftCb?.invoke(mSurfaceTexture!!)
    }

    override fun getSurfaceTexture(cb: (st: SurfaceTexture) -> Unit) {
        mSftCb = cb
    }
    
    // ...
}
2. Use OpenGL to play videos

Create a new page


      
<android.support.constraint.ConstraintLayout
        xmlns:android="http://schemas.android.com/apk/res/android"
        android:layout_width="match_parent"
        android:layout_height="match_parent">
    <android.opengl.GLSurfaceView
            android:id="@+id/gl_surface"
            android:layout_width="match_parent"
            android:layout_height="match_parent"/>
</android.support.constraint.ConstraintLayout>
class OpenGLPlayerActivity: AppCompatActivity() {
    val path = Environment.getExternalStorageDirectory().absolutePath + "/mvtest_2.mp4"
    lateinit var drawer: IDrawer

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        setContentView(R.layout.activity_opengl_player)
        initRender()
    }

    private fun initRender() {
        drawer = VideoDrawer()
        drawer.getSurfaceTexture {
            // Initialize a Surface with the SurfaceTexture and pass it to MediaCodec for use
            initPlayer(Surface(it))
        }
        gl_surface.setEGLContextClientVersion(2)
        gl_surface.setRenderer(SimpleRender(drawer))
    }

    private fun initPlayer(sf: Surface) {
        val threadPool = Executors.newFixedThreadPool(10)

        val videoDecoder = VideoDecoder(path, null, sf)
        threadPool.execute(videoDecoder)

        val audioDecoder = AudioDecoder(path)
        threadPool.execute(audioDecoder)

        videoDecoder.goOn()
        audioDecoder.goOn()
    }
}

This is basically the same as playing with a SurfaceView, except for the extra initialization of OpenGL and the Surface. It is very simple; see the code above, no further explanation needed.
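For reference, SimpleRender here is the renderer from the previous article; roughly speaking, it only wires the GLSurfaceView callbacks to the IDrawer. Below is a minimal sketch of what it might look like (the texture ID generation in onSurfaceCreated is my own illustration; details may differ from the actual tutorial project):

class SimpleRender(private val mDrawer: IDrawer) : GLSurfaceView.Renderer {

    override fun onSurfaceCreated(gl: GL10?, config: EGLConfig?) {
        GLES20.glClearColor(0f, 0f, 0f, 0f)
        // Generate a texture ID and hand it to the drawer,
        // which wraps it in a SurfaceTexture (see setTextureID above)
        val ids = IntArray(1)
        GLES20.glGenTextures(1, ids, 0)
        mDrawer.setTextureID(ids[0])
    }

    override fun onSurfaceChanged(gl: GL10?, width: Int, height: Int) {
        GLES20.glViewport(0, 0, width, height)
    }

    override fun onDrawFrame(gl: GL10?) {
        GLES20.glClear(GLES20.GL_COLOR_BUFFER_BIT or GLES20.GL_DEPTH_BUFFER_BIT)
        mDrawer.draw()
    }
}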

If you play the video with the code above, you will notice that the picture is stretched to the size of the GLSurfaceView window, i.e. the full screen. Now let's see how to correct the picture so that it keeps its real proportions.

Two, picture proportion correction

Projection

OpenGL world coordinates form a normalized coordinate system: the x, y and z coordinates all lie in the range (-1 ~ 1), and by default the four vertex coordinates correspond to the four corners of the viewport. The picture is therefore spread over the whole screen, so without any coordinate transformation the picture will generally be deformed.

OpenGL provides two ways to adjust the picture scale, namely perspective projection and orthogonal projection.

What does projection do?

  1. The projection defines the scope of the clipping space, that is, the visual space of the object
  2. Project objects in the clipped space onto the screen

It’s not easy to talk about the projection of OpenGL, and it involves the definition of all kinds of Spaces in OpenGL, so here’s a quick list:

  • Local space: the space relative to the object itself, with the origin at the object's center

  • World space: Coordinates of the OpenGL world

  • Observation space (view space): the space of the observer (camera), equivalent to what the eye sees in the real world; the same object looks different from different observation positions

  • Clipping space: The visible space within which objects are displayed on the screen

  • Screen space: Screen coordinate space, that is, mobile phone screen space

  • 1) Perspective projection

As you can see in the figure above, the principle of perspective projection is actually the same as how the human eye forms an image. Looking forward from the camera, there is a viewing volume similar to the eye's field of view. What a person sees is projected onto the retina, while what the camera sees is projected onto the near plane (the plane relatively close to the camera).

  • Camera position and orientation

First of all, cameras are not fixed and can be moved according to their own needs, so it is necessary to set the position and orientation of the camera, which is related to how to observe objects.

Remember, the camera is still in world coordinate space. So the camera position is set relative to the world coordinate origin.

  • Camera position

The OpenGL world coordinate system is a right-handed coordinate system: positive X to the right, positive Y up, and positive Z pointing out of the screen toward you.

So the camera coordinate could be (0,0,5), which is a point on the Z axis.

  • Camera orientation

After setting the position of the camera, you also need to set its orientation. The three components upX, upY and upZ form a direction vector starting at the camera's position, and this vector is the direction pointing straight up from the camera; it determines the camera's orientation.

If the camera is analogous to the head of an adult, the direction of the composite vector is the orientation directly above the head.

For example, you can set the camera's up direction to (0, 1, 0). With the camera at (0, 0, 5) and the up direction along the Y axis, the camera looks straight at the plane formed by X and Y, which is the front of the picture.

If the camera's up direction is set to (0, -1, 0), it is like a person turning their head upside down, so the picture is seen flipped relative to the one above.

  • Near plane and far plane

Looking back at the perspective projection above, there are two planes on the right side of the camera, the one near the camera is the near plane, and the far side is the far plane.

  • Cut out space

You can see that the four edges of the far plane and the near plane converge to the position of the camera. The space bounded by these lines and two planes is the clipped space, that is, the visible space.

For an object inside this space, lines drawn from its surface to the camera position pass through the near plane; the points they leave on the near plane form an image, which is the projection of the object onto the near plane, i.e. the picture seen on the phone screen.

And the further away you are from the camera, the smaller the projection, just like the human eye.

Because of the same imaging principle as the human eye, perspective projection is often used in 3D rendering.

But perspective projection is also more complex, and I am not very familiar with it, so to avoid passing on mistakes I won't apply it concretely here. You can read other articles to learn more, such as "Projection matrix and viewport transformation matrix" and "Projection matrix".
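For reference only (not applied further in this series), Android's android.opengl.Matrix class does provide helpers for building a perspective projection matrix. A minimal sketch, with parameter values chosen purely for illustration:

val prjMatrix = FloatArray(16)
// Either describe the near-plane rectangle directly (near must be > 0)...
Matrix.frustumM(prjMatrix, 0, -1f, 1f, -1f, 1f, 1f, 10f)
// ...or (API 14+) give a vertical field of view in degrees plus an aspect ratio
Matrix.perspectiveM(prjMatrix, 0, 45f, 1080f / 1920f, 1f, 10f)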

The following is a common projection mode used in 2D rendering: orthogonal projection.

  • 2) Orthogonal projection

Like perspective projection, orthogonal projection also has a camera, a near plane and a far plane, but the difference is that the lines of sight do not converge to a point; they are parallel. So the visible volume between the near and far planes is a cuboid.

In other words, the view of orthogonal projection no longer behaves like the human eye. All objects in the clipping space, near or far, have the same-sized projection on the near plane as long as they are the same size; there is no longer a "closer is bigger, farther is smaller" effect.

This effect is ideal for rendering 2D images.

Android provides the Matrix.orthoM function to generate the orthogonal projection matrix:

/**
 * Computes an orthographic projection matrix.
 *
 * @param m       returns the resulting orthographic projection matrix
 * @param mOffset offset into m where the matrix starts (usually 0)
 * @param left    distance to the left plane
 * @param right   distance to the right plane
 * @param bottom  distance to the bottom plane
 * @param top     distance to the top plane
 * @param near    distance to the near plane
 * @param far     distance to the far plane
 */
public static void orthoM(float[] m, int mOffset,
    float left, float right, float bottom, float top,
    float near, float far)

In addition to the distances of the near and far planes, you also need to set the top, bottom, left and right distances of the near plane. These four parameters correspond to the perpendicular distances from the four edges of the near-plane rectangle to the origin, and they are the key to correcting the picture.

  • Matrix transformation

In the world of image processing, matrix transformation is the most commonly used way to transform an image, and it requires a little knowledge of linear algebra.

Let’s start with a simple matrix multiplication:

Matrix multiplication is not like ordinary multiplication of numbers. Each row of the first matrix is multiplied element by element with each column of the second matrix, and the products are summed to give the corresponding entry of the result, e.g. for the first row and first column:

1x1 + 1x0 + 1x0 = 1

And so on.

Identity matrix

You can see that no matter what matrix you multiply by the one on the right, the result is the same as the original matrix, just like multiplying a number by 1 makes no difference. That is why the matrix on the right is called the identity matrix.

Let’s do another matrix multiplication:

If you change the first two 1s of the right-hand matrix to 0.5, the result is exactly the original matrix with those components halved.

Now imagine that the three numbers of the left-hand matrix were the x, y and z of a coordinate point. You should already be able to guess how the proportions can be corrected.

Since the video picture is stretched, the most direct method is to scale the stretched direction back, and matrix multiplication can do exactly this scaling.
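To make this concrete, here is a tiny sketch of my own (using the android.opengl.Matrix utilities, not code from the tutorial project) showing a point being scaled by multiplying it with a scaling matrix:

// Build a scaling matrix that halves x and y (column-major, as OpenGL expects)
val scale = FloatArray(16)
Matrix.setIdentityM(scale, 0)
Matrix.scaleM(scale, 0, 0.5f, 0.5f, 1f)

// Multiply a vertex (x, y, z, w) by the matrix
val point = floatArrayOf(1f, 1f, 0f, 1f)
val result = FloatArray(4)
Matrix.multiplyMV(result, 0, scale, 0, point, 0)
// result is now (0.5, 0.5, 0.0, 1.0): the point has been scaled by half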

Let’s go back to the orthogonal projection method:

public static void orthoM(float[] m, int mOffset,
    float left, float right, float bottom, float top,
    float near, float far)

Here is the matrix generated by this method:

You can see that it actually generates a scaling matrix; the last two columns can be ignored here. Focus on the first two entries, 2/(right - left) and 2/(top - bottom), which are multiplied with x and y respectively to scale them.

  • An example

Suppose the video is 1000x500 (width x height) and the GLSurfaceView is 1080x1920.

This is a landscape video. If you fit by height, stretching 500 up to 1920, then to keep the aspect ratio the width would have to be 1000 x (1920/500) = 3840, which exceeds the window width of 1080. So you can only fit by width and scale the height, so that the video does not end up wider than the GLSurfaceView.

After correct scaling, the picture size is 1080x540 (540 = 500 x 1080 / 1000).

So by what factor was the height scaled? Is it 540/500 = 1.08? Wrong!

Without any scaling, the picture is stretched over the whole window, so its height would be 1920. The correct scaling factor is therefore 1920/540 = 3.555556 (the division is not exact).

Let’s see how to set left, right, top, and bottom.

According to the analysis above, the width of the video picture is stretched straight to the full window width, so the defaults are kept: left = -1, right = 1 (remember that the origin of OpenGL world coordinates is at the center of the screen?).

At this moment

right - left = 2, so the first element of the scaling matrix is: 2 / (right - left) = 1

So the width is not scaled.

To shrink the height by a factor of 3.555556, we need:

2 / (top - bottom) = 2 / (2 x 3.555556) = 1 / 3.555556

Therefore, the parameters of orthogonal projection are:

private var mPrjMatrix = FloatArray(16)

Matrix.orthoM(mPrjMatrix, 0, -1f, 1f, 3.555556f, -3.555556f, -1f, 3f)

//public static void orthoM(float[] m, int mOffset,
// float left, float right, float bottom, float top,
// float near, float far)

Now that we know how it works, let's derive the scaling factor from the aspect ratios of the GLSurfaceView and of the original video.

Again using the above example, the scaling factor 1920/540=3.555556 is equivalent to

1920 / (500 * 1080 / 1000)
--> GL_Height / (Video_Height * GL_Width / Video_Width)
--> (GL_Height / GL_Width) * (Video_Width / Video_Height)
--> (Video_Width / Video_Height) / (GL_Width / GL_Height)
--> Video_Ratio / GL_Ratio

As you can see, we don't need to work out the actual value by hand; the scaling ratio can be computed automatically in code from the viewport size and the original width and height of the video frame.

Of course, you need to judge the specific situation, and there are four cases:

1. Viewport width > height, and video aspect ratio > viewport aspect ratio: scale the height (Video_Ratio / GL_Ratio)
2. Viewport width > height, and video aspect ratio < viewport aspect ratio: scale the width (GL_Ratio / Video_Ratio)
3. Viewport width < height, and video aspect ratio > viewport aspect ratio: scale the height (Video_Ratio / GL_Ratio)
4. Viewport width < height, and video aspect ratio < viewport aspect ratio: scale the width (GL_Ratio / Video_Ratio)

The above example falls into the third category.

The derivations for the other cases are not given here; if you are interested, work them out yourself to deepen your understanding.

Now let’s see how this is implemented in code.

Correcting the picture proportion

Two new methods are added to IDrawer, one to set the original width and height of the video and one to set the width and height of the OpenGL window:

interface IDrawer {

    // Set the original width and height of the video
    fun setVideoSize(videoW: Int, videoH: Int)
    // Set the OpenGL window width and height
    fun setWorldSize(worldW: Int, worldH: Int)
    
    fun draw()
    fun setTextureID(id: Int)
    fun getSurfaceTexture(cb: (st: SurfaceTexture) -> Unit) {}
    fun release()
}

The VideoDrawer correction process is as follows:

class VideoDrawer: IDrawer {

    // ...

    private var mWorldWidth: Int = -1
    private var mWorldHeight: Int = -1
    private var mVideoWidth: Int = -1
    private var mVideoHeight: Int = -1
    
    // Coordinate transformation matrix
    private var mMatrix: FloatArray? = null
    
    // Matrix transform receiver
    private var mVertexMatrixHandler: Int = -1
    
    override fun setVideoSize(videoW: Int, videoH: Int) {
        mVideoWidth = videoW
        mVideoHeight = videoH
    }

    override fun setWorldSize(worldW: Int, worldH: Int) {
        mWorldWidth = worldW
        mWorldHeight = worldH
    }

    override fun draw() {
        if (mTextureId != -1) {
            // add 1: initializing matrix method
            initDefMatrix()
            // step 2: Create, compile, and start the OpenGL shader
            createGLPrg()
            // step 3: Activate and bind the texture unit
            activateTexture()
            // [Step 4: Bind image to texture unit]
            updateTexture()
            // [Step 5: Start rendering]
            doDraw()
        }
    }
    
    private fun initDefMatrix() {
        if (mMatrix != null) return
        if (mVideoWidth != -1 && mVideoHeight != -1 &&
            mWorldWidth != -1 && mWorldHeight != -1) {
            mMatrix = FloatArray(16)
            var prjMatrix = FloatArray(16)
            val originRatio = mVideoWidth / mVideoHeight.toFloat()
            val worldRatio = mWorldWidth / mWorldHeight.toFloat()
            if (mWorldWidth > mWorldHeight) {
                if (originRatio > worldRatio) {
                    val actualRatio = originRatio / worldRatio
                    Matrix.orthoM(
                        prjMatrix, 0,
                        -1f, 1f,
                        -actualRatio, actualRatio, 
                        -1f, 3f
                    )
                } else {
                    // The original ratio is smaller than the window ratio; scaling the height would
                    // make it overflow, so the height follows the window and the width is scaled
                    val actualRatio = worldRatio / originRatio
                    Matrix.orthoM(
                        prjMatrix, 0,
                        -actualRatio, actualRatio,
                        -1f, 1f,
                        -1f, 3f
                    )
                }
            } else {
                if (originRatio > worldRatio) {
                    val actualRatio = originRatio / worldRatio
                    Matrix.orthoM(
                        prjMatrix, 0,
                        -1f, 1f,
                        -actualRatio, actualRatio,
                        -1f, 3f
                    )
                } else {
                    // The original ratio is smaller than the window ratio; scaling the height would
                    // make it overflow, so the height follows the window and the width is scaled
                    val actualRatio = worldRatio / originRatio
                    Matrix.orthoM(
                        prjMatrix, 0,
                        -actualRatio, actualRatio,
                        -1f, 1f,
                        -1f, 3f
                    )
                }
            }
            // Use the projection matrix as the final transform matrix
            // (the camera stays at the default origin position here, so no view matrix is applied)
            mMatrix = prjMatrix
        }
    }

    private fun createGLPrg() {
        if (mProgram == -1) {
            // omit loading shader code
            // ...
            
            // add 2: get matrix variables in vertex shader
            mVertexMatrixHandler = GLES20.glGetUniformLocation(mProgram, "uMatrix")
            
            mVertexPosHandler = GLES20.glGetAttribLocation(mProgram, "aPosition")
            mTextureHandler = GLES20.glGetUniformLocation(mProgram, "uTexture")
            mTexturePosHandler = GLES20.glGetAttribLocation(mProgram, "aCoordinate")
        }
        // Use the OpenGL program
        GLES20.glUseProgram(mProgram)
    }
    
    private fun doDraw() {
        // Enable vertex handles
        GLES20.glEnableVertexAttribArray(mVertexPosHandler)
        GLES20.glEnableVertexAttribArray(mTexturePosHandler)
        
        // [add 3: pass transformation matrix to vertex shader]
        GLES20.glUniformMatrix4fv(mVertexMatrixHandler, 1, false, mMatrix, 0)
        
        // Set the shader parameter. The second parameter represents the amount of data a vertex contains, which is xy, so 2
        GLES20.glVertexAttribPointer(mVertexPosHandler, 2, GLES20.GL_FLOAT, false, 0, mVertexBuffer)
        GLES20.glVertexAttribPointer(mTexturePosHandler, 2, GLES20.GL_FLOAT, false, 0, mTextureBuffer)
        // Start drawing
        GLES20.glDrawArrays(GLES20.GL_TRIANGLE_STRIP, 0, 4)
    }

    private fun getVertexShader(): String {
        return "attribute vec4 aPosition;" +
                // [add 4: matrix variable]
                "uniform mat4 uMatrix;" +
                "attribute vec2 aCoordinate;" +
                "varying vec2 vCoordinate;" +
                "void main() {" +
                // [add 5: coordinate transformation]
                " gl_Position = aPosition*uMatrix;" +
                " vCoordinate = aCoordinate;" +
                "}"
    }
    // ...
}

About the [add x] markers above:

As you can see, the vertex shader has changed: a matrix variable has been added, and the final displayed coordinate is the vertex coordinate multiplied by this matrix.

uniform mat4 uMatrix;
gl_Position = aPosition*uMatrix;

In the code, the matrix variable in the shader is likewise obtained through the OpenGL API, and the scaling matrix is calculated and passed to the vertex shader.

By multiplying aPosition by uMatrix, the correct display position of the picture is obtained.

The principle of scaling has already been explained above and will not be repeated; here we only discuss the near plane and far plane settings.

The z coordinate of our vertex coordinates is set to 0, and the default position of the camera is also at z = 0. For the vertex coordinates to fall inside the clipping space, near must be <= 0 and far must be >= 0, and they cannot both be 0, i.e. near != far.

Note: both near and far are distances relative to the camera position. For example, near = -1 means the actual z coordinate of the near plane is 1, and far = 1 means the z coordinate of the far plane is -1. The Z axis is perpendicular to the phone screen.

Let's see how it is called externally:

class SimpleRender(private val mDrawer: IDrawer): GLSurfaceView.Renderer {

    // ...
    
    override fun onSurfaceChanged(gl: GL10?, width: Int, height: Int) {
        GLES20.glViewport(0, 0, width, height)
        // Set the OpenGL window coordinates
        mDrawer.setWorldSize(width, height)
    }
    
    // ...
    
}

class OpenGLPlayerActivity: AppCompatActivity() {

    // ...
    
    private fun initRender() {
        drawer = VideoDrawer()
        // Set the video width and height
        drawer.setVideoSize(1920, 1080)
        drawer.getSurfaceTexture {
            initPlayer(Surface(it))
        }
        gl_surface.setEGLContextClientVersion(2)
        gl_surface.setRenderer(SimpleRender(drawer))
    }
    // ...
}

At this point, a nice picture can finally be displayed normally.

Change camera position

As mentioned above, OpenGL can set the position and orientation of the camera, but the code above does not actually set them, because the camera defaults to the origin. Now, let's look at another way of setting the near and far planes.

/**
 * Defines a viewing transformation in terms of an eye point, a center of
 * view, and an up vector.
 *
 * @param rm returns the result
 * @param rmOffset index into rm where the result matrix starts
 * @param eyeX eye point X
 * @param eyeY eye point Y
 * @param eyeZ eye point Z
 * @param centerX center of view X
 * @param centerY center of view Y
 * @param centerZ center of view Z
 * @param upX up vector X
 * @param upY up vector Y
 * @param upZ up vector Z
 */
public static void setLookAtM(float[] rm, int rmOffset,
            float eyeX, float eyeY, float eyeZ,
            float centerX, float centerY, float centerZ, 
            float upX, float upY, float upZ)

(eyeX, eyeY, eyeZ) determines the position of the camera, (upX, upY, upZ) determines the direction of the camera, and (centerX, centerY, centerZ) is the origin of the image, generally (0,0,0).

// Set the camera position
val viewMatrix = FloatArray(16)
Matrix.setLookAtM(viewMatrix, 0, 0f, 0f, 5.0f, 0f, 0f, 0f, 0f, 1.0f, 0f)

The above places the camera on the Z axis at z = 5, a distance of 5 from the origin. The camera's up direction is along the Y axis, and it faces the XY plane.

Thus, if the z coordinate of the vertex coordinates is still 0, the positions of the near and far planes must be reset for the picture to fall inside the clipping space.

Such as:

Matrix.orthoM(mPrjMatrix, 0, -1f, 1f, 3.555556f, -3.555556f, 1f, 6f)

Near = 1, far = 6

The camera is located at z = 5. So, for the point z = 0 to be included, the near plane cannot be more than 5 away from the camera, and the far plane cannot be less than 5 away from the camera. As before, near != far.
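The article keeps the view matrix and the projection matrix separate; one common way to combine them (my own sketch, assuming the combined result is what gets passed to uMatrix in the vertex shader) is:

val viewMatrix = FloatArray(16)
val prjMatrix = FloatArray(16)
val matrix = FloatArray(16)   // the combined matrix passed to "uMatrix"

// Camera at (0, 0, 5), looking at the origin, with +Y as the up direction
Matrix.setLookAtM(viewMatrix, 0, 0f, 0f, 5.0f, 0f, 0f, 0f, 0f, 1.0f, 0f)
// Near/far chosen so that z = 0 falls inside the clipping space (see above)
Matrix.orthoM(prjMatrix, 0, -1f, 1f, 3.555556f, -3.555556f, 1f, 6f)
// Combine: projection * view
Matrix.multiplyMM(matrix, 0, prjMatrix, 0, viewMatrix, 0)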

Three, video filter

You’ll find filters in many video apps that change the style of the video. So how do these filters work?

In fact, the principle is very simple, nothing more than to change the color of the picture.

Let’s implement a very simple filter: black and white

Simply change the fragment shader:

private fun getFragmentShader(): String {
    // Be sure to add a newline "\n", otherwise it will be mixed with the precision on the next line and cause compilation errors
    return "#extension GL_OES_EGL_image_external : require\n" +
            "precision mediump float;" +
            "varying vec2 vCoordinate;" +
            "uniform samplerExternalOES uTexture;" +
            "void main() {" +
            " vec4 color = texture2D(uTexture, vCoordinate);" +
            " float gray = (color.r + color.g + color.b) / 3.0;" +
            " gl_FragColor = vec4(gray, gray, gray, 1.0);" +
            "}"
}

Key code:

vec4 color = texture2D(uTexture, vCoordinate);
float gray = (color.r + color.g + color.b)/3.0;
gl_FragColor = vec4(gray, gray, gray, 1.0);

Take the simple average of the RGB components and assign it to all three channels to get a gray value, then assign it to the fragment, and a simple black-and-white filter is done. So easy ~
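As a quick variation (my own sketch, not part of the original tutorial), an "invert colors" filter needs an equally small change; the body of main() in getFragmentShader() becomes:

" vec4 color = texture2D(uTexture, vCoordinate);" +
" gl_FragColor = vec4(1.0 - color.r, 1.0 - color.g, 1.0 - color.b, 1.0);" +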

Of course, many filters are not that simple. Take a look at the filters implemented by others, such as OpenGL ES.

Four, reference articles

OpenGL learning series – Coordinate system

OpenGL Learning series – Projection matrix

OpenGL Learning Footprint: Projection matrix and viewport transformation matrix