This article is a reading note for the book Unity Shader Essentials. The book explains the rendering pipeline very clearly and is a good introduction to shaders. Since a good memory is no match for a written note, I am combining the relevant content with some of my own understanding in this post.

We call the process of drawing an image the rendering pipeline, which is completed by the cooperation of the CPU and the GPU. A rendering flow can generally be divided into three conceptual stages: the application stage, the geometry stage, and the rasterization stage.

Application stage

The application stage runs on the CPU. Its main tasks are to prepare the scene data, set up the render state, and then output rendering primitives, that is, to provide the geometric information required by the next stage. What is a primitive? Primitives are the basic units to be rendered: they can be points (vertices), line segments, triangles, and so on, and complex shapes can be built by rendering many triangles. The application stage can be divided into three sub-stages:

  1. Load data into video memory. All the data needed for rendering is first loaded from the hard disk into system memory (RAM), and then data such as meshes and textures is loaded into video memory (VRAM). This is because the video card can access video memory much faster, and most video cards cannot access RAM directly.
  2. Set the render state. For example, set the shader, material, texture, and light-source properties to use.
  3. Issue a Draw Call. A Draw Call is a command initiated by the CPU and received by the GPU. The command only points to a list of primitives that need to be rendered and does not contain any material information, because we already set that up in the previous sub-stage. When a Draw Call is issued, the GPU computes based on the render state and all the input vertex data, producing the pixels displayed on the screen.

Geometry stage

The geometry stage runs on the GPU, and its main task is to output vertex information in screen space. It processes the geometry data of the objects to be drawn that is received from the previous stage (which can be understood as the list of primitives pointed to by the Draw Call), working on each rendering primitive, vertex by vertex and polygon by polygon. An important task of the geometry stage is to transform vertex coordinates into screen space and then hand them over to the rasterizer. After multi-step processing of the input primitives, this stage outputs the two-dimensional vertex coordinates in screen space, the depth value corresponding to each vertex, shading, and other related information.

Rasterization stage

This stage is also performed on the GPU. It uses the data passed from the previous stage to generate pixels on the screen and output the final image. The task of rasterization is to determine which pixels of each rendering primitive should be drawn on the screen. It interpolates the per-vertex data from the previous stage (texture coordinates, vertex colors, etc.) and then performs per-pixel processing. It can be understood this way: the geometry stage only produces information about the vertices of a primitive. For a triangle primitive, for example, it yields the coordinates and colors of its three vertices. What the rasterization stage does is work out which pixels are covered by the triangle based on those three vertices, and compute a color for each of those pixels by interpolation.

GPU rendering pipeline (geometry stage and rasterization stage)

Vertex shader

Vertex shaders process data in units of vertices and are invoked once for each input vertex. A vertex shader itself cannot create or destroy vertices, and it has no way of knowing the relationships between vertices, such as whether two vertices belong to the same triangle mesh. Because of this mutual independence, the GPU can process every vertex in parallel, which means this stage is very fast. The main tasks of a vertex shader are coordinate transformation and per-vertex lighting.

A vertex shader must perform the coordinate transformation of vertices and can also compute and output vertex colors as needed; for example, we might want per-vertex lighting. Coordinate transformation simply means applying some transformation to the vertex coordinates. The vertex shader can change the vertex position in this step, which is very useful for vertex animation. But no matter how we change vertex positions in a vertex shader, one task a basic vertex shader must perform is converting vertex coordinates from model space into homogeneous clip space.
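
As a minimal sketch of this last step (written in Unity's Cg/HLSL for illustration, not code from the book; the struct names appdata and v2f are just conventions), a bare-bones vertex shader does nothing but the model-space-to-clip-space transformation:

```hlsl
// Inside a CGPROGRAM block, with UnityCG.cginc included.
struct appdata { float4 vertex : POSITION; };
struct v2f     { float4 pos    : SV_POSITION; };

v2f vert (appdata v)
{
    v2f o;
    // Transform the vertex from model space to homogeneous clip space
    // (the same as mul(UNITY_MATRIX_MVP, v.vertex) in older Unity versions).
    o.pos = UnityObjectToClipPos(v.vertex);
    return o;
}
```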

After the vertex coordinates are converted into homogeneous clip space, the hardware usually performs a perspective division to obtain normalized device coordinates (NDC).
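
Perspective division simply divides the clip-space coordinates by their own $w$ component:

$$(x_{ndc},\; y_{ndc},\; z_{ndc}) = \left(\frac{x_{clip}}{w_{clip}},\; \frac{y_{clip}}{w_{clip}},\; \frac{z_{clip}}{w_{clip}}\right)$$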

Clipping

The purpose of the clipping stage is to clip away the vertices that are not within the camera's field of view and to cull the faces of certain triangle primitives. A primitive can relate to the camera's field of view in three ways: completely inside, partially inside, or completely outside. Primitives that are completely inside the view are passed on to the next pipeline stage; primitives that are completely outside are not passed on, because they do not need to be rendered. Primitives that are partially inside the view need to be clipped. For example, if one vertex of a line segment is inside the view and the other is not, the vertex outside the view is replaced by a new vertex at the intersection of the line segment and the view boundary.
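
The test itself is easy to express in homogeneous clip space (which is one reason clipping happens before the perspective division): a vertex lies inside the view volume when

$$-w \le x \le w, \qquad -w \le y \le w, \qquad -w \le z \le w$$

(the exact range for $z$ depends on the graphics API; Direct3D-style APIs use $0 \le z \le w$ instead).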

Screen mapping

The coordinates entering this step are still three-dimensional coordinates (inside the unit cube, i.e. NDC). The task of screen mapping is to convert the x and y coordinates of each primitive into the screen coordinate system, which is really just a scaling operation. The screen coordinate system is a two-dimensional coordinate system that depends directly on the resolution used to display the image.
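
Assuming NDC coordinates in $[-1, 1]$ and a screen resolution of $width \times height$ pixels, the mapping is just a scale and an offset:

$$x_{screen} = \frac{x_{ndc} + 1}{2} \cdot width, \qquad y_{screen} = \frac{y_{ndc} + 1}{2} \cdot height$$

(whether $y$ is counted from the top or the bottom of the screen depends on the API; the $z$ value is not touched here and is carried along as the depth).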

Triangle setup

This stage computes the information needed to rasterize a triangle mesh. Specifically, the previous stage only outputs the vertices of the triangle mesh, but to obtain the pixel coverage of the whole triangle we also need to compute the pixel coordinates along each edge. To compute this boundary information we need a representation of the triangle's edges. The process of computing such data for a triangle mesh is called triangle setup, and its output is preparation for the next stage.
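
The book does not spell out what that representation is, but one common choice is an edge function for each of the three edges: for an edge running from $(x_0, y_0)$ to $(x_1, y_1)$,

$$E(x, y) = (x - x_0)(y_1 - y_0) - (y - y_0)(x_1 - x_0)$$

and a pixel center lies inside the triangle when the three edge functions all have the same sign (or are zero).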

Triangle traversal

The triangle traversal stage checks whether each pixel is covered by a triangle mesh; if it is, a fragment is generated. The process of finding which pixels are covered by a triangle mesh is called triangle traversal, also known as scan conversion. In this stage, the results of the previous stage are used to determine which pixels a triangle mesh covers, and the information of the triangle's three vertices is interpolated across the whole covered area. Covered pixels and fragments correspond one to one: each covered pixel produces a fragment, and the state stored in the fragment records the information of the corresponding pixel, obtained by interpolating the information of the three vertices.
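
As a side note of my own (the book does not write it out), this interpolation is usually expressed with barycentric coordinates: a covered pixel gets weights $\lambda_0, \lambda_1, \lambda_2$ with $\lambda_0 + \lambda_1 + \lambda_2 = 1$, and any vertex attribute $a$ (color, texture coordinate, depth, ...) is interpolated as

$$a = \lambda_0 a_0 + \lambda_1 a_1 + \lambda_2 a_2$$

(real hardware uses a perspective-correct variant of this for most attributes).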

Fragment shader

The fragment shader implements per-fragment shading, and its output is one or more color values (that is, it computes the color of the pixel corresponding to the fragment, though not necessarily the final color). Many important rendering techniques are performed at this stage, one of the most important being texture sampling. To sample a texture in the fragment shader, we usually output the texture coordinates of each vertex in the vertex shader stage, let the rasterization stage interpolate the texture coordinates of the triangle's three vertices, and thus obtain the texture coordinates of every covered fragment.
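
Continuing the earlier vertex-shader sketch (again Cg/HLSL written for illustration, not code from the book; _MainTex is assumed to be a texture property declared by the shader), the fragment shader simply samples the texture with the interpolated coordinates:

```hlsl
sampler2D _MainTex;

struct v2f
{
    float4 pos : SV_POSITION;
    float2 uv  : TEXCOORD0;   // written per vertex, interpolated by the rasterizer per fragment
};

fixed4 frag (v2f i) : SV_Target
{
    // Texture sampling with the interpolated texture coordinates.
    return tex2D(_MainTex, i.uv);
}
```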

Per-fragment operations

The per-fragment operations stage is responsible for several important tasks, such as modifying colors, depth buffering, and blending. This stage has two main tasks:

  1. Determine the visibility of each fragment. This involves a lot of testing, such as the depth test, the stencil test, and so on.
  2. If a fragment passes all the tests, its color value is merged, or blended, with the color already stored in the color buffer.

Only after a fragment has passed all the tests can it be blended with the color of the pixel already in the color buffer and written to that buffer.

Stencil test

The stencil test can be used as a way to discard fragments, and it is associated with the stencil buffer. If the stencil test is enabled, the GPU first reads (using a read mask) the stencil value at the fragment's position in the stencil buffer, and then compares that value with a reference value (also masked by the read mask). The comparison function can be specified by the developer, for example discard the fragment when the stencil value is less than the reference value, or when it is greater than or equal to it. If a fragment fails the test, it is discarded. Regardless of whether a fragment passes the stencil test, we can modify the stencil buffer based on the results of the stencil test and the depth test that follows, and this modification is also specified by the developer. The developer can set a different action for each outcome, for example leave the stencil buffer unchanged on failure and increment the value at the corresponding position by one on pass. The stencil test is often used to limit the area being rendered, and there are more advanced uses as well, such as rendering shadows and outline rendering.
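
In Unity these settings are exposed through ShaderLab's Stencil block inside a Pass. The sketch below is my own illustration rather than code from the book; it only draws where the stencil buffer already contains the value 1 and leaves the buffer untouched:

```shaderlab
Pass
{
    Stencil
    {
        Ref 1        // reference value
        Comp Equal   // pass only where the buffer value equals the reference
        Pass Keep    // leave the stencil buffer unchanged when the test passes
        Fail Keep    // ...and when it fails
    }
    // ... the usual vertex/fragment program goes here ...
}
```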

Depth test

If the depth test is enabled, the GPU compares the depth value of the fragment with the depth value already stored in the depth buffer. The comparison function is also set by the developer. Usually a fragment is discarded if its depth value is greater than or equal to the value currently in the depth buffer, because we normally only want to show the objects closest to the camera; anything hidden behind other objects does not need to appear on the screen. If a fragment fails this test, it is discarded. Unlike the stencil test, a fragment that fails the depth test has no right to change the value in the depth buffer. If it passes the test, the developer can also specify whether the fragment's depth value should overwrite the existing one by turning depth writing on or off.
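
In ShaderLab the corresponding render-state commands are ZTest and ZWrite; the values below are the usual defaults for opaque geometry (a sketch, not code from the book):

```shaderlab
Pass
{
    ZTest LEqual   // keep the fragment if it is at least as close as the stored depth
    ZWrite On      // fragments that pass are allowed to overwrite the stored depth
    // ...
}
```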

Blending

Why blend? Rendering is the process of drawing objects onto the screen one after another, and the color information for each pixel is stored in a place called the color buffer. When we render an object, the color buffer therefore usually already holds the result of previous rendering. Should the new color completely overwrite the previous one, or should we do something else? That is what blending decides. For opaque objects, the developer can turn blending off. But for transparent objects, we need blending to make them appear transparent.
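
The most common setup for transparent objects in Unity is traditional alpha blending; as a sketch (not code from the book), the ShaderLab command and the equation it implies look like this:

```shaderlab
Pass
{
    ZWrite Off                          // transparent objects usually do not write depth
    Blend SrcAlpha OneMinusSrcAlpha     // new color = src.rgb * src.a + dst.rgb * (1 - src.a)
    // ...
}
```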