An overview of the

In the graphics rendering pipeline, a vertex coordinate passes, roughly speaking, through local space, world space, camera space, clip space, and finally window space, where it is displayed on the screen.

Each step from one space to the next requires a transformation. Below, I describe how each transformation works.

Note that this article is for OpenGL.

Local space -> World space

This transformation places the model in world space by applying some scale, rotation, and translation. The step is relatively simple: apply the corresponding matrices to the model's local-space coordinates.

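For example, scale the model by $(s_x, s_y, s_z)$, then rotate it about the Z axis by an angle $\theta$, and then translate it by $(t_x, t_y, t_z)$. Note that the order of the transformations is fixed: scaling, then rotation, and finally translation. Based on this, we can construct the model transformation matrix (the rightmost matrix is applied to the vertex first):

$$M_{model} = T(t_x, t_y, t_z)\, R_z(\theta)\, S(s_x, s_y, s_z)$$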
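The scale, rotate, translate order can be sketched in plain Python. This is a minimal illustration, not the article's own code: the helper names (`matmul`, `transform`, `scale`, `rotate_z`, `translate`, `model_matrix`) and the example values are assumptions made here, and matrices are stored as nested lists that act on column vectors.

```python
import math

def matmul(a, b):
    """Multiply two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transform(m, p):
    """Apply a 4x4 matrix to a 3D point (homogeneous w = 1)."""
    v = (p[0], p[1], p[2], 1.0)
    return tuple(sum(m[i][k] * v[k] for k in range(4)) for i in range(3))

def scale(sx, sy, sz):
    return [[sx, 0, 0, 0], [0, sy, 0, 0], [0, 0, sz, 0], [0, 0, 0, 1]]

def rotate_z(theta):
    """Rotation about the Z axis by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0, 0], [s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def translate(tx, ty, tz):
    return [[1, 0, 0, tx], [0, 1, 0, ty], [0, 0, 1, tz], [0, 0, 0, 1]]

def model_matrix(s, theta, t):
    """Scale first, then rotate about Z, then translate: M = T * R * S."""
    return matmul(translate(*t), matmul(rotate_z(theta), scale(*s)))

# Assumed example values: scale by 2, rotate 90 degrees, translate by (1, 0, 0).
M = model_matrix((2, 2, 2), math.pi / 2, (1, 0, 0))
p_world = transform(M, (1, 0, 0))  # (1,0,0) -> (2,0,0) -> (0,2,0) -> (1,2,0)
```

Swapping the order (for example translating before rotating) would generally give a different result, which is why the order is fixed.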


World space -> Camera space

First, define the camera:

  • Position: $\vec{e}$

  • Gaze (look-at) direction: $\hat{g}$

  • Up direction: $\hat{t}$

The schematic diagram is as follows:

Note one property: **when the camera and the objects it "sees" are transformed together, the camera "sees" the same content.** This means we can move the camera to the origin of the world coordinates, align its up direction with the world Y axis, and align its gaze direction with the world -Z axis, and then apply the same transformation to the objects.

Mathematically, the process goes roughly like this:

  • Translate the camera position $\vec{e}$ to the origin
  • Rotate the gaze direction $\hat{g}$ to the $-Z$ axis
  • Rotate the up direction $\hat{t}$ to the $Y$ axis
  • Rotate $\hat{g} \times \hat{t}$ to the $X$ axis

This splits into two steps: first a translation, then a rotation, namely $M_{view} = R_{view} T_{view}$.

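Translation part, with the camera position $\vec{e} = (x_e, y_e, z_e)$:

$$T_{view} = \begin{bmatrix} 1 & 0 & 0 & -x_e \\ 0 & 1 & 0 & -y_e \\ 0 & 0 & 1 & -z_e \\ 0 & 0 & 0 & 1 \end{bmatrix}$$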


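For the rotation part, we first need a few facts. In two dimensions, rotations by $\theta$ and by $-\theta$ are:

$$R_{\theta} = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}, \qquad R_{-\theta} = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} = R_{\theta}^{T}$$

By definition, rotating by the angle $\theta$ and rotating by the angle $-\theta$ are inverse operations, that is: $R_{-\theta} = R_{\theta}^{-1}$.

Therefore, for a rotation transformation, the inverse of the rotation matrix is equal to its transpose, namely:

$$R_{\theta}^{-1} = R_{\theta}^{T}$$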
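The inverse-equals-transpose property can be checked numerically. The snippet below is only a sanity check under assumed names (`rot2d`, `transpose`, `matmul`) and an arbitrary example angle; it is not part of the original derivation.

```python
import math

def rot2d(theta):
    """2D rotation matrix for an angle theta in radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def transpose(m):
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    """Multiply two square matrices of the same size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

theta = math.radians(30)            # arbitrary example angle
R = rot2d(theta)
R_inv = rot2d(-theta)               # rotating back by the same angle
product = matmul(R, transpose(R))   # should be (numerically) the identity
```

Here `R_inv` and `transpose(R)` agree entry by entry up to floating-point rounding, and `product` is the 2x2 identity.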


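Going back to the rotation part above: it is not convenient to directly find the matrix that rotates the camera's axes onto the world axes. Conversely, it is easy to write down the rotation that takes the world axes $X$, $Y$, $Z$ to the camera's axes $\hat{g} \times \hat{t}$, $\hat{t}$, $-\hat{g}$ (where $\hat{g}$ is the gaze direction and $\hat{t}$ is the up direction):

$$R_{view}^{-1} = \begin{bmatrix} x_{\hat{g} \times \hat{t}} & x_{t} & x_{-g} & 0 \\ y_{\hat{g} \times \hat{t}} & y_{t} & y_{-g} & 0 \\ z_{\hat{g} \times \hat{t}} & z_{t} & z_{-g} & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$

Because the inverse of a rotation matrix is equal to its transpose, it follows that:

$$R_{view} = \left(R_{view}^{-1}\right)^{T} = \begin{bmatrix} x_{\hat{g} \times \hat{t}} & y_{\hat{g} \times \hat{t}} & z_{\hat{g} \times \hat{t}} & 0 \\ x_{t} & y_{t} & z_{t} & 0 \\ x_{-g} & y_{-g} & z_{-g} & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$

Combining this with $M_{view} = R_{view} T_{view}$, where $T_{view}$ translates the camera position $\vec{e}$ to the origin, we can conclude:

$$M_{view} = \begin{bmatrix} x_{\hat{g} \times \hat{t}} & y_{\hat{g} \times \hat{t}} & z_{\hat{g} \times \hat{t}} & -(\hat{g} \times \hat{t}) \cdot \vec{e} \\ x_{t} & y_{t} & z_{t} & -\hat{t} \cdot \vec{e} \\ x_{-g} & y_{-g} & z_{-g} & \hat{g} \cdot \vec{e} \\ 0 & 0 & 0 & 1 \end{bmatrix}$$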
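As a rough illustration of the translate-then-rotate construction, here is a small Python sketch of a view matrix. The function names and the parameters `e` (camera position), `g` (gaze direction), and `t` (up direction) are assumptions of this sketch. It also re-orthogonalizes the up vector, which goes slightly beyond the derivation (the derivation assumes the up direction is already perpendicular to the gaze direction).

```python
import math

def cross(a, b):
    """Cross product of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    length = math.sqrt(dot(v, v))
    return tuple(c / length for c in v)

def view_matrix(e, g, t):
    """Build the rotation-times-translation view matrix from e, g, and t."""
    g = normalize(g)
    x_axis = normalize(cross(g, t))   # g x t is rotated onto +X
    y_axis = cross(x_axis, g)         # up, re-orthogonalized (assumption of this sketch)
    z_axis = tuple(-c for c in g)     # -g is rotated onto +Z
    # Each row is (axis, -axis . e): the rotation applied after translating by -e.
    rows = [list(axis) + [-dot(axis, e)] for axis in (x_axis, y_axis, z_axis)]
    return rows + [[0.0, 0.0, 0.0, 1.0]]

def transform(m, p):
    """Apply a 4x4 matrix to a 3D point (homogeneous w = 1)."""
    v = (p[0], p[1], p[2], 1.0)
    return tuple(sum(m[i][k] * v[k] for k in range(4)) for i in range(3))

# Camera at (1, 2, 3) looking down -Z: the view matrix is a pure translation.
M = view_matrix((1, 2, 3), (0, 0, -1), (0, 1, 0))
in_front = transform(M, (1, 2, 2))    # a point one unit in front of the camera
```

A point straight ahead of the camera ends up on the negative Z axis of camera space, which matches the convention that the camera looks along -Z.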


Camera space -> Clip space

At the end of a vertex shader run, all coordinates are expected to fall within a specific range, and any point outside this range is clipped. Clipped coordinates are discarded, and the remaining coordinates become the fragments visible on the screen. That is where the name clip space comes from.

Because it is not intuitive to specify all visible coordinates in the range -1.0 to 1.0, we specify our own coordinate set and transform it back to normalized device coordinates (NDC).

The viewing box created by a projection matrix is called a frustum, and every coordinate that falls inside the frustum ends up on the user's screen. The process of converting coordinates within a specified range to NDC, which are easy to map to 2D view-space coordinates, is called projection, because the projection matrix projects 3D coordinates into NDC.

Note that OpenGL's camera space is right-handed, but NDC is left-handed. Pay special attention to this!

Converting from camera space to clip space requires a projection transformation. There are two kinds of projection: orthogonal (orthographic) projection and perspective projection. They are introduced in turn below.

Orthogonal projection

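Let's first define the view volume of an orthogonal projection, $[l, r] \times [b, t] \times [f, n]$ (note that $n$ and $f$ are both negative numbers and $f$ is the far plane, so $f < n$). It is a cuboid. What we need to do is map this view volume to the canonical cube $[-1, 1]^3$. Notice that here $[f, n]$ maps to $[1, -1]$ in NDC.

This takes two steps: a translation that centers the cuboid at the origin, followed by a scaling. The matrix of the orthogonal projection is as follows:

$$M_{ortho} = \begin{bmatrix} \frac{2}{r-l} & 0 & 0 & 0 \\ 0 & \frac{2}{t-b} & 0 & 0 \\ 0 & 0 & \frac{2}{f-n} & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 & -\frac{r+l}{2} \\ 0 & 1 & 0 & -\frac{t+b}{2} \\ 0 & 0 & 1 & -\frac{n+f}{2} \\ 0 & 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} \frac{2}{r-l} & 0 & 0 & -\frac{r+l}{r-l} \\ 0 & \frac{2}{t-b} & 0 & -\frac{t+b}{t-b} \\ 0 & 0 & \frac{2}{f-n} & -\frac{f+n}{f-n} \\ 0 & 0 & 0 & 1 \end{bmatrix}$$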
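The translate-then-scale construction can be tried out with a few lines of Python. This is a sketch under assumptions: the bounds are named `l`, `r`, `b`, `t`, `n`, `f` (left, right, bottom, top, near, far, with `n` and `f` negative and `f < n`), and the near plane is mapped to -1 and the far plane to +1 as described above.

```python
def ortho_matrix(l, r, b, t, n, f):
    """Map the cuboid [l, r] x [b, t] x [f, n] to the cube [-1, 1]^3.

    n and f are negative camera-space z values with f < n;
    z = n is mapped to -1 and z = f is mapped to +1.
    """
    return [
        [2 / (r - l), 0, 0, -(r + l) / (r - l)],
        [0, 2 / (t - b), 0, -(t + b) / (t - b)],
        [0, 0, 2 / (f - n), -(f + n) / (f - n)],
        [0, 0, 0, 1],
    ]

def transform(m, p):
    """Apply a 4x4 matrix to a 3D point (homogeneous w = 1)."""
    v = (p[0], p[1], p[2], 1.0)
    return tuple(sum(m[i][k] * v[k] for k in range(4)) for i in range(3))

# Assumed example volume: x in [-2, 2], y in [-1, 1], near z = -0.5, far z = -10.
M = ortho_matrix(-2, 2, -1, 1, -0.5, -10)
near_corner = transform(M, (-2, -1, -0.5))  # lower-left corner of the near plane
far_corner = transform(M, (2, 1, -10))      # upper-right corner of the far plane
```

The near lower-left corner lands at (-1, -1, -1) and the far upper-right corner at (1, 1, 1). Because $f < n$, the z scale factor is negative, which is where the switch to a left-handed NDC comes from.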


Perspective projection

For perspective projection, there are two steps:

  • First, “squish” the frustum into a cuboid (the near plane stays at $n$ and the far plane stays at $f$), using a matrix $M_{persp \to ortho}$;
  • Then, apply the orthogonal projection ($M_{ortho}$, the orthogonal projection above).

Look at the picture below:

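By similar triangles, a point $(x, y, z)$ inside the frustum is squished to a point $(x', y', z')$ with:

$$y' = \frac{n}{z}\, y$$

Similarly, it can be concluded:

$$x' = \frac{n}{z}\, x$$

Thus, in homogeneous coordinates, the following relationship is obtained:

$$\begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix} \Rightarrow \begin{pmatrix} nx/z \\ ny/z \\ \text{unknown} \\ 1 \end{pmatrix}$$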


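Now, recall a property of homogeneous coordinates: in a 3D coordinate system, $(x, y, z, 1)$, $(kx, ky, kz, k)$ with $k \neq 0$, and $(xz, yz, z^2, z)$ with $z \neq 0$ all represent the same point $(x, y, z)$. For example, $(1, 0, 0, 1)$ and $(2, 0, 0, 2)$ both represent the point $(1, 0, 0)$.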
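The equivalence of scaled homogeneous coordinates is easy to see in code: dividing by the last component recovers the same 3D point. The helper name `to_cartesian` below is an assumption made for this illustration.

```python
def to_cartesian(h):
    """Convert a homogeneous point (x, y, z, w) to 3D by dividing by w."""
    x, y, z, w = h
    if w == 0:
        # w = 0 describes a direction (a point at infinity), not a finite point.
        raise ValueError("cannot divide by w = 0")
    return (x / w, y / w, z / w)

a = to_cartesian((1, 0, 0, 1))
b = to_cartesian((2, 0, 0, 2))    # same point, scaled by k = 2
c = to_cartesian((3, 6, 9, 3))    # represents (1, 2, 3)
```

This division by the last component is the step that lets a 4x4 matrix express the division by $z$ needed for perspective.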

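Therefore, multiplying every component of the squished point by $z$, the relationship can be written as:

$$\begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix} \Rightarrow \begin{pmatrix} nx/z \\ ny/z \\ \text{unknown} \\ 1 \end{pmatrix} \equiv \begin{pmatrix} nx \\ ny \\ \text{unknown} \\ z \end{pmatrix}$$

Further, we can get the squish matrix $M_{persp \to ortho}$ up to one row:

$$M_{persp \to ortho} = \begin{bmatrix} n & 0 & 0 & 0 \\ 0 & n & 0 & 0 \\ ? & ? & ? & ? \\ 0 & 0 & 1 & 0 \end{bmatrix}$$

Now only the third row is still unknown.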

By observing the perspective projection cone above, the following inferences can be drawn:

  1. The coordinates of a point on the near plane do not change;

  2. For a point on the far plane, the Z coordinate does not change.

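According to corollary 1, a point $(x, y, n, 1)^T$ on the near plane does not change when it is transformed. That is (the last form is the same point with every component multiplied by $n$):

$$\begin{pmatrix} x \\ y \\ n \\ 1 \end{pmatrix} \Rightarrow \begin{pmatrix} x \\ y \\ n \\ 1 \end{pmatrix} \equiv \begin{pmatrix} nx \\ ny \\ n^2 \\ n \end{pmatrix}$$

So the unknown third row of the squish matrix must satisfy:

$$\begin{pmatrix} ? & ? & ? & ? \end{pmatrix} \begin{pmatrix} x \\ y \\ n \\ 1 \end{pmatrix} = n^2$$

Because $n^2$ has nothing to do with either $x$ or $y$, the third row of the matrix is of the form $(0, 0, A, B)$.

Substituting this form:

$$\begin{pmatrix} 0 & 0 & A & B \end{pmatrix} \begin{pmatrix} x \\ y \\ n \\ 1 \end{pmatrix} = n^2$$

It can be concluded that:

$$An + B = n^2$$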


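According to corollary 2, the center of the far plane, $(0, 0, f, 1)^T$, is still itself after the transformation. As follows:

$$\begin{pmatrix} 0 \\ 0 \\ f \\ 1 \end{pmatrix} \Rightarrow \begin{pmatrix} 0 \\ 0 \\ f \\ 1 \end{pmatrix} \equiv \begin{pmatrix} 0 \\ 0 \\ f^2 \\ f \end{pmatrix}$$

Therefore, writing the unknown third row of the squish matrix as $(0, 0, A, B)$, it can be concluded that:

$$\begin{pmatrix} 0 & 0 & A & B \end{pmatrix} \begin{pmatrix} 0 \\ 0 \\ f \\ 1 \end{pmatrix} = f^2$$

That is:

$$Af + B = f^2$$

Together with the near-plane condition $An + B = n^2$, this gives the system of equations:

$$\begin{cases} An + B = n^2 \\ Af + B = f^2 \end{cases}$$

Subtracting the second equation from the first gives $A(n - f) = n^2 - f^2$, so the solution is:

$$A = n + f, \qquad B = -nf$$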


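Finally, the squish matrix and the perspective projection matrix (with $M_{ortho}$ the orthogonal projection matrix from the previous section) are:

$$M_{persp \to ortho} = \begin{bmatrix} n & 0 & 0 & 0 \\ 0 & n & 0 & 0 \\ 0 & 0 & n + f & -nf \\ 0 & 0 & 1 & 0 \end{bmatrix}, \qquad M_{persp} = M_{ortho}\, M_{persp \to ortho}$$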
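The two-step construction (squish, then orthogonal projection) can be sketched and checked in Python. All names here are assumptions of this sketch: `persp_to_ortho` builds the squish matrix with third row (0, 0, n + f, -n*f), `ortho_matrix` maps the cuboid [l, r] x [b, t] x [f, n] to [-1, 1]^3, and `n`, `f` are negative with `f < n`.

```python
def matmul(a, b):
    """Multiply two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def ortho_matrix(l, r, b, t, n, f):
    """Map the cuboid [l, r] x [b, t] x [f, n] to the cube [-1, 1]^3."""
    return [
        [2 / (r - l), 0, 0, -(r + l) / (r - l)],
        [0, 2 / (t - b), 0, -(t + b) / (t - b)],
        [0, 0, 2 / (f - n), -(f + n) / (f - n)],
        [0, 0, 0, 1],
    ]

def persp_to_ortho(n, f):
    """Squish the frustum into a cuboid; the fourth row copies z into w."""
    return [[n, 0, 0, 0], [0, n, 0, 0], [0, 0, n + f, -n * f], [0, 0, 1, 0]]

def perspective_matrix(l, r, b, t, n, f):
    """Orthogonal projection applied after the squish."""
    return matmul(ortho_matrix(l, r, b, t, n, f), persp_to_ortho(n, f))

def project(m, p):
    """Apply a 4x4 matrix to a 3D point and divide by the resulting w."""
    v = (p[0], p[1], p[2], 1.0)
    x, y, z, w = (sum(m[i][k] * v[k] for k in range(4)) for i in range(4))
    return (x / w, y / w, z / w)

# Assumed example frustum: near plane z = -1 spanning [-1, 1]^2, far plane z = -5.
M = perspective_matrix(-1, 1, -1, 1, -1, -5)
near_ndc = project(M, (1, 1, -1))   # corner of the near plane
far_ndc = project(M, (5, 5, -5))    # point on the same ray, on the far plane
```

Both points lie on the same ray through the camera, so they share the NDC x and y values (1, 1) and differ only in depth: -1 on the near plane and +1 on the far plane. Note that with this matrix, w equals the camera-space z, which is negative for visible points. A homogeneous matrix is only defined up to a scale factor, and the matrix conventionally used with OpenGL corresponds to this one multiplied by -1, so that w = -z is positive.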


Clip space -> Window space

After clipping, all visible points are in normalized device coordinates (NDC), that is, every coordinate lies in the range $[-1, 1]$.

For now, ignore the transformation of the z axis.

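Going from NDC to window space requires the viewport transformation. Define a screen space $[x, x + w] \times [y, y + h]$: the lower-left corner of the plane is at $(x, y)$, and the upper-right corner is at $(x + w, y + h)$. The X and Y coordinates are transformed from $[-1, 1] \times [-1, 1]$ to $[x, x + w] \times [y, y + h]$.

This is a two-step transformation:

  1. Shift the center of the NDC to the center of the window;

$$T = \begin{bmatrix} 1 & 0 & 0 & x + \frac{w}{2} \\ 0 & 1 & 0 & y + \frac{h}{2} \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$

  2. Scale the NDC to the size of the screen.

$$S = \begin{bmatrix} \frac{w}{2} & 0 & 0 & 0 \\ 0 & \frac{h}{2} & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$

Put together (the scale acts on the point first, then the shift):

$$M_{viewport} = T S = \begin{bmatrix} \frac{w}{2} & 0 & 0 & x + \frac{w}{2} \\ 0 & \frac{h}{2} & 0 & y + \frac{h}{2} \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$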


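For the Z coordinate, $[-1, 1]$ is mapped to $[0, 1]$. This is just a simple linear mapping. Assume that $z_w = A z_{ndc} + B$; when $z_{ndc}$ is $-1$, $z_w$ is equal to $0$, and when $z_{ndc}$ is $1$, $z_w$ is equal to $1$. The following equations can be obtained:

$$\begin{cases} -A + B = 0 \\ A + B = 1 \end{cases}$$

So $A = B = \frac{1}{2}$. Substituting into the viewport matrix above (window lower-left corner $(x, y)$, width $w$, height $h$), we obtain:

$$M_{viewport} = \begin{bmatrix} \frac{w}{2} & 0 & 0 & x + \frac{w}{2} \\ 0 & \frac{h}{2} & 0 & y + \frac{h}{2} \\ 0 & 0 & \frac{1}{2} & \frac{1}{2} \\ 0 & 0 & 0 & 1 \end{bmatrix}$$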
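The NDC-to-window mapping can be sketched as a short Python function. The parameter names `x`, `y`, `w`, `h` (lower-left corner and size of the window) and the 800 by 600 example window are assumptions of this sketch, and the depth range is taken to be [0, 1] as in the text.

```python
def viewport_matrix(x, y, w, h):
    """Map NDC in [-1, 1]^3 to a window at (x, y) of size w x h, with depth in [0, 1]."""
    return [
        [w / 2, 0, 0, x + w / 2],
        [0, h / 2, 0, y + h / 2],
        [0, 0, 0.5, 0.5],
        [0, 0, 0, 1],
    ]

def transform(m, p):
    """Apply a 4x4 matrix to a 3D point (homogeneous w = 1)."""
    v = (p[0], p[1], p[2], 1.0)
    return tuple(sum(m[i][k] * v[k] for k in range(4)) for i in range(3))

# Assumed example: an 800 x 600 window whose lower-left corner is at (0, 0).
M = viewport_matrix(0, 0, 800, 600)
lower_left = transform(M, (-1, -1, -1))   # -> (0, 0, 0)
center = transform(M, (0, 0, 0))          # -> (400, 300, 0.5)
upper_right = transform(M, (1, 1, 1))     # -> (800, 600, 1)
```

The NDC corners land on the window corners, and the NDC depth range [-1, 1] is compressed to [0, 1].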

