Blog: machenshuang. Making. IO /

In July 2019, the aging feature of FaceApp, an app from Russia, went viral after a number of American politicians and celebrities used it, and it then ran into a privacy-related ban in the United States. Technically, FaceApp's aging is presumably based on machine learning. Today, I'm going to share an aging effect based on a traditional algorithm instead.

Let me show you the effect first, as shown in Figure 1.1:




Figure 1.1 Aging effects

Let's explore how this effect is implemented. Visually, it boils down to pasting a wrinkle texture onto the face in the user's image. The specific steps are as follows:

  1. Run face detection on image A to obtain its face key point set M.
  2. Annotate wrinkle image B with key points that match the features of point set M; call this point set N.
  3. Align point set N to M. There are several ways to do this, such as triangulation or the similarity transformation of the moving least squares method.
  4. Alignment yields a new vertex coordinate set V, a texture coordinate set T, and an index set I. Blending wrinkle image B over image A then produces effect image C.
  5. Mix image A and effect image C by opacity, so that the strength of the effect can be adjusted.
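
Step 5 amounts to a linear interpolation between the original image and the effect image. A minimal sketch, assuming pixels are RGB tuples with channels in the range [0, 1]; the name `mix_opacity` and the parameter `opacity` are illustrative and are not taken from the demo:

```python
def mix_opacity(original, effect, opacity):
    """Linearly interpolate two RGB pixels.

    opacity = 0 keeps the original pixel, opacity = 1 gives the full effect.
    """
    # Clamp so out-of-range slider values cannot overshoot.
    opacity = min(max(opacity, 0.0), 1.0)
    return tuple(a * (1.0 - opacity) + c * opacity
                 for a, c in zip(original, effect))
```

In a real renderer this would run per pixel on the GPU; the arithmetic is the same.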

Now that we’ve seen the key steps to achieve this effect, let’s take a closer look at each step.

I. Key point detection

Face detection is nothing new. The major vendors have released many face detection SDKs, some with 81 key points and some with 106. Well-known paid face key point SDKs currently include SenseTime, Megvii (Face++), ByteDance, and others; free face key point SDKs include OpenCV, Dlib, and Firebase MLVision from Google. This time the author uses MLVision for face key point detection. MLVision can detect 133 key points, including points on the forehead, as shown in Figure 2.1:




Figure 2.1 Google Keypoints sample diagram

For details on using Firebase MLVision, refer to Firebase's documentation, which can only be accessed through a VPN. Here, the author briefly summarizes the experience of using Firebase MLVision:

  1. Advantages:
  • The API is well designed and free to use, and it can detect forehead points, facial features, and so on;
  • The key points are stable, without much jitter.
  2. Disadvantages:
  • It can only detect a single face, and per-frame key point detection is slow compared with the paid vendors. For a 1080p image, MLVision needs 15~20 ms, while Face++ takes 6 ms and ByteDance takes 3 ms, so in theory MLVision is not suitable for real-time scenarios.
  • It lacks some other face-related data, such as part of the Euler angle data.

In addition to obtaining key point set M from user image A through face detection, wrinkle image B also has to be annotated, that is, the coordinates of the key points on image B are marked by hand. The number and features of these coordinates match point set M, and together they form point set N. The wrinkle image is shown in Figure 2.2:




Figure 2.2 Wrinkle mask

II. Key point alignment

There are many image deformation methods in the industry, both two-dimensional and three-dimensional; only two-dimensional deformation is used here. For two-dimensional image deformation there is the moving least squares (MLS) method, which comes in affine, similarity, and rigid variants. Briefly, the user specifies control points in the image and drives the deformation by dragging them. Let $p$ be the positions of the control points in the original image and $q$ their positions after dragging. For each pixel $v$ of the original image, moving least squares constructs an affine transformation $l_v(x)$ by minimizing the expression below, and the position of $v$ after deformation is computed through this transformation:

$$\min \sum_i w_i \left| l_v(p_i) - q_i \right|^2$$

The weights are $w_i = 1 / |p_i - v|^{2\alpha}$. The affine transformation consists of two parts, $l_v(x) = xM + T$, where $M$ is the linear transformation matrix and $T$ is the translation. Setting the partial derivative of the minimized expression with respect to $T$ to zero gives $T = q_* - p_* M$, where $p_* = \sum_i w_i p_i / \sum_i w_i$ and $q_* = \sum_i w_i q_i / \sum_i w_i$. So the affine transformation simplifies to $l_v(x) = (x - p_*)M + q_*$, and the minimization expression becomes:

$$\sum_i w_i \left| \hat{p}_i M - \hat{q}_i \right|^2$$

where $\hat{p}_i = p_i - p_*$ and $\hat{q}_i = q_i - q_*$.

The affine deformation is obtained by solving the minimization directly with the classical normal equations:

$$M = \Big( \sum_i \hat{p}_i^T w_i \hat{p}_i \Big)^{-1} \sum_j w_j \hat{p}_j^T \hat{q}_j$$

With this expression for the matrix $M$, we get the expression of the deformation:

$$f_a(v) = (v - p_*) \Big( \sum_i \hat{p}_i^T w_i \hat{p}_i \Big)^{-1} \sum_j w_j \hat{p}_j^T \hat{q}_j + q_*$$

Since the user deforms the image by moving $q$ while $p$ stays fixed, most of the terms in the formula above can be precomputed and cached to speed up the computation, and the deformation is rewritten as:

$$f_a(v) = \sum_j A_j \hat{q}_j + q_*, \qquad A_j = (v - p_*) \Big( \sum_i \hat{p}_i^T w_i \hat{p}_i \Big)^{-1} w_j \hat{p}_j^T$$
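
The affine variant of moving least squares described above can be sketched in plain Python. This is an illustrative sketch and not the demo's code: the function name `mls_affine` and its parameters are assumptions, it maps one point at a time without the precomputation, and it assumes at least three control points that are not all on one line.

```python
def mls_affine(v, p, q, alpha=1.0, eps=1e-12):
    """Map point v with the MLS affine deformation defined by control points p -> q."""
    vx, vy = v
    # Weights w_i = 1 / |p_i - v|^(2 * alpha).
    weights = []
    for (px, py), target in zip(p, q):
        d2 = (px - vx) ** 2 + (py - vy) ** 2
        if d2 < eps:
            # v sits on a control point: the deformation interpolates it exactly.
            return target
        weights.append(1.0 / d2 ** alpha)
    wsum = sum(weights)
    # Weighted centroids p* and q*.
    psx = sum(w * pt[0] for w, pt in zip(weights, p)) / wsum
    psy = sum(w * pt[1] for w, pt in zip(weights, p)) / wsum
    qsx = sum(w * pt[0] for w, pt in zip(weights, q)) / wsum
    qsy = sum(w * pt[1] for w, pt in zip(weights, q)) / wsum
    # Accumulate A = sum w_i p_hat_i^T p_hat_i and B = sum w_j p_hat_j^T q_hat_j.
    a = b = d = 0.0
    b00 = b01 = b10 = b11 = 0.0
    for w, (px, py), (qx, qy) in zip(weights, p, q):
        hx, hy = px - psx, py - psy
        gx, gy = qx - qsx, qy - qsy
        a += w * hx * hx
        b += w * hx * hy
        d += w * hy * hy
        b00 += w * hx * gx
        b01 += w * hx * gy
        b10 += w * hy * gx
        b11 += w * hy * gy
    # M = A^-1 B, with the 2x2 symmetric inverse written out by hand.
    det = a * d - b * b
    i00, i01, i11 = d / det, -b / det, a / det
    m00 = i00 * b00 + i01 * b10
    m01 = i00 * b01 + i01 * b11
    m10 = i01 * b00 + i11 * b10
    m11 = i01 * b01 + i11 * b11
    # f(v) = (v - p*) M + q*, with points treated as row vectors.
    dx, dy = vx - psx, vy - psy
    return (dx * m00 + dy * m10 + qsx, dx * m01 + dy * m11 + qsy)
```

A quick sanity check is that when the control points themselves move by an affine map, every other point follows the same map. For an image warp the function would be evaluated per mesh vertex or per pixel, which is where caching the terms that depend only on p pays off.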

The result of the deformation is shown in Figure 3.1:




Figure 3.1 MLS similarity transformation effect diagram

In this step, wrinkle image B is aligned to user image A according to point sets M and N. The result is shown in Figure 3.2:




Figure 3.2 Mask alignment diagram
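
The article does not spell out how the vertex set V, texture coordinate set T, and index set I are built. One common approach, shown here purely as an assumption, is to lay a regular grid over wrinkle image B, push every grid vertex through the deformation, and let the GPU interpolate the texture inside each triangle. The names `build_grid_mesh` and `warp` are hypothetical:

```python
def build_grid_mesh(width, height, cols, rows, warp=lambda pt: pt):
    """Build a triangle mesh over a width x height image.

    Returns (vertices, texcoords, indices):
      vertices  - warped pixel positions, one per grid point (set V)
      texcoords - normalized positions in the undeformed image (set T)
      indices   - vertex indices, three per triangle, two triangles per cell (set I)
    `warp` maps an (x, y) pixel position to its deformed position;
    the default leaves the grid undeformed.
    """
    vertices, texcoords, indices = [], [], []
    for r in range(rows + 1):
        for c in range(cols + 1):
            u, v = c / cols, r / rows
            texcoords.append((u, v))
            vertices.append(warp((u * width, v * height)))
    stride = cols + 1  # number of vertices in one grid row
    for r in range(rows):
        for c in range(cols):
            top_left = r * stride + c
            top_right = top_left + 1
            bottom_left = top_left + stride
            bottom_right = bottom_left + 1
            indices.extend([top_left, bottom_left, top_right,
                            top_right, bottom_left, bottom_right])
    return vertices, texcoords, indices
```

Passing the deformation from the previous step as `warp` would move the vertices while the texture coordinates stay fixed, which is what stretches the wrinkle texture onto the face. A finer grid follows the deformation more closely at the cost of more vertices.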

III. Overlay and blending

Once mask image B has been deformed, it needs to be blended over image A. There are many blend modes, including soft light, overlay, hard light, lighten, darken, and so on. For details, you can look at Photoshop; if you don't have Photoshop, you can look at mobile photo editing apps such as PicsArt and Meyi. The author suggests soft light or overlay blending. Overlay is used here; it gives basically the same result as soft light, but the aging effect looks better on people with darker skin. The formulas for the various blend modes are shown in Figure 4.1:




Figure 4.1 Blend mode formulas
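
As a rough illustration of the two suggested modes, the sketch below writes out the textbook per-channel formulas for overlay and soft light (the soft light version follows the W3C compositing definition). It assumes channels in [0, 1] and ignores alpha, so it is not a copy of the GPUImage3 shader; the function names are illustrative.

```python
import math

def overlay(base, blend):
    """Overlay: multiply in the shadows of the base, screen in its highlights."""
    if base < 0.5:
        return 2.0 * base * blend
    return 1.0 - 2.0 * (1.0 - base) * (1.0 - blend)

def soft_light(base, blend):
    """Soft light as defined in the W3C compositing specification."""
    if blend <= 0.5:
        return base - (1.0 - 2.0 * blend) * base * (1.0 - base)
    if base <= 0.25:
        d = ((16.0 * base - 12.0) * base + 4.0) * base
    else:
        d = math.sqrt(base)
    return base + (2.0 * blend - 1.0) * (d - base)

def blend_pixel(base_rgb, blend_rgb, mode=overlay):
    """Apply a blend mode channel by channel.

    base_rgb is the pixel of user image A, blend_rgb the pixel of the warped wrinkle image B.
    """
    return tuple(mode(b, s) for b, s in zip(base_rgb, blend_rgb))
```

In both modes a mid-gray wrinkle pixel (0.5) leaves the face unchanged, darker values darken it, and lighter values brighten it, which is why a wrinkle mask drawn on a neutral gray background only affects the skin where the lines are.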

Here, the author uses the overlay blend shader from GPUImage3, which renders on the GPU through Metal. Metal is Apple's dedicated API for GPU rendering and computation going forward. It covers both graphics and compute, and it is low-level and low-overhead with hardware acceleration, similar to having the functionality of OpenGL and OpenCL in a single API. It is available from iOS 8, but in practice it is safer to target iOS 11 and above because of hardware acceleration support. Apple also provides MetalKit, which coexists with Metal on iOS. MetalKit adds a few conveniences on top of Metal, such as an easier texture loading API, efficient Model I/O integration, MTKView, and so on. On iOS, MetalKit is basically what gets used.

After rendering with Metal, the result is shown in Figure 4.2:




Figure 4.2 Aging process diagram

That basically wraps up the implementation of the aging effect. The demo can be downloaded directly from GitHub.

References

  1. Cartoon image deformation algorithm (Moving Least Squares) attached source code
  2. Firebase face detection guide document