This is the fourth day of my participation in the November Gwen Challenge. See the event details: The Last Gwen Challenge 2021

AI face swapping is becoming more and more popular; face-swap videos are everywhere. There are several face-swapping apps out there, but I used DeepFaceLab, a deep-learning tool that can replace a face in a video with one you choose. It sounds cool, but it depends heavily on your computer's hardware, especially the graphics card. With GPU prices fluctuating so much recently, I couldn't afford a new one, so I ran a simple example on my gaming laptop (GTX 1060).

The face swap is powered by Generative Adversarial Networks (GANs), an AI approach rooted in the game theory pioneered by von Neumann, but I won't go into the details here.

Download

  1. Author's GitHub: github.com/iperov/Deep…

Current version

  • DirectX12, AMD: supports DirectX 12 and AMD graphics cards
  • GTX10 and 20: supports NVIDIA 10-series and 20-series cards
  • RTX30: supports NVIDIA 3000-series cards

Installation

  1. Download the version that matches your hardware (my test machine has a GTX 1060)

  2. Extract the downloaded archive

Directory contents

  1. ==_internal== stores the program source and dependencies

  2. ==workspace== the working directory

  3. ==.bat== batch script files

AI face-swap example

Note: this example uses my own videos

General steps:

  1. Convert the source video to images (source images)
  2. Convert the target video to images (target images)
  3. Extract source face information from the source images
  4. Extract target face information from the target images
  5. Train a model using the face information
  6. Use the model to swap faces in the images
  7. Export the composited face-swapped video

Step 1: Convert the source video to images

  1. ==extract images from video data_src.bat==

    1. [0] How many frames per second ( ?:help ) : 10
      • This asks how many frames to extract per second. FPS (frames per second) is the video's frame rate; common frame rates are 24, 30, and 60, i.e. that many pictures per second.
      • Entering 10 extracts 10 images per second; pressing Enter accepts the default of 30, i.e. 30 images extracted per second.
    2. [png] Output image format ( png/jpg ?:help ) : jpg
      • The extracted images can be saved as PNG or JPG.
      • PNG is a bitmap format with lossless compression.
      • JPG uses lossy compression, discarding redundant image data within an acceptable range of distortion.
  2. When the program finishes, the images extracted from the source video appear in the ==workspace\data_src== directory.
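To see what the FPS setting does, here is a small sketch in plain Python (not DeepFaceLab's actual code; the function name is mine) of which frames of a 30 fps video survive when extracting at 10 fps:

```python
def frames_to_keep(video_fps, extract_fps, total_frames):
    """Indices of the frames kept when sampling extract_fps images
    per second out of a video_fps video."""
    step = video_fps / extract_fps      # keep one frame every `step` frames
    kept, next_pick = [], 0.0
    for i in range(total_frames):
        if i >= next_pick:
            kept.append(i)
            next_pick += step
    return kept

# From a 1-second clip at 30 fps, extracting at 10 fps keeps 10 frames:
print(frames_to_keep(30, 10, 30))  # → [0, 3, 6, 9, 12, 15, 18, 21, 24, 27]
```

At the default of 30, every frame of a 30 fps video is kept, which is why a lower value shrinks the dataset.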

Step 2: Convert the target video to images

  1. ==extract images from video data_dst FULL FPS.bat==

    Same as Step 1, except the target video (data_dst) is extracted at full frame rate so that no frames are dropped; you only need to set the image format

  2. When the program finishes, the images extracted from the target video appear in the ==workspace\data_dst== directory.

Step 3: Extract source face information from the source images

  1. ==data_src faceset extract.bat==

    There are 6 parameters; generally just press Enter to accept the defaults. The first two steps are fast, but this one takes a while, depending on your hardware. Once extraction starts, a progress bar appears at the bottom; when it reaches 100%, extraction is complete and the number of images and extracted faces is displayed.

  2. When the process completes, the face images extracted from the source images appear in the ==workspace\data_src\aligned== directory.

Step 4: Extract target face information from the target images

  1. ==data_dst faceset extract.bat==

    Same as Step 3. After extraction begins, a progress bar appears at the bottom; when it reaches 100%, extraction is complete.

  2. In the ==workspace\data_dst== directory, besides the frames from the target video, two new folders appear: [aligned | aligned_debug].

    1. The ==workspace\data_dst\aligned== directory stores the faces extracted from the target images.

      After extraction, screen the results: a frame sometimes contains more than one person, so for accuracy, remove any extra or blurry faces.

    2. ==workspace\data_dst\aligned_debug== stores the images annotated with detection marks

      • Red marks the cropped head area
      • Blue marks the face area
      • Green marks the facial contour (the key landmarks)
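DeepFaceLab leaves the screening to you, but a common heuristic for spotting blurry crops is the variance of the Laplacian: sharp images have strong edges and score high, blurry ones score low. Below is a hand-rolled sketch of that idea in plain Python (in practice you would run `cv2.Laplacian` over the real JPGs; this is just to show the principle, and the function name is mine):

```python
def laplacian_variance(img):
    """Variance of the 4-neighbour Laplacian of a 2-D grayscale image
    (a list of rows). High variance = strong edges = likely sharp."""
    h, w = len(img), len(img[0])
    vals = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            lap = (img[y-1][x] + img[y+1][x] + img[y][x-1] + img[y][x+1]
                   - 4 * img[y][x])
            vals.append(lap)
    mean = sum(vals) / len(vals)
    return sum((v - mean) ** 2 for v in vals) / len(vals)

# A hard edge (sharp) versus a smooth gradient (blurry):
sharp  = [[0, 0, 255, 255]] * 4
blurry = [[0, 85, 170, 255]] * 4
assert laplacian_variance(sharp) > laplacian_variance(blurry)
```

Crops scoring below a threshold you pick empirically can simply be deleted from the aligned folder before training.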

Step 5: Train a model using the face information

  1. Double-click the batch file ==train 96.bat==

    Enter a model name and select a device, generally the graphics card. Once selected, the program automatically initializes the model, loads the samples, and displays the model parameters.

    Wait for startup to finish; a preview window then opens automatically and the following values are shown in the terminal

    Focus mainly on the iteration count and the SRC and DST losses

    • The more iterations, the better
    • The lower the SRC and DST losses, the better

    The preview window contains operating tips, the loss curves, and the face area.

    The face area is divided into five columns: [source face | generated | target face | generated | generated (swapped result)]

    As training iterations accumulate, the algorithm gradually forms the facial contour and features, which then slowly sharpen.

  2. In the preview window, use the keyboard (switch to an English input method): P to refresh, Enter to stop, S to save.

    We just need to check that the second column closely matches the first, the fourth closely matches the third, and the expressions in the fifth column closely match those in the fourth.

    When every column is sharp enough, you can stop.
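The loop above — iterate, watch the loss fall, save periodically — can be illustrated with a toy stand-in (this is not DeepFaceLab's autoencoder; it just fits a single parameter by gradient descent to show why more iterations mean lower loss):

```python
def train(iterations, lr=0.1, save_every=100):
    """Toy stand-in for the training loop: fit one parameter w toward a
    target, logging the squared error like the trainer's loss read-out."""
    w, target = 0.0, 1.0
    history, checkpoints = [], []
    for it in range(1, iterations + 1):
        loss = (w - target) ** 2          # reconstruction-style squared error
        w -= lr * 2 * (w - target)        # one gradient-descent step
        history.append(loss)
        if it % save_every == 0:
            checkpoints.append(it)        # analogous to pressing S to save
    return history, checkpoints

hist, saves = train(300)
print(f"loss: {hist[0]:.4f} -> {hist[-1]:.2e}, saved at iterations {saves}")
```

The real model has millions of parameters and two losses (SRC and DST), but the shape of the process is the same: the loss only approaches zero asymptotically, which is why you stop on visual quality rather than waiting for an exact zero.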

Step 6: Use the model to swap faces in the images

  1. ==merge quick96.bat==

    Select the model and the device, choose interactive merging, set the CPU thread count (usually 8 or fewer), and press Enter.

    After a moment, a help screen pops up showing the shortcut keys and the functions available to us.

  2. Press Tab to open the merge preview window

  3. Adjust with the shortcut keys

    • Press E to increase feathering, D to decrease feathering

    • One key switches to the previous frame and another to the next frame (see the help screen)

    • When you're done, press Shift + ? to apply the current parameters to all frames

    • Press Shift + > to start automatic merging

  4. When the merge progress reaches 100%, merging is complete; close the window manually

  5. Two new folders appear in the ==workspace\data_dst== directory: [merged | merged_mask]

    The merged directory stores the face-swapped frames
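Feathering (the E/D keys) softens the edge of the face mask so the pasted face fades into the target frame instead of showing a hard seam. A 1-D sketch of the idea in plain Python (the helper names are mine; the real merger works on 2-D masks):

```python
def feather(mask, passes=1):
    """Soften a binary 0/1 mask by neighbour-averaging; more passes
    correspond to a wider, softer transition (more feathering)."""
    for _ in range(passes):
        mask = [
            (mask[max(i - 1, 0)] + mask[i] + mask[min(i + 1, len(mask) - 1)]) / 3
            for i in range(len(mask))
        ]
    return mask

def blend(src, dst, mask):
    """Composite the swapped face over the target frame:
    mask = 1 -> pure src pixel, mask = 0 -> pure dst pixel."""
    return [s * m + d * (1 - m) for s, d, m in zip(src, dst, mask)]

hard = [0, 0, 1, 1, 1, 0, 0]      # hard-edged face mask along one pixel row
soft = feather(hard)              # edges become fractional
src = [200] * 7                   # swapped-face pixel row (toy values)
dst = [50] * 7                    # original frame pixel row
print([round(p) for p in blend(src, dst, soft)])  # → [50, 100, 150, 200, 150, 100, 50]
```

With the hard mask the output would jump straight from 50 to 200; after feathering, the intermediate values 100 and 150 make the transition gradual, which is exactly what increasing feathering does at the face boundary.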

Step 7: Export the composited face-swapped video

  1. ==merged to mp4.bat==

    The tool automatically reads the source video's configuration, such as frame rate and audio track; we only need to set the output bitrate, generally 3.

  2. Wait for the synthesis to complete

  3. The resulting video (result.mp4) appears in the ==workspace== directory (result_mask.mp4 is the mask video, kept for later use).

  4. Compare the result with the original video
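On the bitrate value: assuming the prompt's "3" means 3 Mbit/s (a common unit for video export tools), you can estimate the output file size with simple arithmetic — bitrate times duration, divided by 8 to convert bits to bytes:

```python
def estimated_size_mb(bitrate_mbit_s, duration_s):
    """Rough video stream size: bitrate (Mbit/s) x duration (s) -> megabytes."""
    return bitrate_mbit_s * duration_s / 8  # 8 bits per byte

# A 60-second clip exported at 3 Mbit/s is roughly 22.5 MB (video stream only):
print(estimated_size_mb(3, 60))  # → 22.5
```

A higher bitrate gives better quality and a proportionally larger file; audio adds a little on top of this estimate.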


Technology itself is neither good nor bad; what matters is how it is used!