Li Zi, reporting from Aofeisi | QbitAI (WeChat official account: QbitAI)

How do you make a little sister your own?

Turn her into a two-dimensional human, and you unlock a wife.

NCSOFT, a South Korean gaming company, recently open-sourced a sophisticated AI.

Just type in a girl's selfie, and you get her anime (2D) counterpart:

Compared with the original photo, she still feels like herself.

The look, the smile: everything from her real-world appearance carries over unchanged.

Of course, if you have a favorite two-dimensional wife and want to see what she would look like in reality, that’s fine. Just type in a picture of her:

You get a lifelike little sister.

The algorithm is called U-GAT-IT, and the name alone is catchy. Importantly, it is trained in an unsupervised way: no paired data is needed.

The team has put both a TensorFlow implementation and a PyTorch implementation on GitHub. The two projects made the trending list together, and the TF project once topped it.

Before digging in, take a look at the AI behind these generous benefits:

This GAN has a different focus

U-GAT-IT is an image-to-image translation algorithm composed of two GANs.

One GAN turns the girl's selfie into an anime girl. This is translation from the source domain to the target domain.

The other GAN turns the anime girl back into a real-world selfie. This is translation from the target domain back to the source domain.

Thus, there are two sets of generator & discriminator combinations.

The generator is responsible for producing realistic fake images to fool the discriminator; the discriminator is responsible for catching the fakes. The two sharpen each other as they train.
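As a concrete sketch of that tug-of-war, here is a least-squares (LSGAN-style) adversarial objective in plain NumPy. The function names are illustrative, not from the released repos, and the least-squares form is one common choice rather than a claim about every detail of the paper:

```python
import numpy as np

def lsgan_d_loss(d_real, d_fake):
    # Discriminator: push scores on real images toward 1 and on fakes toward 0.
    return float(np.mean((d_real - 1) ** 2) + np.mean(d_fake ** 2))

def lsgan_g_loss(d_fake):
    # Generator: fool the discriminator into scoring its fakes as 1.
    return float(np.mean((d_fake - 1) ** 2))
```

When the discriminator is perfect (reals scored 1, fakes scored 0), its loss is zero and the generator's loss is maximal; training pushes each side against the other.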

To generate more realistic images, the team added attention modules to all four parts: both generators and both discriminators.

The specific method was inspired by the 2016 CAM work from Zhou Bolei's team.

CAM is short for Class Activation Map. It identifies the regions that matter most for deciding whether an image is real or fake, so the AI can focus its attention there.

Apart from the upsampling step, the original CAM relies on global average pooling; U-GAT-IT combines global average pooling with global max pooling for better results.
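A minimal NumPy sketch of that idea: pool the feature maps both ways, score the image with learned per-channel weights, and reuse those weights to build an attention map over the important regions. Names and shapes here are illustrative assumptions, not the repo's API:

```python
import numpy as np

def cam_attention(features, w_avg, w_max):
    """CAM-style auxiliary classifier combining average and max pooling.

    features: (C, H, W) feature maps; w_avg, w_max: (C,) learned channel weights.
    Returns a scalar logit and an (H, W) attention map.
    """
    gap = features.mean(axis=(1, 2))          # global average pooling -> (C,)
    gmp = features.max(axis=(1, 2))           # global max pooling     -> (C,)
    logit = float(gap @ w_avg + gmp @ w_max)  # classification score

    # Attention map: each channel weighted by its importance to the logit,
    # then summed over channels, as in the original CAM construction.
    amap = ((w_avg + w_max)[:, None, None] * features).sum(axis=0)
    return logit, amap
```

The attention map highlights which spatial locations drove the classifier's decision; the generator can then concentrate its changes there.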

Take the first GAN, the one that generates the anime girl, and start with its discriminator:

It judges whether an image belongs to the same category as the anime girls in the dataset. If not, the image must have come from the generator.

The discriminator has an auxiliary classifier (CAM) that finds regions that are more important for category judgment.

This also directs the generator to focus on the important areas.

Now look at the generator:

Its auxiliary classifier finds the regions that matter most in the real-world (source) domain. By comparing the important regions of the two domains, the attention module then knows where to focus the generator's attention.

The second GAN does the same thing, just in the opposite direction.

To tie the two GANs together, the loss function is also carefully designed:

The loss function has four parts

The first is the adversarial loss, which needs little explanation; every GAN has one.

The second is the cycle loss, which keeps the generator and discriminator from settling into mode collapse after reaching some kind of equilibrium.

A cycle consistency constraint is applied to the generator: an image translated to the target domain must be translatable back to the source domain.
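The round-trip constraint can be written as a one-line L1 penalty. This is a generic cycle-consistency sketch (as popularized by CycleGAN) with made-up function names, not code from the U-GAT-IT repos:

```python
import numpy as np

def cycle_loss(x, g_s2t, g_t2s):
    # Translate source -> target -> source; the round trip should reconstruct x.
    # g_s2t and g_t2s stand in for the two generators.
    return float(np.abs(g_t2s(g_s2t(x)) - x).mean())  # L1 reconstruction error
```

If the two generators are exact inverses of each other, the loss is zero; any information lost on the round trip shows up as reconstruction error.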

Third, the identity loss. To keep the color distributions of the input and output images similar, an identity consistency constraint is applied to the generator.

Specifically, if you take an image that already belongs to the target domain and run it through the source-to-target generator, nothing should change.
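That constraint is equally compact: feed a target-domain image through the generator and penalize any change. Again a sketch with hypothetical names, not the repos' API:

```python
import numpy as np

def identity_loss(y_target, g_s2t):
    # An image already in the target domain, pushed through the
    # source->target generator, should come out (approximately) unchanged.
    return float(np.abs(g_s2t(y_target) - y_target).mean())  # L1 penalty
```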

The fourth is the CAM loss. Given the class activation maps, the generator and discriminator know where they need to improve; in other words, where the biggest differences between the two domains currently lie.
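Putting the four parts together, the full objective is a weighted sum. The weights below follow values commonly reported for U-GAT-IT, but treat both the weights and the function name as assumptions to tune rather than fixed constants:

```python
def ugatit_total_loss(adv, cyc, idt, cam,
                      w_adv=1.0, w_cyc=10.0, w_idt=10.0, w_cam=1000.0):
    # Weighted sum of the four scalar loss terms described above:
    # adversarial, cycle-consistency, identity, and CAM losses.
    return w_adv * adv + w_cyc * cyc + w_idt * idt + w_cam * cam
```

The heavy weight on the CAM term reflects how central the attention signal is to the method.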

In addition, U-GAT-IT makes another important contribution:

AdaLIN: normalization, chosen on the fly

In general, Instance Normalization (IN) is the more commonly used normalization method here: it normalizes an image's feature statistics to remove style variation.

By contrast, Batch Normalization (BN) and Layer Normalization (LN) are used less often.

And when normalizing images for style transfer, adaptive IN, or AdaIN for short, is the more common choice.

But here, the team came up with AdaLIN, which dynamically chooses between IN and LN.

With it, the AI can flexibly control how much shape and texture change.
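The core of AdaLIN is an interpolation between the two normalizations, with a learned ratio deciding how much of each to use. A minimal NumPy sketch for a single (C, H, W) feature tensor; the parameter names are illustrative:

```python
import numpy as np

def adalin(x, rho, gamma, beta, eps=1e-5):
    """AdaLIN sketch: interpolate between Instance Norm and Layer Norm.

    x: (C, H, W) features. rho in [0, 1] is learned: rho=1 is pure IN,
    rho=0 is pure LN. gamma/beta are the adaptive scale and shift.
    """
    # Instance Norm: normalize each channel over its spatial dimensions.
    mu_in = x.mean(axis=(1, 2), keepdims=True)
    var_in = x.var(axis=(1, 2), keepdims=True)
    x_in = (x - mu_in) / np.sqrt(var_in + eps)

    # Layer Norm: normalize over all channels and spatial positions at once.
    mu_ln = x.mean(keepdims=True)
    var_ln = x.var(keepdims=True)
    x_ln = (x - mu_ln) / np.sqrt(var_ln + eps)

    # Learned blend of the two, then the adaptive affine transform.
    mixed = rho * x_in + (1 - rho) * x_ln
    return gamma * mixed + beta
```

Because rho is learned per layer, the network itself decides where IN's style removal helps and where LN's whole-layer statistics preserve shape better.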

Previous attention-based models failed to handle geometric variation between domains;

U-GAT-IT, however, can handle translations that require holistic changes as well as large shape changes.

Finally, let’s talk about data sets.

Unsupervised: no paired data

Selfie2anime actually comprises two datasets.

One is a selfie dataset; the other is an anime-face dataset. Both keep only the girls.

There are 3,400 images in the training set and 100 in the test set, with no pairing between them.

And that is not all: there are also horse-to-zebra, cat-to-dog, photo-to-Van-Gogh-style tasks and more, each training a different capability.

Take a look at the results:

Far better than its predecessors

U-GAT-IT (b) was pitted against many formidable predecessors:

CycleGAN (c), UNIT (d), MUNIT (e), DRIT (f).

△ In the fourth row, photos become portraits; in the fifth row, photos take on Van Gogh's painting style

The reverse direction also works: anime back to real-world (2D to 3D), zebra to horse, and so on:

Column (b) is the protagonist of this paper. Its performance is clearly better than all predecessors on the cross-dimension task, and its generations come out ahead on the other tasks as well.

Then, let’s see if the attention module (CAM) really works.

The two right-hand columns show the difference: (e) with attention, (f) without:

Finally, observe the effect of AdaLIN, which chooses the normalization method dynamically, against baselines that cannot choose.

(b) is AdaLIN; the four columns to its right are the also-ran normalization methods (and various combinations of them):

The AdaLIN results are more complete and less flawed.

All told, U-GAT-IT succeeds on every front.

If that thrills you, go grab the open-source code.

This is the TensorFlow version, which once topped the trending list (currently third): https://github.com/taki0112/UGATIT

This is the PyTorch version: https://github.com/znxlwm/UGATIT-pytorch

This is the paper: https://arxiv.org/abs/1907.10830