These AI algorithms are so strong, I have a bold idea

I haven’t recommended any interesting algorithms for a long time. Today, Jack will show you around the “AI Paradise” to see what new and interesting AI algorithms have been released recently.

I. Depth estimation

You may have noticed that a lot of the latest computer vision research is 3D-related.

Depth estimation is one of them.

A new study from Facebook can reliably estimate the depth of an image based on a sequence of video frames.

With per-pixel depth information, you can create many interesting video effects:

Flooding the scene with water (the "water overflowing Jinshan" effect), snow swirling across the sky, and stars circling around a subject.
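Why does depth make these effects possible? Once every pixel has a depth value, an effect particle such as a snowflake can be drawn only where it is closer to the camera than the scene, so it disappears behind nearby objects. Here is a minimal sketch of that depth test on a tiny grid. The function name `composite_with_depth`, the toy data, and the "smaller value means closer" convention are all illustrative assumptions, not the paper's implementation.

```python
from typing import List, Optional

Grid = List[List[float]]


def composite_with_depth(
    scene: List[List[str]],
    scene_depth: Grid,
    effect: List[List[Optional[str]]],
    effect_depth: Grid,
) -> List[List[str]]:
    """Overlay effect pixels only where they are closer than the scene.

    Assumes smaller depth values mean closer to the camera.
    """
    out = []
    for y, row in enumerate(scene):
        out_row = []
        for x, pixel in enumerate(row):
            particle = effect[y][x]
            # Draw the particle only if one exists here and it is in front.
            if particle is not None and effect_depth[y][x] < scene_depth[y][x]:
                out_row.append(particle)
            else:
                out_row.append(pixel)
        out.append(out_row)
    return out


# Toy 2x3 frame: "P" is a nearby person (depth 1.0), "." is far background (depth 10.0).
scene = [[".", "P", "."],
         [".", "P", "."]]
scene_depth = [[10.0, 1.0, 10.0],
               [10.0, 1.0, 10.0]]
# Snowflakes "*" at depth 5.0: in front of the background, behind the person.
effect = [["*", "*", None],
          [None, "*", "*"]]
effect_depth = [[5.0, 5.0, 5.0],
                [5.0, 5.0, 5.0]]

result = composite_with_depth(scene, scene_depth, effect, effect_depth)
for row in result:
    print("".join(row))
```

The snowflakes show up over the background but are hidden where the person stands. Without a stable depth map for every frame, this kind of occlusion would flicker from frame to frame.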

The paper proposes an algorithm that reconstructs dense, geometrically consistent depth for every pixel in a monocular video.

Compared with previous monocular depth estimation methods, it is more accurate and its results are more stable.

The code is open source, so go play with it!

Project address: github.com/facebookres…

II. Wav2Lip

AI can also give a boost to the world of guichu (meme remix) videos.

Wav2Lip, as the name suggests, turns WAV audio into lip movements.

Put simply: give the algorithm an audio file, and it makes the character in the video mouth the words naturally.

It works for any identity, even cartoon characters, and for any voice and language, lip-syncing the video to any target speech with high accuracy.

The authors even built a web demo where beginners can try it by uploading audio and video.

Web page address: bhaasha.iiit.ac.in/lipsync/

If you have some programming skills, go to GitHub and download the source code. All you need is a photo or video of the person you want to remix, plus a passage of speech, to create the video you want.
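For a rough idea of what running the downloaded code looks like, here is a small sketch that only builds the inference command and prints it. The script name and the three flags follow the usage shown in the Wav2Lip README as I remember it, so check them against the repository before relying on this. The helper `build_wav2lip_command` and all file paths are hypothetical placeholders.

```python
import shlex
from typing import List


def build_wav2lip_command(checkpoint: str, face: str, audio: str) -> List[str]:
    """Build (but do not run) a Wav2Lip inference command.

    Flag names are taken from the project's README usage line; verify them
    against the repository version you actually download.
    """
    return [
        "python", "inference.py",
        "--checkpoint_path", checkpoint,  # pretrained model weights
        "--face", face,                   # photo or video of the target person
        "--audio", audio,                 # speech the lips should follow
    ]


# All three paths below are hypothetical placeholders.
cmd = build_wav2lip_command(
    checkpoint="checkpoints/wav2lip_gan.pth",
    face="inputs/target_face.mp4",
    audio="inputs/speech.wav",
)
print(shlex.join(cmd))
```

Copy the printed command into a terminal inside the cloned repository once the paths point at real files.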

For example, a "Trump declares his love for China" speech. Feeling inspired yet?

Oh, my God, I have so many bold ideas in my head.

Where there is Wav2Lip, there is also Lip2Wav.

We can think of it as AI lip-reading.

If a video has lost its sound, Lip2Wav can generate it for you.

It "lip-reads" the mouth movements in the frames and produces the matching audio.

It is worth noting that Lip2Wav's output does not have the mechanical, robotic tone common in Bilibili videos.

The result is so good that it hardly sounds like a machine at all; it sounds like a real person speaking.

The algorithm encodes the lip movements from facial features and uses an LSTM to synthesize the audio.
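To make "an LSTM for audio synthesis" a bit more concrete, here is a toy sketch of the data flow: one lip feature vector per video frame goes in, and one audio frame per step comes out. This is not the Lip2Wav model. The weights are random and untrained, bias terms are left out, the feature sizes are made up, and using the hidden state as a stand-in for a spectrogram frame is a simplification for illustration.

```python
import math
import random
from typing import List, Tuple

Vector = List[float]
Matrix = List[Vector]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def matvec(m: Matrix, v: Vector) -> Vector:
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def random_matrix(rows: int, cols: int, rng: random.Random) -> Matrix:
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class LSTMCell:
    """A single LSTM cell with untrained, randomly initialised weights (no biases)."""

    def __init__(self, input_size: int, hidden_size: int, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.hidden_size = hidden_size
        # One weight matrix per gate, applied to the concatenation [x, h].
        self.w = {
            gate: random_matrix(hidden_size, input_size + hidden_size, rng)
            for gate in ("input", "forget", "cell", "output")
        }

    def step(self, x: Vector, h: Vector, c: Vector) -> Tuple[Vector, Vector]:
        xh = x + h
        i = [sigmoid(v) for v in matvec(self.w["input"], xh)]
        f = [sigmoid(v) for v in matvec(self.w["forget"], xh)]
        g = [math.tanh(v) for v in matvec(self.w["cell"], xh)]
        o = [sigmoid(v) for v in matvec(self.w["output"], xh)]
        # New cell state: keep part of the old memory, add part of the candidate.
        c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
        return h_new, c_new


def synthesize(lip_features: List[Vector], cell: LSTMCell) -> List[Vector]:
    """Map a sequence of per-frame lip features to a sequence of audio frames."""
    h = [0.0] * cell.hidden_size
    c = [0.0] * cell.hidden_size
    audio_frames = []
    for x in lip_features:
        h, c = cell.step(x, h, c)
        audio_frames.append(h)  # stand-in for one spectrogram frame
    return audio_frames


# Hypothetical input: 5 video frames, each encoded as a 4-dimensional lip feature.
rng = random.Random(42)
lip_features = [[rng.random() for _ in range(4)] for _ in range(5)]
cell = LSTMCell(input_size=4, hidden_size=3)
audio_frames = synthesize(lip_features, cell)
print(len(audio_frames), len(audio_frames[0]))
```

The point is that the cell carries state from one frame to the next, so each audio frame depends on the lip movements seen so far rather than on a single image.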

Both Wav2Lip and Lip2Wav are open source.

Go ahead and take your pick.

Wav2Lip project address: github.com/Rudrabha/Wa…

Lip2Wav project address: github.com/Rudrabha/Li…

III. HiFiC

If you have never built a website, you may not know this:

Image size has a huge effect on how fast a page loads.

Too many images, or images that are too large, make your page load painfully slowly, like an old lady nibbling on a biscuit.
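A quick back-of-the-envelope calculation shows how much image size matters. The sketch below only divides bits by bandwidth. It ignores latency, parallel connections, and caching, and the page and connection numbers are made up for illustration.

```python
def transfer_time_seconds(total_bytes: int, bandwidth_mbps: float) -> float:
    """Time to transfer total_bytes over a link of bandwidth_mbps megabits per second."""
    bits = total_bytes * 8
    return bits / (bandwidth_mbps * 1_000_000)


# Hypothetical page: 20 images of 2 MB each on a 10 Mbit/s connection.
image_count = 20
bytes_per_image = 2 * 1_000_000
bandwidth_mbps = 10.0

original = transfer_time_seconds(image_count * bytes_per_image, bandwidth_mbps)
# The same page if every image were compressed to a quarter of its size.
compressed = transfer_time_seconds(image_count * bytes_per_image // 4, bandwidth_mbps)

print(f"original: {original:.1f} s, compressed: {compressed:.1f} s")
```

Under these assumptions, shrinking every image to a quarter of its size cuts the transfer time by the same factor.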

Still struggling with image loading?

The latest good news is that a Google team has proposed HiFiC, an image compression method that combines GANs with learned, neural-network-based compression and can reconstruct images with high fidelity even at low bitrates, that is, under heavy compression.

Here is how the algorithm compares with JPG.

On the left is the HiFiC result; on the right is a JPG of the same file size.
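Comparisons at the same file size are usually expressed in bits per pixel (bpp): total bits in the file divided by the number of pixels. Here is a small sketch of that arithmetic. The image dimensions and file size are made-up numbers, not figures from the HiFiC project.

```python
def bits_per_pixel(file_size_bytes: int, width: int, height: int) -> float:
    """Average number of bits the file spends on each pixel."""
    return file_size_bytes * 8 / (width * height)


def compression_ratio(file_size_bytes: int, width: int, height: int) -> float:
    """Size of the raw 24-bit RGB image divided by the compressed file size."""
    raw_bytes = width * height * 3
    return raw_bytes / file_size_bytes


# Hypothetical example: a 768x512 image stored in 24,576 bytes.
width, height = 768, 512
file_size = 24_576

bpp = bits_per_pixel(file_size, width, height)
ratio = compression_ratio(file_size, width, height)
print(f"{bpp} bpp, {ratio}x smaller than raw RGB")
```

Two files of the same size for the same image have the same bpp, so any difference in how they look comes down to how well each codec spends those bits.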

It is very obvious that the image compressed by HiFiC is much clearer.

The code is not open source yet, but the authors say, "Soon, soon, definitely this time!"

If you want to try it, head straight to the project page!

Project address: hific.github.io/

IV. Closing

By the time I finished this article it was already 1:00 in the morning, and all that grinding has left my head aching.

How about the full triple: a like, a comment, and a share? Definitely this time!
