What Is a GAN?

A GAN (generative adversarial network) is a class of machine learning techniques built from two models trained simultaneously: a generator, trained to produce fake data, and a discriminator, trained to tell fake data apart from real data.

The word generative indicates the overall goal of the model: generating new data. What a GAN learns to generate depends on the training set chosen. For example, if we want a GAN to synthesize a painting that looks like Da Vinci’s work, we use Da Vinci’s works as the training set.

The term adversarial refers to the dynamic, game-like competition between the two models that make up the GAN framework: the generator and the discriminator. The generator’s goal is to produce fake data that is indistinguishable from the real data in the training set; in our example, this means creating paintings that look like Leonardo’s. The discriminator’s goal is to distinguish real data from the training set from fake data produced by the generator; it acts like an expert art appraiser assessing the authenticity of works believed to be Leonardo’s. The two networks are locked in a constant contest, each trying to outwit the other: the more realistic the generator’s fakes become, the better the discriminator must be at telling real from fake.

The term network refers to the class of machine learning models most commonly used for the generator and the discriminator: neural networks. Depending on the complexity of the GAN implementation, these range from simple feedforward neural networks (Chapter 3) to convolutional neural networks (Chapter 4) to more complex variants such as the U-Net (Chapter 9).
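Stripped to their essentials, the two networks are just two functions with trainable weights. Below is a deliberately tiny NumPy sketch (the layer sizes and random, untrained weights are illustrative assumptions, not a real GAN from this book) showing the two roles: the generator turns noise into a fake sample, and the discriminator scores any sample’s probability of being real.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration
NOISE_DIM, DATA_DIM, HIDDEN = 8, 4, 16

# Generator weights: map a random noise vector to a fake data sample
G_w1 = rng.normal(0, 0.1, (NOISE_DIM, HIDDEN))
G_w2 = rng.normal(0, 0.1, (HIDDEN, DATA_DIM))

def generate(z):
    """Noise in, fake sample out (tanh keeps outputs in [-1, 1])."""
    return np.tanh(np.maximum(z @ G_w1, 0) @ G_w2)

# Discriminator weights: map a data sample to P(sample is real)
D_w1 = rng.normal(0, 0.1, (DATA_DIM, HIDDEN))
D_w2 = rng.normal(0, 0.1, (HIDDEN, 1))

def discriminate(x):
    """Sample in, probability of 'real' out (sigmoid squashes to (0, 1))."""
    return 1.0 / (1.0 + np.exp(-(np.maximum(x @ D_w1, 0) @ D_w2)))

z = rng.normal(size=(1, NOISE_DIM))   # the generator's only input: noise
fake = generate(z)                    # shape (1, DATA_DIM)
p_real = discriminate(fake)           # shape (1, 1), a value in (0, 1)
```

With trained weights, `generate` would produce convincing samples and `discriminate` would score them; here the untrained versions only demonstrate the shapes and roles involved.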

How Do GANs Work?

The mathematics underpinning GANs is complex (we’ll focus on it in the coming chapters, especially Chapters 3 and 5); fortunately, many real-world analogies make GANs easier to understand. Earlier we discussed the example of an art forger (the generator) trying to fool an art appraiser (the discriminator). The more convincing the forger’s fakes become, the better the appraiser must be at detecting them. The reverse is also true: the better the appraiser is at judging whether a painting is genuine, the more the forger must improve his technique to avoid being caught.

Another metaphor often used to describe GANs (an example favored by Ian Goodfellow) is a counterfeiter (the generator) and the detective trying to catch him (the discriminator): the more authentic the counterfeit bills look, the better the detective must be at spotting them, and vice versa.

In more technical terms, the goal of the generator is to produce samples that capture the characteristics of the training set so well that they are indistinguishable from the training data. The generator can be thought of as an object-recognition model in reverse: an object-recognition algorithm learns the patterns in images in order to recognize their content, whereas the generator, instead of recognizing those patterns, learns to create them from scratch. Indeed, the generator’s input is typically nothing more than a vector of random numbers.
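That “recognition in reverse” idea can be sketched in a few lines. Here we assume, purely for illustration, a 100-number random input vector and a 28 × 28 output image, with an untrained random matrix standing in for the generator’s learned weights:

```python
import numpy as np

rng = np.random.default_rng(42)

# A classifier maps a 28x28 image *down* to class scores;
# a generator runs the other way: a random vector *up* to a 28x28 image.
z = rng.normal(size=100)                 # the generator's only input: noise
W = rng.normal(0, 0.02, (100, 28 * 28))  # stands in for learned weights

image = np.tanh(z @ W).reshape(28, 28)   # pixel values in [-1, 1]
```

A trained generator differs only in that `W` (and usually several more layers) has been shaped by training so that the output looks like a plausible training-set image rather than noise.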

The generator learns continually from the feedback it receives via the discriminator’s classifications. The discriminator’s goal is to determine whether a particular sample is real (from the training set) or fake (created by the generator). Each time the discriminator is “tricked” into classifying a fake image as real, the generator learns it is doing a good job; conversely, each time the discriminator correctly rejects a generated image, the generator receives the feedback that it must keep improving.

The discriminator also improves over time. Like any classifier, it learns from the gap between its predicted labels and the true labels (real or fake). So the two networks improve in tandem: the generator gets better at producing realistic data, and the discriminator gets better at telling real data from fake.
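The feedback loop described above can be demonstrated end to end on a toy problem. The sketch below is our own minimal illustration, not code from this book: the real data is one-dimensional Gaussian noise centered at 3, the “generator” is just a learned shift and scale applied to raw noise, and the “discriminator” is a one-feature logistic classifier. The two are updated in alternation with hand-derived gradients.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

# Real data: 1-D Gaussian around 3. The generator learns a shift m and
# scale s that turn raw noise z into samples resembling the real data.
REAL_MEAN, REAL_STD = 3.0, 0.5
m, s = 0.0, 1.0          # generator parameters
a, b = 0.0, 0.0          # discriminator: D(x) = sigmoid(a*x + b)
lr, batch = 0.05, 64

for step in range(2000):
    real = rng.normal(REAL_MEAN, REAL_STD, batch)
    z = rng.normal(size=batch)
    fake = m + s * z

    # Discriminator step: push D(real) toward 1 and D(fake) toward 0
    p_real, p_fake = sigmoid(a * real + b), sigmoid(a * fake + b)
    grad_a = np.mean(-(1 - p_real) * real + p_fake * fake)
    grad_b = np.mean(-(1 - p_real) + p_fake)
    a -= lr * grad_a
    b -= lr * grad_b

    # Generator step: push D(fake) toward 1 (the "fool D" feedback)
    p_fake = sigmoid(a * fake + b)
    grad_m = np.mean(-(1 - p_fake) * a)
    grad_s = np.mean(-(1 - p_fake) * a * z)
    m -= lr * grad_m
    s -= lr * grad_s

# After training, generated samples m + s*z cluster near the real data.
```

Even in this toy setting the adversarial dynamic appears: whenever the discriminator finds a boundary separating real from fake, the generator’s gradient pushes its samples across that boundary, and things settle down only once the generated distribution overlaps the real one.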

Table 1.1 summarizes the key facts about a GAN’s two subnetworks.

 

About This Book

 

GANs in Action

This book is designed to guide anyone interested in generative adversarial networks (GANs) from the ground up. Starting with the simplest examples, it introduces the implementation and technical details of some of the most innovative GANs, provides intuitive explanations of these advances, and presents everything involved (beyond the most basic mathematics and principles) thoroughly enough to put cutting-edge research at your fingertips.

The ultimate goal of this book is to provide you with the knowledge and tools necessary not only to fully understand what has been achieved with GANs to date, but also to develop new applications of your own choosing. Generative adversarial learning is a technique full of potential, waiting to be tapped by enterprising people like you who want to make a difference in academic research and practical applications. Welcome aboard for our GAN journey!

Who Should Read This Book

This book is for readers who already have some experience with machine learning and neural networks. Below is a list of things you should ideally know in advance. Although the book does its best to explain everything clearly, you should be confident about at least 70 percent of the items on the list.

  • You do not need to be a Python expert, but you should have at least two years of Python experience (ideally as a full-time data scientist or software engineer).
  • Understand object-oriented programming: how to use objects and how to inspect their attributes and methods. This includes both typical Python objects (such as a pandas DataFrame) and less typical ones (such as a Keras layer).
  • Understand the fundamentals of machine learning theory, such as train/test splits, overfitting, weights, and hyperparameters, as well as supervised, unsupervised, and reinforcement learning. Be familiar with metrics such as accuracy and mean squared error.
  • Know basic statistics and calculus, such as probability, density functions, probability distributions, differentiation, and simple optimization.
  • Understand basic linear algebra, such as matrices and higher-dimensional spaces; ideally, also understand the concept of principal component analysis.
  • Understand the basics of deep learning, such as feedforward networks, weights and biases, activation functions, regularization, stochastic gradient descent, and backpropagation.
  • Have basic familiarity with (or be prepared to self-study) Keras, the Python-based deep learning library.

This is not meant to sound alarming, but to make sure you get the most out of this book. You can certainly attempt it regardless, but the less you know going in, the more you will have to look up online as you go. If the requirements above sound comfortable, then let’s get started!

About the Code in GANs in Action

This book contains many examples of source code, both in numbered listings and inline in the text. In both cases, source code is formatted in a monospaced font. Code is sometimes also highlighted in bold to mark where it differs from code shown previously, such as when a new feature is added to an existing line of code.

Much of the source code has been reformatted to fit the page layout. In addition, comments in the source code have usually been removed when the code is explained in the text. Annotations accompany the listings to highlight important concepts. The code for the examples in this book can be downloaded from the matching resources on the book’s details page on the Asynchronous Community site.

This book uses Jupyter Notebook, a standard tool in data science education, so you should learn how to use it first; this should not be difficult for an intermediate Python learner. Getting access to a GPU or getting everything up and running can sometimes be difficult, especially on Windows, so some chapters provide a Google Colaboratory (Colab for short) notebook instead. Colab is Google’s free platform, prepackaged with the necessary data science tools and offering free GPU time (with limits). You can run the code directly in your browser, and you can also upload notebooks from other chapters to Colab; they are compatible.
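Before launching a long training run, it is worth confirming that a GPU is actually visible, since GAN training on a CPU can be painfully slow. A small stdlib-only check (it assumes the standard NVIDIA `nvidia-smi` utility; on a machine without NVIDIA drivers it simply reports False):

```python
import shutil
import subprocess

def gpu_available():
    """Return True if the NVIDIA driver tool `nvidia-smi` is present and runs."""
    if shutil.which("nvidia-smi") is None:
        return False
    try:
        return subprocess.run(["nvidia-smi"], capture_output=True).returncode == 0
    except OSError:
        return False

print("GPU visible:", gpu_available())
```

On Colab, you can also check the assigned hardware from the Runtime menu; this snippet is just a quick programmatic sanity check that works in any notebook cell.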

Online resources

GANs are an active research area with excellent (albeit fragmented) resources. Academically inclined readers can find the latest papers on arXiv, an online repository of electronic preprints of academic papers owned and operated by Cornell University.

The authors of this book are active contributors to Medium (especially the technology-focused publications Towards Data Science and Hacker Noon), where you can find their latest work.

Structure of the book

This book seeks to strike a balance between theory and practice. The book is divided into three parts.

Part 1: Introduction to Generative Adversarial Networks (GANs) and Generative Models. This part introduces the basic concepts of generative learning and GANs, and implements several canonical GAN variants.

  • Chapter 1 introduces generative adversarial networks (GANs) and explains at a high level how they work. As you’ll learn, a GAN consists of two separate neural networks (a generator and a discriminator) that are trained through a dynamic competition. This chapter lays the foundation for understanding the rest of the book.
  • Chapter 2 discusses autoencoders, which in many ways can be seen as precursors to GANs. Given how new generative learning is, this chapter helps place GANs in a broader context. It also presents the book’s first code tutorial: building an autoencoder to generate handwritten digits, the same task we’ll tackle in the GAN tutorials of later chapters. If you’re already familiar with autoencoders, or want to get straight to GANs, feel free to skip this chapter.
  • Chapter 3 delves into GANs and the theory behind adversarial learning. It explains the main differences between GANs and traditional neural networks, namely their cost functions and training processes. In the code tutorial at the end of the chapter, we apply what we’ve learned to implement a GAN in Keras and train it to generate handwritten digits.
  • Chapter 4 introduces convolutional neural networks and batch normalization, then implements an advanced GAN architecture that uses convolutional networks as its generator and discriminator and batch normalization to stabilize training.

Part 2: Advanced Topics in GANs

Building on Part 1, this part delves deeper into the theory underlying GANs and implements a series of advanced GAN architectures.

  • Chapter 5 discusses the many theoretical and practical obstacles to training GANs and ways to overcome them. Drawing on relevant academic papers and presentations, it provides a comprehensive overview of best practices for training GANs. It also covers options for evaluating GAN performance and explains why that matters.
  • Chapter 6 explores the Progressive GAN (PGGAN), a cutting-edge approach to training generators and discriminators that achieves excellent image quality and resolution by adding new layers during training. The chapter explains how the technique works in theory and in practice, with hands-on code examples using TensorFlow Hub (TFHub).
  • Chapter 7 continues exploring innovations built on the core GAN model. You’ll see the enormous practical impact of improving classification accuracy through semi-supervised learning using only a small number of labeled training examples. The chapter implements a Semi-Supervised GAN (SGAN) and explains how using labels turns its discriminator into a robust multiclass classifier.
  • Chapter 8 presents another GAN architecture that uses labels during training. The Conditional GAN (CGAN) addresses one of the generator’s main weaknesses, the inability to explicitly specify which sample to synthesize, by conditioning the generator and discriminator on labels or other information during training. At the end of the chapter, a CGAN is implemented so you can see targeted data generation firsthand.
  • Chapter 9 discusses one of the most interesting GAN architectures: the Cycle-Consistent GAN (CycleGAN). This technique can translate one image into another, such as turning a horse into a zebra. The chapter walks through the CycleGAN architecture, explains its main components and innovations, and uses CycleGAN to turn apples into oranges (and vice versa).

Part 3: Where to Go from Here

This part discusses how and where GANs and adversarial learning can be applied.

  • Chapter 10 introduces adversarial examples, a technique for deliberately tricking machine learning models into making mistakes. It discusses their theoretical and practical significance and explores their relationship to GANs.
  • Chapter 11 covers practical applications of GANs, exploring how the techniques from the previous chapters can be applied to real use cases in medicine and fashion: in medicine, how GANs can be used to augment small datasets to improve classification accuracy; in fashion, how GANs are driving personalization.
  • Chapter 12 summarizes the most important GAN achievements to date, discusses the ethical considerations around GANs, and introduces some emerging GAN techniques.

Why Study GANs?

Since their invention, GANs have been hailed by academics and industry experts as one of the most important innovations in deep learning. Yann LeCun, head of AI research at Facebook, went so far as to say that GANs and their variants are “the coolest idea in deep learning in the last 20 years.” [2]

The excitement is justified. Other advances in machine learning may be well known to researchers yet baffling rather than exciting to everyone else, but GANs have sparked great interest among researchers and the general public alike, drawing coverage from the New York Times, the BBC, Scientific American, and many other prominent media outlets. Indeed, it may well have been one of GANs’ accomplishments that drove you to buy this book. (Right?)

Perhaps most noteworthy is GANs’ ability to create hyperrealistic imagery. None of the human faces shown in Figure 1.4 is real; all are fake, demonstrating GANs’ ability to synthesize images that look like real photographs. The faces were generated with a Progressive GAN, a technique described in Chapter 6.

 

(Source: “Progressive Growing of GANs for Improved Quality, Stability, and Variation,” by Tero Karras et al., 2017.) Figure 1.4 These lifelike but fake faces were generated by a Progressive GAN trained on a dataset of high-resolution celebrity portraits.

Another notable achievement of GANs is image-to-image translation. Much as a sentence can be translated from, say, Chinese to Spanish, GANs can translate an image from one style to another. As shown in Figure 1.5, GANs can turn an image of a horse into an image of a zebra, and a photograph into a Monet painting, with almost no supervision and no labels required. The GAN variant that makes all of this possible is the Cycle-Consistent GAN (CycleGAN), which you’ll meet in Chapter 9.

More practical GAN applications are just as fascinating. Amazon, the online retail giant, has experimented with using GANs for fashion recommendations: by analyzing countless outfits, the system learns to generate new items matching any given style. [3] In medical research, GANs are used to augment datasets with synthetic samples to improve diagnostic accuracy. [4] After we’ve mastered the details of training GANs and their variants, we’ll explore both of these applications in detail in Chapter 11.

 

(Source: “Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks,” by Jun-Yan Zhu et al., 2017.) Figure 1.5 Using a GAN variant called CycleGAN, a Monet painting can be turned into a photograph, or a zebra in an image turned into a horse, and vice versa.

GANs are also seen as an important building block on the path to artificial general intelligence [5]: an artificial system capable of matching human cognition and acquiring expertise in virtually any domain, from motor skills such as walking, to language, and even to creative skills such as writing poetry.

However, the ability to generate new data and images also makes GANs dangerous at times. Much has been said about the spread of fake news and its dangers, and GANs’ ability to generate believable fake videos raises similar concerns. At the end of a 2018 New York Times article on GANs, reporters Cade Metz and Keith Collins discuss the alarming prospect that GANs could be used to create and spread believable misinformation, such as fake video clips of world leaders making statements they never made. Martin Giles, the San Francisco bureau chief of MIT Technology Review, raised another concern in his 2018 article “The GANfather: The Man Who’s Given Machines the Gift of Imagination”: in the hands of skilled hackers, GANs could be used to probe and exploit system vulnerabilities on an unprecedented scale. These concerns lead us to a discussion of the ethical implications of GAN applications (Chapter 12).

GANs can bring much good to the world, but every technological innovation is a double-edged sword. We must be pragmatic about this: a technology cannot be “uninvented,” so it is important that people like you understand this rapidly rising technology and its enormous potential.

This book only scratches the surface of what can be done with GANs, but we hope it provides you with the theoretical knowledge and practical skills to continue exploring whichever aspects you find most interesting.

Without further delay, let’s get started!