Today, when someone mentions AI’s potential for social change, they are probably talking about artificial neural networks in machine learning. When an article talks about a breakthrough in artificial neural networks, the author is probably referring to deep learning.

Artificial neural networks are non-linear statistical modeling tools that can be used to discover relationships between inputs and outputs, or to discover patterns in large databases. Artificial neural networks have been used in statistical model development, adaptive control systems, pattern recognition in data mining and decision making under uncertainty.

Deep learning is part of a family of machine learning methods based on artificial neural networks and representation learning. Learning can be supervised, semi-supervised, unsupervised, or reinforcement learning.

Deep learning is different from traditional machine learning

“Deep learning is actually a new name for an artificial intelligence approach called neural networks, which has been popular for over 70 years,” it is sometimes said. But that’s not accurate. Deep learning is different from traditional machine learning. By “traditional machine learning,” I mean the ordinary neural networks, or shallow neural networks, of the 20th century.

Indeed, the relationship between computers and the brain captured the attention of computer pioneers in the 1940s. In 1943, Warren McCulloch and Walter Pitts first proposed neural networks: models of neurons capable of implementing Boolean logic statements. In June 1945, John von Neumann used biological terms like “memory,” “organ,” and “neuron” when he first described the key architectural concepts of modern computing in his First Draft of a Report on the EDVAC. Von Neumann also left an unfinished manuscript, The Computer and the Brain, which analyzed the relationship between the computer and the human nervous system from a mathematical perspective.

The first major neural network breakthrough came in the mid-1960s, when the Soviet mathematician Alexey Ivakhnenko, with the help of his assistant V. G. Lapa, created small but capable deep feedforward multilayer perceptrons trained with supervised learning algorithms. (The single-layer perceptron itself had been invented by Frank Rosenblatt in the late 1950s.)

In the early 1980s, John Hopfield’s recurrent neural networks made a splash, followed by Terry Sejnowski’s program NetTalk, which learned to pronounce English words.

The term “deep learning” became widely used in 2006 thanks to Geoffrey Hinton, a computer scientist and professor at the University of Toronto. Hinton wasn’t the first to use the term, though: a 1986 paper by R. Dechter introduced “deep learning” to machine learning, and Aizenberg et al. first applied it to artificial neural networks in 2000.

What is the difference between deep learning in the 21st century and traditional neural networks?

First, artificial neural networks contain hidden layers between the input layer and the output layer. Traditional neural networks contain only one or a few hidden layers. A deep learning network is a much larger neural network with far more hidden layers (sometimes 150 or more) that can store and process much more information. This is the most important difference between deep learning and traditional neural networks, and it is why such networks are called “deep.”
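The depth difference can be made concrete with a minimal sketch. The code below (an illustration with arbitrary sizes, not any particular published network) builds a “shallow” network with one hidden layer and a “deep” one with ten stacked hidden layers, using the same forward-pass routine:

```python
import numpy as np

rng = np.random.default_rng(0)

def layer(n_in, n_out):
    """Random weights and zero biases for one fully connected layer."""
    return rng.normal(scale=0.1, size=(n_in, n_out)), np.zeros(n_out)

def forward(x, layers):
    """Pass x through each hidden layer with a ReLU non-linearity."""
    for w, b in layers[:-1]:
        x = np.maximum(0.0, x @ w + b)   # hidden layers
    w, b = layers[-1]
    return x @ w + b                     # linear output layer

# "Shallow" 20th-century style: a single hidden layer.
shallow = [layer(4, 8), layer(8, 1)]

# "Deep": many stacked hidden layers (here 10; modern nets may use 100+).
sizes = [4] + [8] * 10 + [1]
deep = [layer(a, b) for a, b in zip(sizes[:-1], sizes[1:])]

x = rng.normal(size=(3, 4))              # a batch of 3 inputs
print(forward(x, shallow).shape)         # (3, 1)
print(forward(x, deep).shape)            # (3, 1)
```

Both networks map the same inputs to the same output shape; what changes with depth is the number of stacked non-linear transformations, and hence the complexity of the functions the network can represent.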

Second, deep learning does not require manual feature extraction; it takes raw data, such as images, directly as input. This is another difference between deep learning and traditional neural networks. Figure 1 depicts the process used to identify objects in machine learning and in deep learning.

Third, deep learning requires high-performance GPUs and a lot of data. In computer vision, feature extraction and classification are both performed by a deep learning architecture called the convolutional neural network (CNN), which learns features and classifies directly from many images. As the amount of data increases, the performance of deep learning algorithms keeps improving; by contrast, the performance of traditional learning algorithms tends to plateau beyond a certain amount of data.

FIG. 1 Machine learning and deep learning
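To make the idea of learned feature extraction concrete, here is a minimal sketch of the convolution operation at the heart of a CNN. The kernel below is a hand-written vertical-edge detector for illustration only; in a real CNN, such kernels are learned from data rather than designed by hand:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D convolution (really cross-correlation, as in CNN libraries)."""
    kh, kw = kernel.shape
    h = image.shape[0] - kh + 1
    w = image.shape[1] - kw + 1
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A vertical-edge filter; a CNN would *learn* weights like these.
edge = np.array([[1.0, 0.0, -1.0],
                 [1.0, 0.0, -1.0],
                 [1.0, 0.0, -1.0]])

img = np.zeros((6, 6))
img[:, 3:] = 1.0               # bright right half -> one vertical edge
fmap = conv2d(img, edge)
print(fmap.shape)              # (4, 4); strong responses mark the edge
```

The resulting feature map responds only where the image intensity changes from left to right, which is exactly the kind of low-level feature that the early layers of a CNN discover on their own.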

In machine learning, the algorithm must be given more information (for example, through manual feature extraction) to make accurate predictions. In deep learning, thanks to the structure of the deep artificial neural network, the algorithm learns to make accurate predictions through its own processing of the data. Table 1 compares the two technologies in more detail:

Table 1 Comparison between deep learning and traditional machine learning

| | Traditional machine learning | Deep learning |
| --- | --- | --- |
| Number of hidden layers | One or a few hidden layers. | Many hidden layers. |
| Amount of data | Can make predictions from relatively small amounts of data. | Requires a large amount of training data to make predictions. |
| Hardware dependency | Can run on low-end machines; does not require much computing power. | Relies on high-end machines; performs many matrix multiplications, which GPUs can accelerate effectively. |
| Feature engineering | Requires user involvement (manual feature extraction). | Learns features automatically from data. |
| Execution time | Training takes relatively little time, from a few seconds to a few hours. | Because the algorithms involve many layers, training often takes a long time. |
| Output | Usually a numeric value, such as a score or classification. | Can take a variety of forms, such as text, a score, or sound. |

The three Godfathers of deep learning

Yoshua Bengio (born March 5, 1964) is a Canadian computer scientist best known for his work on artificial neural networks and deep learning. He is a professor in the Department of Computer Science and Operations Research at the University of Montreal and scientific director of the Montreal Institute for Learning Algorithms (Mila).

Yann LeCun (born July 8, 1960) is a French computer scientist whose research interests include machine learning, computer vision, mobile robotics, and computational neuroscience. He is a Silver Professor at NYU’s Courant Institute of Mathematical Sciences and is Facebook’s vice president and chief AI scientist.

Geoffrey Everest Hinton (born December 6, 1947) is a British-Canadian cognitive psychologist and computer scientist, best known for his work on artificial neural networks. Since 2013, he has split his time between Google and the University of Toronto. In 2017, he co-founded and became chief scientific adviser of the Vector Institute, an artificial intelligence research organization.

Figure 2 LeCun (left), Hinton (center), and Bengio (right)

The 2018 Turing Award was given to three researchers who laid the foundations for the current artificial intelligence boom. Bengio, LeCun, and Hinton, sometimes referred to as the “godfathers of AI,” were honored for their work in developing the field of deep learning. Techniques the trio developed in the 1990s and 2000s led to major breakthroughs in tasks such as computer vision and speech recognition. Their work underpins AI technologies ranging from driverless cars to automated medical diagnostics.

Back in the mid-1970s, the “AI winter” reduced funding and enthusiasm for AI research. But Geoffrey Hinton stuck with one area of neural network research: modeling networks of neural nodes to mimic the human mind. In 1986, Hinton and several other researchers helped neural networks improve at shape recognition and word prediction by demonstrating that multilayer neural networks could be trained by backpropagation. In 2012, Hinton, along with his students Alex Krizhevsky (who was born in Ukraine and grew up in Canada) and Ilya Sutskever, improved on convolutional neural networks in a program that far outperformed every other entrant in ImageNet, an image recognition competition involving thousands of different object types. Hinton’s team used graphics processor chips to train a network of “60 million parameters and 650,000 neurons” (consisting of “five convolutional layers, some of which are followed by max-pooling layers”). The convolutional layer was one of LeCun’s original ideas, and Hinton’s team improved on it significantly. Hinton has also long believed in the potential of “unsupervised” training, in which learning algorithms attempt to identify features without being given numerous labeled examples. He argues that unsupervised learning methods are not only useful but also bring us closer to understanding the learning mechanisms of the human brain.

In 1988, Yann LeCun developed a biologically inspired image recognition model, the convolutional neural network, and applied it to optical character recognition. LeCun had earlier proposed a version of the backpropagation algorithm and given a clean derivation of it based on the variational principle. In 1998, he developed LeNet-5 along with MNIST, a classic dataset that Geoffrey Hinton has called “the fruit fly of machine learning.” LeCun left industrial research in 2003 to become a professor of computer science at New York University’s Courant Institute of Mathematical Sciences, a leading center for applied mathematics research in the United States with particular strength in scientific computing and machine learning. At NYU, LeCun has continued to work on machine learning algorithms and computer vision applications in the Computational and Biological Learning Laboratory. He has also maintained his love of building things, including airplanes, electronic musical instruments, and robots. Since December 2013, he has worked on artificial intelligence research at Facebook, where he is now chief AI scientist.

In 2000, Yoshua Bengio wrote a landmark paper, “A Neural Probabilistic Language Model” (Resources [2]), that has had a huge and lasting impact on natural language processing tasks, including language translation, question answering, and visual question answering. Since 2010, Bengio’s papers on generative deep learning, in particular the generative adversarial network (GAN) developed with his doctoral student Ian Goodfellow, have revolutionized computer vision and computer graphics. Bengio has also co-founded several startups, most notably Element AI in 2016, which develops industrial applications of deep learning technologies. In 2016, Yoshua Bengio, Ian Goodfellow, and Aaron Courville published Deep Learning, a foundational textbook in the field, known as the “Flower Book” and regarded by many as the “Bible” of deep learning.

[Deep learning technology breakthroughs in the 2010s]

The shift from traditional machine learning to deep learning, marked by object recognition, took place around the early 2010s. But in the years leading up to 2010, the groundwork for this shift had already been laid: algorithms (“deep learning”), image databases (“ImageNet”), and greater computing power (“GPUs”).

Since the early 2010s, deep learning has shown impressive results, first in speech recognition, then in computer vision, and most recently in natural language processing. The resulting algorithms have sparked a deep learning revolution in academic and industrial applications.

The following is a brief review of its development.

In 2006, Geoffrey Hinton et al. published a paper (Resources [1]) showing how to train a deep neural network capable of recognizing handwritten digits with state-of-the-art accuracy (>98%). They called this technique “deep learning.” A deep neural network is a very simplified model of the cerebral cortex, consisting of a stack of artificial neuron layers.

In 2008, Andrew Ng’s group at Stanford began advocating the use of GPUs to train deep neural networks, cutting training times severalfold. This made deep learning practical for training on large amounts of data.

ImageNet was launched in 2009 by Fei-Fei Li, a Chinese-American computer scientist and professor of artificial intelligence at Stanford University. The ImageNet project is a large visual database designed for research in visual object recognition software. The project has hand-annotated more than 14 million images across more than 20,000 categories. As Li put it: “Our vision is that big data will change the way machine learning works. Data drives learning.”

Figure 3 Fei-Fei Li

In 2011, Xavier Glorot, Antoine Bordes, and Yoshua Bengio showed in their paper “Deep Sparse Rectifier Neural Networks” that the ReLU activation function can avoid the vanishing gradient problem. This meant that, in addition to GPUs, the deep learning community had another tool for preventing deep neural network training from taking impractically long.
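The vanishing gradient problem can be illustrated with a deliberately simplified calculation. Backpropagation multiplies one derivative factor per layer; the sketch below (an illustration, not the paper’s experiment) multiplies 20 such factors at a fixed pre-activation value and compares sigmoid, whose derivative is at most 0.25, with ReLU, whose derivative is exactly 1 for positive inputs:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = 0.5                                   # a fixed, positive pre-activation
grad_sigmoid, grad_relu = 1.0, 1.0
for _ in range(20):                       # 20 stacked activation layers
    s = sigmoid(x)
    grad_sigmoid *= s * (1.0 - s)         # sigmoid' <= 0.25, shrinks each layer
    grad_relu *= 1.0 if x > 0 else 0.0    # ReLU' is exactly 1 for x > 0

print(grad_sigmoid)   # vanishingly small after 20 layers
print(grad_relu)      # 1.0: the gradient signal is preserved
```

After 20 layers, the sigmoid gradient factor has shrunk by many orders of magnitude, while the ReLU factor is untouched; that is, in miniature, why ReLU made very deep networks trainable.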

In 2012, Geoffrey Hinton, a professor at the University of Toronto, and his students Alex Krizhevsky and Ilya Sutskever built a computer vision neural network called AlexNet to compete in the ImageNet image recognition competition. Competitors used their systems to process millions of test images and identify them as accurately as possible. AlexNet won with less than half the error rate of the runner-up. The victory sparked a new wave of deep learning around the world. AlexNet developed and improved upon LeNet-5 (Figure 4A), which Yann LeCun had built many years earlier. AlexNet is a multi-layer convolutional neural network for image classification (Figure 4B). Its architecture includes five convolutional layers and three fully connected layers (by contrast, LeNet-5 has two convolutional layers and three fully connected layers).

(A)

(B)

Figure 4 Computer vision neural network models: LeNet-5 (A) and AlexNet (B)
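The five-convolutional-layer structure can be traced numerically. The sketch below walks the standard spatial-size formula, floor((n + 2p − k)/s) + 1, through the layer hyperparameters usually quoted for AlexNet (227×227 input; kernel size k, stride s, padding p as commented). It is a shape calculation, not an implementation of the network:

```python
def out_size(n, k, s=1, p=0):
    """Spatial output size of a conv/pool layer: floor((n + 2p - k)/s) + 1."""
    return (n + 2 * p - k) // s + 1

n = 227                          # standard AlexNet input resolution
n = out_size(n, k=11, s=4)       # conv1 -> 55
n = out_size(n, k=3, s=2)        # max pool -> 27
n = out_size(n, k=5, p=2)        # conv2 -> 27
n = out_size(n, k=3, s=2)        # max pool -> 13
n = out_size(n, k=3, p=1)        # conv3 -> 13
n = out_size(n, k=3, p=1)        # conv4 -> 13
n = out_size(n, k=3, p=1)        # conv5 -> 13
n = out_size(n, k=3, s=2)        # max pool -> 6
print(n, n * n * 256)            # 6, 9216 features into the first FC layer
```

The final 6×6×256 = 9216 activations feed the three fully connected layers, which end in the 1000-way classification over ImageNet categories.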

In 2012, Google Brain published the results of an unusual project called the cat experiment. The project explored the difficulties of “unsupervised learning.” The cat experiment used a network of 16,000 computers that trained itself to recognize cats by watching 10 million unlabeled images from YouTube videos. At the end of the training, one of the highest-level neurons was found to respond strongly to images of cats. “We also found a neuron that responds very strongly to faces,” said Andrew Ng, the project’s founder.

In 2014, the generative adversarial network, or GAN, was created by Ian Goodfellow. With their ability to synthesize realistic data, GANs opened the door to a whole new range of deep learning applications in fashion, art, science, and more.

In 2016, DeepMind’s deep reinforcement learning model, AlphaGo, beat the human champion in the complex game of Go.

In 2019, Bengio, LeCun, and Hinton won the 2018 Turing Award for their contributions to deep learning and artificial intelligence.

In 2020, OpenAI released GPT-3, a natural language deep learning model with 175 billion parameters. In the same year, DeepMind’s artificial intelligence program AlphaFold 2 predicted protein structures with accuracy comparable to laboratory methods.

【Conclusion】

Scientists who explored perceptrons and artificial neural networks in the 20th century were motivated by the idea that such networks might learn to recognize objects or perform other tasks much as the human brain does. Deep learning achieved great success in the 2010s and became the driving force of the artificial intelligence boom. Its successes span recognizing and classifying objects in photos, self-driving cars, game playing, automatic machine translation, image caption generation, text generation, toxicity detection for different chemical structures, prediction of the 3D structure of proteins, and more. Deep learning has become a disruptive technology. One day, driverless cars may know the road better and drive better than you do, and a deep learning network may diagnose your illness.

The four blog posts in this series (the first works of artificial intelligence, the rise and fall of the perceptron, the resurgence of neural networks, and the glory of deep learning) review several important events in the development of neural networks and deep learning from the 1940s to the present. Three lessons stand out: (1) multidisciplinary collaborative research matters; (2) do not rush to dismiss or to exaggerate new technologies; and (3) the persistent efforts of scientists made possible today’s deep learning and the prosperity of artificial intelligence.

The prosperity of deep learning is also reflected in the many books published on the subject, especially those covering deep learning programming toolkits, which make further study much easier.
