Preface

During the Spring Festival, with nothing much to do, I was browsing my blog and came across an article I had written before:

  • An Article to Get You Started with Machine Learning and Deep Learning

This post was originally published in mid-2018, when deep learning and neural networks were all the rage. How hot? So hot that graphics cards were sold out for a while (and not because of mining), graduate advisors urged their students to switch to deep learning, students could hardly graduate without the words “deep learning” in their theses, and every student on the street knew Andrew Ng.

At that time, I too was determined to join the ranks of deep learning. I started to follow the stars (Andrew Ng and Hung-yi Lee), took courses (CS231n and CS229), learned frameworks (PyTorch and TensorFlow), and started writing papers (just kidding ~).

But now that it’s 2021, more than two years later, where do things stand? In 2018, Zhihu’s trending topic was already “the clash of immortals over algorithm jobs”; 2019 was “Ragnarok”; 2020 was “ashes”; as for 2021, I don’t know yet, and I’m curious myself. What is clear is that people have grown wary of AI positions, especially CV positions.

In fact, the reason for this is that CV sits in an awkward position: the entry barrier is low, job seekers are many, and jobs are few. On one side, students keep flooding in; on the other, the sluggish economy and increasingly saturated positions shrink the openings. Naturally, landing a job in these roles has become hellishly difficult.

This doesn’t mean that deep learning is over, just that the rush into it is no longer as blind as it used to be.

We’re past the pompous stage and past the bragging stage; we are now in an era where deep learning has become infrastructure permeating every industry and field we work in. For example, face recognition is basically mature (train stations now verify ID cards against faces through face recognition), and speech recognition and OCR are also in everyday use.

This is the wave of intelligence and automation. The hard part is not CV itself; the hard part is how not to miss this era.

An Article to Get You Started with Machine Learning and Deep Learning

The following text was written in 2018 and revised on February 15, 2021.


Without further ado, look at the news: artificial intelligence has been recognized at the national level:

This shows how much importance the country attaches to advancing artificial intelligence. Learn it early, and let us contribute our share as well.

What this article is about

During my postgraduate years, I occasionally chatted with seniors and juniors. Most of those conversations had little to do with the machine learning, deep learning, and neural networks that have been so popular in recent years. The purpose of this article is to look at the general direction and trends from a student’s perspective, whether you are a graduate student or an undergraduate-to-be. It is also written for beginners entering or preparing to enter machine learning, deep learning, computer vision, image processing, speech recognition, and related directions, and for those who feel lost when choosing one: what these buzzwords mean, how to choose, and how to get started. After all, I was once confused too, and passing on more experience means fewer detours.

The majors involved

Quite a few. These days almost any major can connect to machine learning, deep learning, and neural networks, as long as it touches algorithms. A few of the more common directions: machine learning, deep learning, image processing, computer vision, data processing, data mining, information retrieval, intelligent systems and parallel computing, graphics, and data visualization and analysis…

Two books on my desk: the classic Watermelon Book and the Deep Learning “bible”.

Related applications involved

There are many applications of machine learning and deep learning, such as image segmentation, speech recognition, super resolution, object tracking, image generation, and so on; each small field has its own applications. Here, two application scenarios are briefly introduced.

  • Machine learning: take Taobao’s recommendations. Say you spend a day or two looking for a game console. You open the Taobao app on your phone, tap search, type in the console’s name (Switch, for example), and browse some stores that sell it. All the while, Taobao is recording your information: whichever shop you open, it collects your preferences, and every link you open and its content is logged. The next time you log in, Taobao analyzes your tastes from the collected information using machine learning algorithms and then pushes items it thinks suit you and that you might like.
  • Deep learning: or the popular face-swapping project, DeepFake:

A variational autoencoder extracts the feature information of one image and then reconstructs it according to the feature information of another image.

In each of the three image groups above, Trump’s face is on the far left and Nicolas Cage’s is on the far right; the image in between is generated by extracting features with the autoencoder. This is a deep learning application, and if the project interests you, you can read more about it.

The main text

First of all, the main text may not be as long as you expect. This article does not attempt to generalize and sort out all the knowledge of machine learning and deep learning; that would be pointless. For one thing, machine learning algorithms involve many, many formulas, and writing out so many formulas is unfriendly to beginners: an introduction should be simple, and there is no need to complicate it. For another, this article is not positioned as machine learning or deep learning course material, and it certainly isn’t a textbook. It approaches the topic from a university student’s perspective, and I’ll simply share some of my experiences with you.

Let’s get down to business. No matter what you’ve been doing or studying, don’t worry. Some basic programming (C/C++) and some basic mathematics (calculus, linear algebra, probability and statistics) are sufficient, and the main language used for machine learning is Python. You’ve probably heard of Python whether or not you’re familiar with it: yes, it’s hot, it’s powerful, it can do almost anything, and it is closely tied to machine learning. Many deep learning libraries are driven from Python, so learning the language is also necessary.

Machine learning

First, the relationship between machine learning and deep learning: deep learning is a subset of machine learning, a small part of it. If you want to understand deep learning, you must first understand machine learning.

The figure above shows the relationship between artificial intelligence, machine learning, and deep learning. Artificial intelligence covers the widest range, including more than machine learning, while deep learning is just an important subset of machine learning. It is easy to confuse them at the beginning. Although in theory deep learning belongs to machine learning, their main targets can still be distinguished: machine learning mostly deals with text and tabular data, while deep learning mostly processes images and speech.

As for the history of machine learning, there is not much to say here; you can find plenty on the Internet, and the Watermelon Book (pictured above) also covers it.

So what is machine learning? Essentially, it uses mathematical algorithms to solve real problems, and we have already seen some of these algorithms. Those of you who have done mathematical modeling have been exposed to machine learning, more or less: when you use MATLAB to fit a linear function, you are already doing machine learning.

The general process of machine learning can be broadly divided into:

  • Obtain the data you need (collect, organize, and augment the data)
  • Analyze the data (data preprocessing)
  • Process the data with the appropriate algorithm (the machine learning algorithm)
  • Obtain the final result (output)
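The four steps above can be sketched as a tiny pipeline. This is only an illustrative sketch: the function names are mine, and the stand-in “algorithm” is the article’s own toy rule y = x + 2, not a real learning algorithm:

```python
def get_data():
    # step 1: collect/organize the data (here, a made-up list)
    return [1, 2, 3, 4]

def preprocess(xs):
    # step 2: analyze/preprocess, e.g. scale everything into [0, 1]
    m = max(xs)
    return [x / m for x in xs]

def algorithm(xs):
    # step 3: apply the "algorithm" (the article's toy rule y = x + 2)
    return [x + 2 for x in xs]

def run():
    # step 4: output the result of the whole pipeline
    return algorithm(preprocess(get_data()))
```

Calling `run()` chains the four steps together, which is all the “general process” amounts to at this level of abstraction.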

To put it plainly, suppose we think of the algorithm as a simple function: y = x + 2. x is the input, i.e. the data; y is the output, i.e. the result you want; and “x plus 2” is your algorithm’s processing.

Let’s do a slightly more complicated function: 5 = w*x + 4.

Isn’t this function just an equation? Yes, it is, and if we know the value of x we can solve for w directly. But instead of treating it as an equation, let’s think about it the machine learning way.

Suppose we cannot see the solution at a glance. The function above is a machine learning algorithm we designed: feed it the input data x and it produces a result, and the result we want is 5. So the algorithm (the function) is our own design, and we want the input x to produce the output 5.

So we designed it: 5 = w*x + 4.

And then what? Machine learning is the process by which the machine learns. We want the algorithm we designed to gradually learn how to reach the correct result, and this is achieved through training.

Training is definitely different from the real test. In machine learning, the dataset is usually split into two parts, a training set and a test set, which correspond to practice and the real exam. The training process described here is supervised learning. What is supervised learning? In a nutshell, the data is labeled. For example, a picture of a cat carries the label 1 and a picture of a dog carries the label 2; we ask the algorithm to judge a series of cat and dog pictures, and during training we provide both the pictures and the correct labels. Think of it as showing pictures to a small child: a cat should get the answer 1, a dog the answer 2, and when the child gets one wrong you point out the dog’s features so that next time the mistake isn’t repeated. This kind of learning is supervised: during learning, you know whether each judgment is right, or someone watches you and tells you where you went wrong.
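The train/test split described above can be sketched in a few lines. The dataset here is hypothetical, using the article’s own label convention (1 = cat, 2 = dog); the filenames and the 80/20 split ratio are my illustrative choices:

```python
import random

# Hypothetical labeled dataset: (image name, label), where 1 = cat, 2 = dog.
random.seed(0)
data = [(f"img_{i}.jpg", random.choice([1, 2])) for i in range(10)]

# Shuffle, then hold out 20% as the test set (the "real exam");
# the remaining 80% is the training set (the "practice").
random.shuffle(data)
split = int(0.8 * len(data))
train_set, test_set = data[:split], data[split:]
```

The key point is only that the two sets are disjoint: the model practices on one and is graded on the other.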

In the picture above, the rightmost column shows the algorithm’s scores; the highest score is the algorithm’s most likely judgment. (The judgment here is wrong: it is obviously a picture of a cat, but “cat” got a negative score while “dog” scored 437, so the algorithm mistook the cat for a dog and needs to keep “practicing”, i.e., training.)

Unsupervised learning, on the other hand, does not require labels; you judge by your own perception, like being handed a stack of triangular, round, and rectangular cards and asked to sort them into three piles. You classify the cards by distinguishing their shapes.

Back to our previous problem: 5 = w*x + 4.

First, we don’t know what w is; we call w the weight. Assume w is 1 (this is the initialized weight, and there are many initialization methods). The input x is fixed at 2; the input is the data, taken from real life. For example, you put 1 coin into a vending machine and a bottle of mineral water comes out: the input here is 1, and you don’t get mineral water for 0.5 or 2 coins.

So x is 2 and w is 1. On our first attempt, we run the algorithm and quickly see that 1 × 2 + 4 = 6. Six is not five, so of course it’s wrong.

Wrong, but so what? To reach the correct result, we need a standard. Here we design a loss, hereafter L: let L = y - 5, the difference between our algorithm’s result and the correct result (the correct result is also called the ground truth). L is the loss function: it measures how big the gap is between what our algorithm produces and the actual result. When L is 0, the algorithm produces exactly the result we want.

But here L = 6 - 5 = 1, so the loss is not 0. We take the derivative of the loss L with respect to the weight w using the chain rule:

dL/dw = dL/dy * dy/dw

So clearly dL/dw = 1 × 2 = 2.

So we have the derivative of the loss with respect to the weight. Here we define a learning rate r (learning rate), set to 0.1, and obtain the gradient descent update for w:

w = w - r * dL/dw

Plugging in the numbers: w = 1 - 0.1 × 2 = 0.8

At this point, the weight w is 0.8. Back in the original formula: w*x + 4 = 0.8 × 2 + 4 = 5.6

Although 5.6 is still not equal to 5, it is closer to 5 than the previous result of 6. From L we can see that L = y - 5 = 5.6 - 5 = 0.6: after one “learning” step, the loss has decreased.
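The single update step just walked through can be reproduced in plain Python so the numbers can be checked; every formula here comes from the article itself:

```python
x, w, r = 2.0, 1.0, 0.1   # input, initialized weight, learning rate

y = w * x + 4             # first forward pass: 1*2 + 4 = 6
L = y - 5                 # loss: 6 - 5 = 1

dL_dw = 1 * x             # chain rule: dL/dw = dL/dy * dy/dw = 1 * x = 2
w = w - r * dL_dw         # gradient descent update: 1 - 0.1*2 = 0.8

y = w * x + 4             # second forward pass: 0.8*2 + 4 = 5.6
L = y - 5                 # loss drops from 1 to 0.6
```

Running it confirms the article’s arithmetic: one update moves w from 1 to 0.8 and the loss from 1 to 0.6.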

It’s easy to wonder here: if the learning rate were 0.2, wouldn’t the loss drop faster? Well, the learning rate is a hyperparameter (a parameter set from experience), chosen by ourselves. Too high and too low are both bad; it should be just right.

To borrow a picture from CS231n:

Epoch means the number of training passes. Different learning rates lead to different results; what matters is that the rate is appropriate.

In this way, a single training step reduced the loss from 1 to 0.6. Training many more times keeps reducing the loss, but note that the loss rarely drops exactly to 0; that is very uncommon. In later experiments, once the loss stops changing beyond a certain point, training is considered finished.
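Extending this to many training steps needs one adjustment: the article’s linear loss L = y - 5 has a constant gradient, so repeated updates would overshoot the target. A common alternative (my substitution, not the article’s) is the squared loss L = (y - 5)^2, whose gradient shrinks as y approaches 5:

```python
x, w, r, target = 2.0, 1.0, 0.1, 5.0

losses = []
for epoch in range(20):
    y = w * x + 4
    loss = (y - target) ** 2        # squared loss instead of the linear y - 5
    losses.append(loss)
    dL_dw = 2 * (y - target) * x    # chain rule: d/dw of (w*x + 4 - target)^2
    w = w - r * dL_dw               # gradient descent update

# The loss falls toward 0 and w approaches 0.5 (since 0.5*2 + 4 = 5).
```

Each iteration shrinks the error y - 5 by a constant factor here, so the loss decreases monotonically, which is exactly the “loss keeps dropping until it stops changing” behavior described above.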

So much for machine learning.

Deep learning

Deep learning is a part of machine learning; it can be summarized as neural networks with deeper layers.

Neural networks are explained extensively on the web, so I won’t repeat that here; see the related materials section below.

Related materials

There are so many materials on machine learning and deep learning that a comprehensive list would run over 100 pages, so I will only recommend some of the most cost-effective materials for beginners. Also, while machine learning and deep learning are theoretically inseparable, it is better to learn them in sequence: my suggestion is to get a comprehensive understanding of and some practice with machine learning before moving on to deep learning, so that your foundation is solid.

Books

  • I recommend the Mathematics for Programmers series, three volumes in total (the first is a bit thin; the latter two have more substance): Mathematics for Programmers 1, Linear Algebra (Mathematics for Programmers 2), and Probability and Statistics (Mathematics for Programmers 3). These three effectively fill in the linear algebra and probability/statistics knowledge we need.

  • Zhou Zhihua, Machine Learning (Tsinghua University Press) can sit on your desk as a reference book, like a textbook; reading it chapter by chapter is rather boring. My advice is to consult it when needed; it contains a great many formulas and fundamentals: book.douban.com/subject/267…

  • Machine Learning in Action: as the name suggests, this book mainly uses Python programs to implement machine learning algorithms. There is less formula derivation and more practical code: book.douban.com/subject/247…

  • Deep Learning (Posts and Telecom Press edition), likewise a desk reference; its contents are very detailed. The early chapters cover the machine learning foundations that deep learning needs, and the later chapters explain a series of deep learning concepts and formulas; most of the deep learning algorithms in common use today are covered, with pseudocode and formulas: book.douban.com/subject/270…

  • Professor Li Hang’s Statistical Learning Methods: together with the Watermelon Book, this can cover 99% of the basics of machine learning. It too only explains principles and formulas, so it is best paired with hands-on practice.

Video materials

  • The Andrew Ng series: the two courses usually mentioned online are Machine Learning on Coursera and the Stanford course CS229: Machine Learning. Both explain the mathematics behind machine learning, and the former is simpler than the latter. These courses involve many formulas but say little about engineering practice. If your English is shaky or the videos load slowly (they sometimes require getting over the wall), consider the domestic mirrors: “Andrew Ng’s First Lesson in Artificial Intelligence” and NetEase’s open course “Stanford Machine Learning by Andrew Ng”. In addition, Ng is preparing a new book that combines engineering practice with mathematical explanation; it is completely free and available by subscription only. If you are interested, you can look it up.

  • Hung-yi Lee’s course: speech.ee.ntu.edu.tw/~tlkagk/tal… A famous Chinese-language machine learning course. The lectures are very careful, use vivid and interesting examples, and are delivered with humor; knowledge and fun coexist. Also recommended.

  • CS231n, Stanford’s deep learning course: cs231n.stanford.edu/. Arguably the best and most substantial deep learning course on the whole web. It explains all the basics neural networks require, starting from the simplest classification algorithms, covers several classic architectures (VGGNet, ResNet) and several well-known deep learning applications, and compares many deep learning frameworks. Most importantly, the assignments are worth doing: they are challenging and teach many key points. In short, this course is highly recommended.

  • The Udacity series. Udacity courses are known for their excellent engineering examples; the projects at the end of each course are mostly interesting and practical, such as predicting house prices, classifying dog breeds, and generating movie scripts. There are many courses, each section short, with questions to answer and plenty of supplementary material; some of it may not go very deep, but it suits beginners. It’s just a little… well, the courses are not free. In short, the deep learning course runs two terms: 3,299 yuan for the first and 3,999 yuan for the second.

About the cost of learning

This is a perennial question; on Zhihu you can find all kinds of expert opinions, and a thousand readers have a thousand Hamlets. I also ran into this problem at the start and spent a while choosing, so let me summarize here.

Here is the 2018 answer:

If your direction is machine learning proper (not involving neural networks, or only shallow ones), the requirements are not demanding and a regular laptop will do; whether it has a graphics card hardly matters. If your direction is deep learning, though, especially if you need to process image or video data and your networks have many layers (the “deep” in deep learning), the graphics card starts to matter. Suggested cards for running deep learning code (model first, then memory size): the entry-level GTX 1060 6G, the most cost-effective GTX 1070 8G, and for in-depth work the GTX 1080 Ti 11G. Of course, you can also choose a Titan (30,000+ yuan) or multiple cards; in short, wealth limits imagination. If your computer already has an Nvidia graphics card, you can look up its compute capability here: developer.nvidia.com/cuda-gpus. P.S.: Mining and other factors have pushed graphics card prices up sharply, but if you need to buy, buy, and turn wealth into productivity. For a CPU vs. GPU deep learning speed comparison, see: github.com/jcjohnson/c…

Thoughts for 2021:

I have not used the latest 30 series and won’t comment on it. Of the 20 series I have only used the 2080 Ti (its half-precision inference is very good). What to buy really depends on video memory and compute features (whether single and half precision are supported, etc.); buy according to need. The article I wrote before may still be a useful reference:

  • Here’s a configuration guide: Machine Learning and Deep Learning Computer Graphics Card Configuration Guide

Afterword

Happy New Year, and happy Year of the Ox!

Lao Pan hopes that each of us can find our own direction in time; only when the direction is right will hard work be rewarded.

Sharing a wave of the resources mentioned above, collected from around the web; reply with the corresponding code to get them:

  • Reply “010” to get the Mathematics for Programmers series
  • Reply “011” to get the deep learning series

Communication

If you are like-minded, Lao Pan is happy to communicate with you; if you like Lao Pan’s content, welcome to follow and support. The blog publishes one in-depth original article every week; follow the public account “Oldpan Blog” so you don’t miss the latest articles. Lao Pan also organizes some of his private collection in the hope that it helps: reply “888” on the public account to get Lao Pan’s learning roadmap and article index, with more waiting for you to dig up. If you don’t want to miss Lao Pan’s latest posts, check out the mysterious link.