1. Background

I spent some time this year getting to know machine learning (ML) and forming a general idea of what it does and how it does it. The Google I/O'19 talk "Machine Learning Zero to Hero" is without a doubt the best introductory video about ML I have ever seen.

The talk mainly covers:

  1. What is ML, and how does it differ from what we normally think of as programming?
  2. An example of ML
  3. How does TF help solve ML problems more effectively?
  4. How can you use ML?

Here are my notes on the talk. I watched it on YouTube, and I'm sure other sites host this useful video as well.

2. What is ML?

2.1 What can’t we do with ordinary programming?

Suppose we want to write a program that plays rock-paper-scissors. The steps are simple:

  1. The computer picks its own gesture at random while capturing your hand gesture through the camera
  2. After recognizing which of the three gestures (rock, paper, or scissors) you made, it compares the two gestures to decide the winner

In this game, picking a random gesture and judging the winner are ordinary programming tasks. But how do we program the step that gets a gesture out of a picture? The answer is: we hardly can. All the computer receives is raw pixel data, and the rules that map pixels to gestures are almost impossible to write down in an ordinary programming language.
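To make the "ordinary programming" half concrete, here is a minimal sketch of the two parts we can write by hand: choosing a random gesture and judging the winner. The names GESTURES, BEATS, random_gesture and judge are my own, chosen for illustration. The part that is missing, turning a camera frame into one of the three strings, is exactly the part we cannot write this way.

```python
import random

GESTURES = ("rock", "paper", "scissors")

# Hand-written rule table: each gesture beats exactly one other gesture.
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}


def random_gesture(rng=random):
    """Step 1: the computer picks its own gesture at random."""
    return rng.choice(GESTURES)


def judge(player, computer):
    """Step 2: compare the two gestures and name the winner."""
    if player == computer:
        return "draw"
    return "player" if BEATS[player] == computer else "computer"


if __name__ == "__main__":
    computer = random_gesture()
    # In the real game "rock" would come from the camera, not from a literal.
    print(computer, judge("rock", computer))
```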

Even if one image is identified as scissors by analyzing its pixel layout, the next image, whose pixel layout is almost completely different, should also be identified as scissors. So the rules in this domain cannot be written by hand. There are many fields where the rules over the input data cannot be hand-coded, such as computer vision, speech recognition, and natural language processing. Solving these problems lets us build valuable applications such as autonomous driving and automated customer service.

But such rules are useful. Is there another way to obtain them? The answer is yes, and this is the scope of machine learning.

2.2 ML: Data + results = rules

As you can see from the figure above, traditional programming is rules + data = results. ML is data + results = rules during training; then, in production, rules + data = results. The process of obtaining the rules is called training, the rules obtained by training are called a Model, and feeding new data to the Model to get a result is called prediction.
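A tiny sketch of "data + results = rules" may help. Here the "rule" is just a straight line, and training is an ordinary least-squares fit in plain Python. The numbers and the function names train and predict are my own illustrative choices; a real ML model learns far more complicated rules, but the flow is the same.

```python
def train(data, results):
    """Training: data + results -> rules (here, a slope and an intercept)."""
    n = len(data)
    mean_x = sum(data) / n
    mean_y = sum(results) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(data, results))
    variance = sum((x - mean_x) ** 2 for x in data)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Prediction: rules + new data -> result."""
    slope, intercept = model
    return slope * x + intercept


# Illustrative data: the hidden rule is y = 2x - 1, but we never write it down.
data = [-1, 0, 1, 2, 3, 4]
results = [-3, -1, 1, 3, 5, 7]

model = train(data, results)
print(model, predict(model, 10))
```

The program is never told "multiply by 2 and subtract 1"; it recovers that rule from the examples and then applies it to a value it has not seen.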

TensorFlow is our framework for training, acquiring, and using the Model. Cool!

2.3 Neural network

The previous section covered the definition and terminology of ML. Now we will stay with the rock-paper-scissors game to explain some further ML concepts around image recognition.

If the computer had to train the Model to recognize gestures directly from a large number of 1920×1080 high-resolution images, the process would be very inefficient. We need to help the computer with some pre-processing. The purpose of this pre-processing is to help the computer find features.

A feature is simply one dimension of the data. It is a very easy concept to understand. People, for example, differ in skin color, in whether they have double eyelids, and in whether their hair is straight or curly. If I need to teach someone to tell people apart by ethnicity, then skin color is an obvious feature to use.

In other words: specific problems require specific features. If we can extract many features from the data (images), we can solve different types of problems, or solve one problem with several features at once, which brings our rules closer to the correct answer (for example, skin color combined with hair type identifies a group more accurately than skin color alone).

2.3.1 Convolution

Feature extraction means using some algorithm to suppress the details of one dimension and amplify those of another, producing feature data. As shown in the figure above, new pixels are computed from the relationships between neighboring pixels and combined into a new image, which gives us a feature map. These algorithms are called filters, as shown below; typically, data + filter = feature data. We call this process convolution. Convolution has a more rigorous mathematical definition; here I only introduce it as it is used in ML.
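Here is a minimal sketch of "data + filter = feature data" in plain Python. The function convolve2d, the tiny image and the vertical-edge filter are all my own illustrative choices. It slides the filter over the image without padding and, as is usual in ML libraries, does not flip the filter.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image and sum the element-wise products."""
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = len(image) - kernel_h + 1
    out_w = len(image[0]) - kernel_w + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = 0
            for i in range(kernel_h):
                for j in range(kernel_w):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# A dark region on the left and a bright region on the right.
image = [
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
]

# A filter that responds to a dark-to-bright change from left to right.
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = convolve2d(image, vertical_edge)
print(feature_map)  # [[0, 27, 27, 0], [0, 27, 27, 0]]
```

The flat areas come out as 0 and the boundary comes out as 27, so the filter has thrown away the "how bright" detail and kept only the "where is the edge" detail.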

After convolution, some parts of the data are highlighted and enhanced; this is the feature data. From these data ML can learn the features, even though we often cannot understand how these features help the computer complete the recognition.

ML obtains more features by applying more convolutions, so a single picture may yield 1000 pieces of feature data after convolution. Processing this feature data efficiently becomes a difficult problem in training. The next section addresses this problem.

2.3.2 Pooling

The purpose of pooling is to compress the data while keeping the features. This is easy to understand: to recognize a "rock" in a picture, a 64×64 image is actually enough, and the result is the same as with a 1920×1080 one.

In the same way, we pool the data so that training is more efficient. Shrinking the input like this can cut the amount of computation by orders of magnitude.

Pooling is compression, and it can be done in a variety of ways. In the figure above, we simply take the maximum of every four pixels. In fact, the content of the image hardly changes visibly after pooling, but the computation becomes much cheaper.
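The "maximum of four pixels" idea can be sketched in a few lines of plain Python. The function name max_pool_2x2 and the sample numbers are mine, for illustration only; if the image has an odd number of rows or columns, this sketch simply drops the last one.

```python
def max_pool_2x2(image):
    """Keep the largest value from every non-overlapping 2x2 block."""
    pooled = []
    for r in range(0, len(image) - 1, 2):
        row = []
        for c in range(0, len(image[0]) - 1, 2):
            block = (
                image[r][c], image[r][c + 1],
                image[r + 1][c], image[r + 1][c + 1],
            )
            row.append(max(block))
        pooled.append(row)
    return pooled


image = [
    [1, 3, 2, 0],
    [4, 6, 1, 1],
    [0, 2, 9, 8],
    [1, 1, 7, 5],
]

print(max_pool_2x2(image))  # [[6, 2], [2, 9]]
```

Sixteen values become four, and the strongest response in each corner of the image survives.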

2.3.3 Layers

We briefly introduced convolution and pooling in the previous sections. How do we combine them into a Model?

The figure above shows the structure of a neural network. First comes the Input cell; the convolution and pooling we introduced form the second and third layers. After feature extraction we get simplified feature data, as described in the previous sections.

The final Hidden cells and Output cells make up a standard neural network. The structure of a standard neural network is simple to understand and needs only high-school mathematics, but it is not easy to describe briefly. I suggest Andrew Ng's ML course for this.
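For readers who want a taste of that high-school mathematics, here is a rough sketch of what the hidden and output cells compute: weighted sums, a simple non-linearity, and a final step that turns three scores into three probabilities. The weights below are hand-picked, hypothetical values; in a real network they are learned during training. The function names are my own.

```python
import math


def dense(inputs, weights, biases):
    """One fully connected layer: a weighted sum plus a bias per output cell."""
    return [
        sum(x * w for x, w in zip(inputs, row)) + b
        for row, b in zip(weights, biases)
    ]


def relu(values):
    """Hidden-cell activation: negative values become zero."""
    return [max(0.0, v) for v in values]


def softmax(values):
    """Output activation: turn scores into probabilities that sum to 1."""
    highest = max(values)
    exps = [math.exp(v - highest) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def forward(features, hidden_w, hidden_b, output_w, output_b):
    hidden = relu(dense(features, hidden_w, hidden_b))
    return softmax(dense(hidden, output_w, output_b))


# Two input features, two hidden cells, three output cells
# (think rock, paper, scissors). All numbers are made up.
features = [1.0, 2.0]
hidden_w = [[1.0, 0.0], [0.0, 1.0]]
hidden_b = [0.0, 0.0]
output_w = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
output_b = [0.0, 0.0, 0.0]

probabilities = forward(features, hidden_w, hidden_b, output_w, output_b)
print(probabilities)
```

The output cell with the highest probability is the network's answer; with these made-up weights it is the second one.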

So now we are finally done with the concepts behind neural networks. How do we build our own neural network? How do we train, save, and apply it? That is what TensorFlow does.

3 TensorFlow

TensorFlow is a machine learning framework whose value lies in simplifying the training, saving, and deployment of ML models. In other words, once we understand the concepts above, we can verify and apply them through the framework and its code.

Of course, TF is only one of many machine learning frameworks. The same ML theory applies to the others as well; each simply has its own way of doing things.

I won't go into the specifics of how to use TensorFlow here; I will just list the features mentioned in the talk.

3.1 Define neural network and training

The many ML concepts we discussed above can all be found in the following code.

The code is pretty neat:

  • Define a model through TF
  • It has many layers: convolution first, then pooling (now you can see why!)
  • After that, the data is passed to the Dense layers, and the final output has 3 nodes

The whole neural network is defined. Simple!

During training, we need to define a loss function and an optimizer. There is no mystery here: they exist so that training converges to a good state. One is like a strict father and the other like a loving mother: the loss function tells the model how wrong it is, and the optimizer suggests how to modify the model.

And finally, fit performs the training.
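To show how these three pieces cooperate, here is a hand-rolled sketch in plain Python, not TensorFlow code. The loss is mean squared error, the "optimizer" is plain gradient descent, and my own fit function is the loop that alternates between them. The model is a single line y = w*x + b, and the data, learning rate and number of epochs are illustrative assumptions.

```python
def mse_loss(w, b, xs, ys):
    """Loss function: how wrong is the model right now?"""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradients(w, b, xs, ys):
    """Slope of the loss with respect to each parameter."""
    n = len(xs)
    grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return grad_w, grad_b


def fit(xs, ys, learning_rate=0.05, epochs=1000):
    """Training loop: measure the loss, then let the optimizer adjust the model."""
    w, b = 0.0, 0.0
    history = []
    for _ in range(epochs):
        history.append(mse_loss(w, b, xs, ys))
        grad_w, grad_b = gradients(w, b, xs, ys)
        # Optimizer step (gradient descent): move against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b, history


xs = [-1, 0, 1, 2, 3, 4]
ys = [-3, -1, 1, 3, 5, 7]  # generated by y = 2x - 1

w, b, history = fit(xs, ys)
print(round(w, 3), round(b, 3), history[0], history[-1])
```

The loss starts high and shrinks toward zero as w and b approach 2 and -1. Real optimizers are smarter about the step size, but the division of labor is the same.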

After training, we find that some layers have tens of thousands of parameters (Param), and the final dense layer has about a million. It is almost impossible to understand these parameters individually; ML models are useful, but they cannot easily be broken down and interpreted.
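Where do such large numbers come from? The counting itself is simple arithmetic, sketched below. The layer sizes are hypothetical values I picked for illustration, not the ones from the talk: every filter in a convolution layer has one weight per kernel cell per input channel plus one bias, and every cell in a dense layer has one weight per input plus one bias.

```python
def conv_params(kernel_h, kernel_w, in_channels, filters):
    """Weights of every filter plus one bias per filter."""
    return (kernel_h * kernel_w * in_channels + 1) * filters


def dense_params(inputs, units):
    """One weight per input per unit plus one bias per unit."""
    return (inputs + 1) * units


# Hypothetical layers: 3x3 filters on an RGB image, then on 64 feature maps.
first_conv = conv_params(3, 3, in_channels=3, filters=64)
second_conv = conv_params(3, 3, in_channels=64, filters=64)

# Hypothetical dense layer fed by 7 x 7 x 128 flattened feature values.
big_dense = dense_params(inputs=7 * 7 * 128, units=512)

print(first_conv, second_conv, big_dense)  # 1792 36928 3211776
```

The dense layer dominates because every one of its cells is connected to every input value, which is also why pooling before it matters so much.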

Here's a quick look at the key components and features of TF, almost all of which work out of the box.

3.2 TensorBoard neural network training visualization

TF provides TensorBoard, a visualization component for neural network training.

3.3 Distributed parallel training

To speed up training, TF supports distributed training.

3.4 Deployment and Application

We can't just train a model and then call it a day. How the Model is serialized and stored after training, and how it is published, are also capabilities that the framework must support.

  • TensorFlow Serving is an application provided by TF that exposes gRPC and REST interfaces for serving model predictions. As an application service, it is efficient and has high throughput.
  • TensorFlow Lite is a library provided by TF that lets you deploy the Model to portable devices, such as a Raspberry Pi.

3.4.1 Save and load the model

Saving is also very simple, as shown in the picture above. With these few lines of code, you should be ready to train, save, and publish a model.

4 Learning Materials

That completes my notes on the Google ML talk. I highly recommend watching the original video, which is about 40 minutes long. I don't know much about ML myself; I'm just getting started and trying something new. I learned ML mainly from the following materials:

  • Dr. Andrew Ng's ML course, a video course that popularizes ML. Anyone with high-school mathematics can follow it. I watched about the first third of it.

  • Deep Learning, known as the "flower book". Building on Dr. Andrew Ng's course, I used this book to study the basics of ML in more depth. I finished the first chapter and can roughly understand the basic concepts.

  • Siraj Raval is a YouTuber who specializes in machine learning videos. If you have a rough idea of what ML is and can't wait to see what it does and how, I highly recommend his video on how to simulate a self-driving car. It walks you through data collection, training, deployment, and application, and in the end the car in the game drives itself.

By the way

Thank you very much for reading. If it's convenient, please click "like", and feel free to leave a comment. Thanks!

I've been working on some ideas lately, including a few small Raspberry Pi projects. You can check out my GitHub to see how I build interesting projects from scratch, block by block; ML is one of the important building blocks. I have carefully recorded every step, so you can reproduce all my work from the log. I look forward to your issues, and I hope to get your Star!