So far, we have covered support vector machines (SVM), decision trees, KNN, naive Bayes, linear regression, and logistic regression. For the other algorithms, please allow Taoye to owe you an IOU for now; they will be made up later when there is the opportunity and time.

The updates so far have also received some praise from readers. It is not much, but thank you very much for your support, and I hope everyone who reads this series finds it rewarding.

The entire content of this series was written by Taoye by hand, with reference to a number of books and open-source resources. The series currently totals about 150,000 words (including source code) and 138 pages, and it will be gradually expanded later. For more technical articles, visit Taoye's official account: Cynical Coder. The document may be circulated freely, but please do not modify its contents.

If there is anything in the article you don't understand, feel free to ask directly, and Taoye will reply as soon as he sees it. You are also welcome to privately urge Taoye for updates at [Cynical Coder]. Taoye's personal contact information is also available on the official account; there are some things Taoye can only say to you privately there (# ‘O’)

To improve your reading experience, the hand-torn machine learning series of articles has been compiled into PDF and HTML. The result looks great, and you can download it for free from the official account [Cynical Coder].

Preface

In the future, there are plans to update deep-learning-related content. Regarding deep learning, Taoye believes the best way to learn it (to get started) is to implement it by hand, including both the principles and the code (as far as possible). Thinking through the process as you implement it is very helpful for understanding deep learning correctly.

Therefore, this series of articles will start from scratch as much as possible, rather than starting from "black boxes" such as TensorFlow, Keras, or Caffe.

This article mainly explains the perceptron and also lays the foundation for the neural network content that follows. It covers the following three parts:

  • What exactly is a perceptron
  • How to implement logic circuits based on the perceptron, and raising the XOR problem
  • Solving the XOR problem perfectly based on a two-layer perceptron

1. What exactly is a perceptron

Let us begin by looking at the perceptron from its name.

"Perceptron, perceptron": since its name contains the word "device", can we think of it as a kind of "machine"?

Since it is a machine, it should be able to perform certain functions for us in place of a human: we hand it some "stuff", and it gives us some feedback based on the "stuff" we provide.

Is that right??

That's right, my friend. That's exactly how it is.

A perceptron can receive multiple input signals and output a single signal based on them. The input signals can be interpreted as the "stuff", or data, that we provide to the perceptron, while the output signal can be interpreted as the result of the perceptron's processing.

The perceptron above receives two input signals $x_1, x_2$ and, after processing them with the weights $w_1, w_2$, outputs a signal $y$. Here $w_1, w_2$ represent the weights: the larger a weight, the more important the corresponding signal. Each circle represents a "node" or "neuron". When the input signals $x_1, x_2$ enter the neuron, they are multiplied by the corresponding weights $w_1, w_2$ and summed, giving $w_1x_1 + w_2x_2$. If this sum exceeds a certain threshold $\theta$, the output is $y=1$; otherwise the output is $y=0$. The threshold can be understood as a bar that indicates how easily the neuron is activated.

The process can be expressed mathematically as follows:


$$
y = \begin{cases} 0, & w_1x_1+w_2x_2 \leq \theta \\ 1, & w_1x_1+w_2x_2 > \theta \end{cases}
$$

For ease of expression, the threshold $\theta$ is usually moved to the left-hand side and replaced with $b$, which denotes the bias, where $b=-\theta$. Note that no matter how the perceptron processes the input signals, the coefficient in front of the bias is always 1.

Therefore, alongside the input signals $x_1, x_2$ we can add another constant signal 1, which carries the bias $b$ and has no effect on the rest of the perceptron's processing. The result is as follows:


$$
y = \begin{cases} 0, & w_1x_1+w_2x_2+b \leq 0 \\ 1, & w_1x_1+w_2x_2+b > 0 \end{cases}
$$

By comparing $w_1x_1 + w_2x_2 + b$ with 0, we can determine the value of the output signal $y$.
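As a quick illustration, here is a minimal sketch of this decision rule; the weights and bias below are hypothetical values chosen only for demonstration:

```python
import numpy as np

def perceptron(x, w, b):
    """Step-activated perceptron: weighted sum plus bias, compared against 0."""
    return 1 if np.dot(w, x) + b > 0 else 0

# Hypothetical parameters (w_1, w_2, b) = (0.5, 0.5, -0.7), just for illustration
w, b = np.array([0.5, 0.5]), -0.7
print(perceptron(np.array([1, 1]), w, b))  # 1, since 0.5 + 0.5 - 0.7 > 0
print(perceptron(np.array([1, 0]), w, b))  # 0, since 0.5 - 0.7 <= 0
```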

The perceptron can realize many functions: not only simple logic circuits (Boolean operations), but it can also fit linear functions, so linearly separable classification problems and linear regression problems can both be handled with a perceptron.

Previously, in the hand-torn machine learning series, we explained linear regression and logistic regression in detail. The general process and principle of linear regression is actually equivalent to a perceptron, although at the time we did not explicitly introduce the concept of a perceptron.

Now let us use the perceptron to implement basic logic-circuit problems.

2. How to implement logic circuits based on the perceptron, and raising the XOR problem

We might as well look at the implementation of the AND gate first.

The AND gate is a logic gate with two inputs and one output. From simple enumeration we know that two input signals give a total of four input combinations. For the AND gate, the output signal $y$ equals 1 only when both inputs $x_1$ and $x_2$ equal 1; in every other case $y$ is 0.

The truth table of the AND gate is as follows:
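| $x_1$ | $x_2$ | $y$ |
| --- | --- | --- |
| 0 | 0 | 0 |
| 1 | 0 | 0 |
| 0 | 1 | 0 |
| 1 | 1 | 1 |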

To get a more intuitive sense of the data distribution, we might as well visualize the four data points first:

We can see that the four data points fall cleanly on two sides, so a straight line can separate them into the two classes. According to the perceptron's decision rule above, building the linear model mainly amounts to determining the specific values of $w_1, w_2, b$; the class of each data point is then determined by the value $w_1x_1 + w_2x_2 + b$ computed by the linear model.

In fact, there are infinitely many choices of $w_1, w_2, b$ that satisfy this condition. For example, let $(w_1, w_2, b) = (0.6, 0.6, -0.9)$; the classification model is then:


$$
y = \begin{cases} 0, & 0.6x_1+0.6x_2-0.9 \leq 0 \\ 1, & 0.6x_1+0.6x_2-0.9 > 0 \end{cases}
$$

Then the four samples x_data = [[0, 0], [1, 0], [0, 1], [1, 1]] and their labels y_label = [0, 0, 0, 1] are substituted into the above model. We find that the model's outputs agree exactly with the corresponding labels, i.e. the classification is completely correct. The relevant implementation code and classification results are as follows:
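As a quick sanity check, the hand-picked parameters $(0.6, 0.6, -0.9)$ from the model above can be verified in a few lines (the full, commented listing follows further down):

```python
# Check the hand-picked parameters (0.6, 0.6, -0.9) on the four AND-gate inputs
for x_1, x_2 in [(0, 0), (1, 0), (0, 1), (1, 1)]:
    value = 0.6 * x_1 + 0.6 * x_2 - 0.9
    print("%d and %d = %d" % (x_1, x_2, 1 if value > 0 else 0))
# Prints 0, 0, 0, 1 in turn, matching y_label exactly
```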

We can see that the model classifies every point exactly right.

However, note that the perceptron model only aims to classify the data correctly; it does not have the margin-maximization requirement that SVM has.

To make it easy for readers to run the program, here is the complete code for the above process:

import numpy as np
from matplotlib import pyplot as plt

%matplotlib inline

"""
Author: Taoye
WeChat official account: Cynical Coder
Explain: visualize the AND-gate data and the decision line
Parameters: x_data: data attributes; y_label: labels of the data; w_1, w_2: weights; b: bias
"""
def show_result(x_data, y_label, w_1, w_2, b):
    plt.scatter(x_data[:, 0], x_data[:, 1], c = y_label, cmap = plt.cm.copper, linewidths = 10)
    line_x_1 = np.linspace(0, 1.2, 100)
    line_x_2 = (-b - w_1 * line_x_1) / w_2
    plt.plot(line_x_1, line_x_2)
    plt.show()

"""
Author: Taoye
WeChat official account: Cynical Coder
Explain: step function
"""
def out(in_data):
    return 0 if in_data <= 0 else 1

"""
Author: Taoye
WeChat official account: Cynical Coder
Explain: compute the model output w_1 * x_1 + w_2 * x_2 + b
"""
def model(x_1, x_2, w_1, w_2, b):
    return w_1 * x_1 + w_2 * x_2 + b

if __name__ == "__main__":
    x_data = np.array([[0, 0], [1, 0], [0, 1], [1, 1]])
    y_label = np.array([0, 0, 0, 1])
    for item in x_data:
        model_result = model(item[0], item[1], 0.6, 0.6, -1.1)
        out_result = out(model_result)
        print('%d and %d = %d' % (item[0], item[1], out_result))
    show_result(x_data, y_label, 0.6, 0.6, -1.1)

The above is the whole process of solving the AND gate problem with a perceptron.

However, a careful reader will have noticed that the parameters $(w_1, w_2, b)$ of our perceptron were specified by hand for the purpose of classifying the data, rather than being trained from the data.

Therefore, it is necessary to explain the training process of the perceptron model.

Actually, this is the same old routine: in the earlier hand-torn machine learning articles we have been through this kind of problem many times. It is nothing more than first obtaining the objective function to be optimized, then updating the model parameters with gradient descent or some other optimization algorithm, and finally examining the effect of the trained model parameters.

That's basically it, but for the benefit of new readers, let's go over the process once more (readers already familiar with it can skip this part).

For a single data sample, we already obtained the calculation result of the perceptron model above, namely:


$$
\hat{y} = \begin{cases} 0, & w_1x_1+w_2x_2+b \leq 0 \\ 1, & w_1x_1+w_2x_2+b > 0 \end{cases}
$$

Note: the parameters $w_1, w_2, b$ above are not the final result; we need to initialize them at the start and then iterate on them to obtain values that meet our needs.

Here we use $\hat{y}$ to denote the model's output and $y$ the true label of the data. To measure the gap between the two mathematically, we can use one half of the squared difference between them, namely:


$$
\begin{aligned} e &= \frac{1}{2}(\hat{y}-y)^2 \\ &= \frac{1}{2}(w_1x_1+w_2x_2+b-y)^2 \\ &= \frac{1}{2}(w^Tx+b-y)^2 \end{aligned}
$$

The above is the error for a single sample. As we know, however, the training data contains multiple samples, so we need to aggregate the errors of all samples to reflect the model's classification performance as a whole. Let the overall error be $E$; then:


$$
\begin{aligned} E &= e_1+e_2+e_3+\dots+e_N \\ &= \sum_{i=1}^N e_i \\ &= \frac{1}{2}\sum_{i=1}^N(w^Tx_i+b-y_i)^2 \end{aligned}
$$
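As a minimal sketch (assuming the data and parameters are held in NumPy arrays, as in the listings in this article), the overall error $E$ can be computed like this:

```python
import numpy as np

def total_error(x_data, y_label, w, b):
    """E = 1/2 * sum_i (w^T x_i + b - y_i)^2"""
    linear_output = x_data @ w + b            # w^T x_i + b for every sample
    return 0.5 * np.sum((linear_output - y_label) ** 2)

x_data = np.array([[0, 0], [1, 0], [0, 1], [1, 1]])
y_label = np.array([0, 0, 0, 1])
print(total_error(x_data, y_label, np.array([0.6, 0.6]), -0.9))  # 0.74
```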

This $E$ is the final objective function to be optimized. Our goal now is to find values of the parameters $w_1, w_2, b$ that minimize this error. To do so, we take the partial derivatives with respect to $w$ and $b$; the derivation is as follows:


$$
\begin{aligned} \frac{\partial E}{\partial w} &= \frac{1}{2}\sum_{i=1}^N\frac{\partial}{\partial w}(w^Tx_i+b-y_i)^2 \\ &= \sum_{i=1}^N(w^Tx_i+b-y_i)x_i \end{aligned}
$$

Similarly, taking the partial derivative with respect to $b$, we get:


$$
\begin{aligned} \frac{\partial E}{\partial b} &= \frac{1}{2}\sum_{i=1}^N\frac{\partial}{\partial b}(w^Tx_i+b-y_i)^2 \\ &= \sum_{i=1}^N(w^Tx_i+b-y_i) \end{aligned}
$$

After obtaining the partial derivatives, we use the gradient descent algorithm to iteratively update the parameters $w_1, w_2, b$. The specific update rule is as follows:


$$
\begin{aligned} w^{new} &= w^{old}-\eta\frac{\partial E}{\partial w} \\ b^{new} &= b^{old}-\eta\frac{\partial E}{\partial b} \end{aligned}
$$

As for why the parameters are updated this way, Taoye won't go into detail here; readers who are unclear can check out the hand-torn machine learning series written earlier.

Once we know how the parameters are updated, we can write a program that iterates over the data to obtain final parameters that meet the actual requirements. First, we define the out_result method to output the model's results. Here we use a small trick: y = result > 0 evaluates, for each sample processed by the perceptron, whether the linear output is greater than 0, giving True if it is and False otherwise. Then y.astype(int) converts the Boolean values back to integers, with True becoming 1 and False becoming 0.

Then the train method is defined to train the parameters $w_1, w_2, b$. The training process follows the mathematical expressions above, i.e. gradient descent is used to keep updating the parameters so as to minimize the loss function. The specific code of the train method is shown below:
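For completeness, here are the two methods just described (they appear again in the complete listing at the end of this section):

```python
def out_result(x_data, w, b):
    """Perceptron output for every sample, as a 0/1 column vector."""
    result = np.matmul(x_data, np.mat(w).T) + b    # linear output w^T x + b
    y = result > 0                                 # step function
    return y.astype(int)

def train(x_data, y_label, max_iter, learning_rate, w, b):
    """Update w and b by gradient descent for max_iter iterations."""
    for i in range(max_iter):
        result = out_result(x_data, w, b)
        delta = np.mat(y_label).T - result         # y - y_hat for every sample
        w = (w + (learning_rate * np.matmul(x_data.T, delta)).T)
        b += (learning_rate * delta).sum()
    return w, b
```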

The classification results after running it are as follows:

We can find that, after a total of ten iterations, the trained parameter values are $(w_1, w_2, b) = (0.6, 0.6, -0.8)$, i.e. the model is:


$$
\hat{y} = \begin{cases} 0, & 0.6x_1+0.6x_2-0.8 \leq 0 \\ 1, & 0.6x_1+0.6x_2-0.8 > 0 \end{cases}
$$

Substituting the four AND-gate inputs into the perceptron model above and comparing with the true labels, we find that the classification is completely correct, i.e. the model has been trained successfully. Again, note that the perceptron model only aims to classify the data correctly; it does not have SVM's margin-maximization requirement.

The complete code for the AND gate implementation is as follows:

import numpy as np
from matplotlib import pyplot as plt

%matplotlib inline

"""
Author: Taoye
WeChat official account: Cynical Coder
Explain: visualize the AND-gate data and the decision line
Parameters: x_data: data attributes; y_label: labels of the data; w_1, w_2: weights; b: bias
"""
def show_result(x_data, y_label, w_1, w_2, b):
    plt.scatter(x_data[:, 0], x_data[:, 1], c = y_label, cmap = plt.cm.copper, linewidths = 10)
    line_x_1 = np.linspace(0, 1.2, 100)
    line_x_2 = (-b - w_1 * line_x_1) / w_2
    plt.plot(line_x_1, line_x_2)
    plt.show()

"""
Author: Taoye
WeChat official account: Cynical Coder
Explain: compute the perceptron output
Parameters: x_data: data attributes; w: weight vector; b: bias
Return: the perceptron's results, as a 0/1 column vector
"""
def out_result(x_data, w, b):
    result = np.matmul(x_data, np.mat(w).T) + b
    y = result > 0
    return y.astype(int)

"""
Author: Taoye
WeChat official account: Cynical Coder
Explain: update and iterate the w, b parameters
Parameters: x_data: data; y_label: data labels; max_iter: maximum number of iterations; learning_rate: learning rate; w: weight parameter; b: bias parameter
Return: the final w, b parameters
"""
def train(x_data, y_label, max_iter, learning_rate, w, b):
    for i in range(max_iter):
        result = out_result(x_data, w, b)
        delta = np.mat(y_label).T - result
        w = (w + (learning_rate * np.matmul(x_data.T, delta)).T)
        b += (learning_rate * delta).sum()
    return w, b

if __name__ == "__main__":
    x_data = np.array([[0, 0], [1, 0], [0, 1], [1, 1]])
    y_label = np.array([0, 0, 0, 1])
    w, b = train(x_data, y_label, 10, 0.1, np.array([1, 1]), 0)
    print(w, b)
    show_result(x_data, y_label, w[0, 0], w[0, 1], b)

The above is the whole process of implementing the AND gate with a perceptron; readers can try to implement it themselves following the line of thought above.

In the same way, we can implement the NAND gate and the OR gate. The logic-circuit principle and the model training process are exactly the same; only the data fed to the model differs. In other words, you only need to change the training data to realize the NAND gate and the OR gate, so we won't explain it again; a short sketch is shown below, and the specific results follow:
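As a brief sketch, reusing the train function and x_data from the AND-gate listing above, only the label vectors change:

```python
no_and_label = np.array([1, 1, 1, 0])    # NAND gate output signals
or_label = np.array([0, 1, 1, 1])        # OR gate output signals

w_nand, b_nand = train(x_data, no_and_label, 10, 0.1, np.array([0, 0]), 0)
w_or, b_or = train(x_data, or_label, 10, 0.1, np.array([0, 0]), 0)
print(w_nand, b_nand)    # a parameter set satisfying the NAND gate
print(w_or, b_or)        # a parameter set satisfying the OR gate
```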

We can see that the NAND gate and the OR gate can also be realized by training the perceptron.

So far, we have seen that logic circuits such as the AND gate, NAND gate, and OR gate can all be represented with perceptrons. The important point here is that the perceptron has the same structure for the AND, NAND, and OR gates; the three gates differ only in the values of the parameters (weights and threshold). That is to say, a perceptron of the same structure can be turned into an AND, NAND, or OR gate, just as a versatile actor plays different roles, simply by setting the parameter values appropriately. (From Introduction to Deep Learning: Python-based Theory and Implementation)

Since the perceptron can implement the logic circuits of the AND gate, the NAND gate, and the OR gate, let us consider the XOR gate next.

The XOR gate is also called an exclusive-OR circuit. Its output signal is 1 only when exactly one of its two input signals is 1 and the other is 0. The truth table of the XOR gate is as follows:
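| $x_1$ | $x_2$ | $y$ |
| --- | --- | --- |
| 0 | 0 | 0 |
| 1 | 0 | 1 |
| 0 | 1 | 1 |
| 1 | 1 | 0 |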

We may as well visualize the XOR-gate data to observe its distribution. The visualization code and results are as follows:
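A minimal sketch of that visualization (reusing the imports from the listings above; the XOR labels are [0, 1, 1, 0]):

```python
x_data = np.array([[0, 0], [1, 0], [0, 1], [1, 1]])
xor_label = np.array([0, 1, 1, 0])    # XOR gate output signals

plt.scatter(x_data[:, 0], x_data[:, 1], c = xor_label, cmap = plt.cm.copper, linewidths = 10)
plt.show()
```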

We can observe that, no matter what we do, we cannot separate the two classes of data with a single straight line. In other words, the perceptron model obtained through the above analysis cannot be directly applied to the XOR gate.

The limitation of the perceptron is that it can only represent a space divided by a straight line; such a space is called a linear space. But if we drop the linearity constraint, we can classify the XOR gate perfectly, as follows:

The space formed by such a curved division is called a nonlinear space; linearity and nonlinearity are old acquaintances in machine learning. This way of handling linear inseparability is very common, but now that we are entering the world of deep learning, we should approach the problem in a "deep-learning-like" way as much as possible.

This brings us to the third section of this article: solving the XOR problem perfectly based on a two-layer perceptron.

3. Solving the XOR problem perfectly based on a two-layer perceptron

Although the perceptron described above cannot solve the XOR problem, we should not be too disappointed.

The power of perceptrons lies in the fact that they can be stacked in layers, giving what is called a multi-layer perceptron, which is very helpful for many problems.

As for what stacking perceptrons actually means, let us not worry about that for the moment; let us first think about the XOR problem in terms of logic circuits.

(Figure: truth tables of the AND, NAND, OR, and XOR gates, from Introduction to Deep Learning: Python-based Theory and Implementation)

The figure above shows the truth tables of the AND gate, the NAND gate, the OR gate, and the XOR gate. We have already explained how to solve the first three perfectly with the perceptron and implemented them in code, but the XOR gate still cannot be solved in the way described above.

We might as well change our way of thinking about the XOR problem: can the XOR gate be built by combining two or even three of the AND gate, the NAND gate, and the OR gate? The requirement we want to implement now looks like this:

In the figure above, the left subfigure shows the symbols of the AND gate, the NAND gate, and the OR gate, and the right subfigure shows the target XOR circuit. Can we fill the three gates from the left figure into the "?" positions so that the final output matches the XOR gate?

==================== Reader thinking boundary =====================

OK, after a short think, readers will surely have the answer already. At worst, there are only six ways to fill them in, so you can always find the answer by enumeration.

In fact, we can fill the "?" positions, from top to bottom and left to right, with the NAND gate, the OR gate, and the AND gate in turn. The details are as follows:

To be on the safe side, let's check whether this really solves the XOR problem. Assuming that $x_1, x_2$ denote the initial input signals, then:

  • $s_1$ denotes the output of the NAND gate applied to $x_1, x_2$
  • $s_2$ denotes the output of the OR gate applied to $x_1, x_2$
  • $y$ denotes the output of the AND gate applied to $s_1, s_2$, i.e. the final output of the XOR gate

Feeding in the four combinations of the signals $x_1, x_2$, we obtain the following results:
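| $x_1$ | $x_2$ | $s_1$ (NAND) | $s_2$ (OR) | $y$ (AND) |
| --- | --- | --- | --- | --- |
| 0 | 0 | 1 | 0 | 0 |
| 1 | 0 | 1 | 1 | 1 |
| 0 | 1 | 1 | 1 | 1 |
| 1 | 1 | 0 | 1 | 0 |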

It turns out that the final output of our model is exactly the same as the output of the XOR gate; in other words, we can realize the XOR gate by combining the AND gate, the NAND gate, and the OR gate.

And since we have already implemented the AND gate, the NAND gate, and the OR gate, we can implement the XOR gate very simply on that basis. The implementation process is as follows:

  1. Feed in the four AND-gate samples and train a parameter set $w_{11}, w_{12}, b_1$ that satisfies the AND gate
  2. Feed in the four NAND-gate samples and train a parameter set $w_{21}, w_{22}, b_2$ that satisfies the NAND gate
  3. Feed in the four OR-gate samples and train a parameter set $w_{31}, w_{32}, b_3$ that satisfies the OR gate
  4. Based on the three trained parameter sets, obtain the final XOR-gate output according to the model structure above

The demonstration results of the above process are as follows:

The complete code for solving the XOR gate problem:

import numpy as np
from matplotlib import pyplot as plt

%matplotlib inline

"""
Author: Taoye
WeChat official account: Cynical Coder
Explain: visualize the data and the decision line
Parameters: x_data: data attributes; y_label: labels of the data; w_1, w_2: weights; b: bias
"""
def show_result(x_data, y_label, w_1, w_2, b):
    plt.scatter(x_data[:, 0], x_data[:, 1], c = y_label, cmap = plt.cm.copper, linewidths = 10)
    line_x_1 = np.linspace(0, 1.2, 100)
    line_x_2 = (-b - w_1 * line_x_1) / w_2
    plt.plot(line_x_1, line_x_2)
    plt.show()

"""
Author: Taoye
WeChat official account: Cynical Coder
Explain: compute the perceptron output
Parameters: x_data: data attributes; w: weight vector; b: bias
Return: the perceptron's results, as a 0/1 column vector
"""
def out_result(x_data, w, b):
    result = np.matmul(x_data, np.mat(w).T) + b
    y = result > 0
    return y.astype(int)

"""
Author: Taoye
WeChat official account: Cynical Coder
Explain: update and iterate the w, b parameters
Parameters: x_data: data; y_label: data labels; max_iter: maximum number of iterations; learning_rate: learning rate; w: weight parameter; b: bias parameter
Return: the final w, b parameters
"""
def train(x_data, y_label, max_iter, learning_rate, w, b):
    for i in range(max_iter):
        result = out_result(x_data, w, b)
        delta = np.mat(y_label).T - result
        w = (w + (learning_rate * np.matmul(x_data.T, delta)).T)
        b += (learning_rate * delta).sum()
    return w, b

if __name__ == "__main__":
    x_data = np.array([[0, 0], [1, 0], [0, 1], [1, 1]])
    and_label = np.array([0, 0, 0, 1])       # AND gate output signals
    no_and_label = np.array([1, 1, 1, 0])    # NAND gate output signals
    or_label = np.array([0, 1, 1, 1])        # OR gate output signals

    w_1, b_1 = train(x_data, and_label, 10, 0.1, np.array([0, 0]), 0)       # parameter set of the AND gate trained by the perceptron
    w_2, b_2 = train(x_data, no_and_label, 10, 0.1, np.array([0, 0]), 0)    # parameter set of the NAND gate trained by the perceptron
    w_3, b_3 = train(x_data, or_label, 10, 0.1, np.array([0, 0]), 0)        # parameter set of the OR gate trained by the perceptron

    no_and_predict = out_result(x_data, w_2, b_2)    # results of the NAND-gate perceptron model
    or_predict = out_result(x_data, w_3, b_3)        # results of the OR-gate perceptron model
    xor_predict = out_result(np.concatenate((no_and_predict, or_predict), axis = 1), w_1, b_1)    # results of the XOR-gate perceptron model

    xor_label = np.array([0, 1, 1, 0])
    for index in range(xor_label.size):    # verify the multi-layer perceptron's solution of the XOR problem
        print("%d xor %d = %d, " % (x_data[index, 0], x_data[index, 1], xor_predict[index]),
              "%d == %d?, " % (xor_label[index], xor_predict[index]),
              xor_label[index] == xor_predict[index])

The perceptron is a very simple algorithm, and I believe readers have quickly grasped its structure. Simple as it may be, it is the stepping stone, or foundation, of the "temple of neural networks", which is why Taoye uses the perceptron as the door into deep learning.

A quick summary of this article:

We started with an introduction to what a perceptron is, giving an intuitive look at its basic structure and what it can do. Next, the principle of implementing the AND gate with a perceptron was explained in detail, and the parameter training process was analyzed through mathematical derivation (the gradient descent algorithm). After that, the AND-gate logic circuit was implemented in code, and the NAND gate and OR gate were obtained in the same way. In addition, at the end of the second section we raised the problem that a single-layer perceptron cannot solve the XOR gate, which served as the lead-in to the third section. Finally, we found that the XOR gate can be realized by combining logic circuits; in terms of model structure, this means using a multi-layer perceptron to solve the XOR problem, and the validity of the model was verified in code.

That’s all for this article.

I am still very sorry; Taoye has been busy with other things recently, feeling a bit blue about it ~

Still, Taoye does have a conscience, right?

We still have to uphold some "coder's virtue", right?? So the perceptron has been laid out for everyone loud and clear.

Neural networks are basically the same as the perceptron described above; the main difference is the activation function. The perceptron uses a step function as its activation function, while the activation function used in a neural network needs to be chosen according to the actual problem. In addition, neural networks make some other extensions.
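For example, here is a minimal sketch contrasting the perceptron's step function with the sigmoid function often used in neural networks (sigmoid is just one possible choice, shown only for illustration):

```python
import numpy as np

def step(x):
    """The perceptron's activation: outputs 0 or 1."""
    return (x > 0).astype(int)

def sigmoid(x):
    """A smooth activation commonly used in neural networks."""
    return 1 / (1 + np.exp(-x))

x = np.array([-1.0, 0.0, 1.0])
print(step(x))       # [0 0 1]
print(sigmoid(x))    # approximately [0.269 0.5 0.731]
```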

For details, please see the breakdown in the next chapter.

I am Taoye. I love studying and sharing, and I am keen on all kinds of technologies. I like anime, playing chess, listening to music, and chatting. I hope to use words to record my growth and the little moments of life, and also to make more like-minded friends in this circle. You are welcome to visit the WeChat official account: Cynical Coder.

With the handover to 2021 just around the corner, it’s time to start preparing for the year-end review.

I’ll see you next time. Bye

References:

[1] Introduction to Deep Learning: Python-based Theory and Implementation, Koki Saito, People's Posts and Telecommunications Press
[2] Zero-based Introduction to Deep Learning: The Perceptron: https://www.zybuluo.com/hanbingtao/note/433855
