“This is the fifth day of my participation in the Gwen Challenge in November. Check out the details: The Last Gwen Challenge in 2021.”

Perceptron

Today I’m going to talk about an older linear classification model, but it is also the basis of neural networks. A neural network can be seen as a stack of a large number of perceptrons: add a nonlinear activation function to the perceptron’s linear model and wire these simple classifiers together, and you get a neural network. In a sense, a neural network can also be seen as an application of swarm intelligence.

Linear and nonlinear problems

Linear separability describes a property of a data set. For a two-dimensional data set (two features), if a straight line can perfectly separate the two classes, the data set is linearly separable. For higher-dimensional data, if we can find a hyperplane that plays the role of that line and separates the data in the high-dimensional space, we also say the data is linearly separable.

In the figure above, A represents a linearly separable data set and B represents a linearly inseparable one.

The reason I bring up linear separability is that the perceptron can only solve linearly separable problems.
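To make this concrete, here is a tiny sketch (my own example, not from the original post) contrasting AND, which a single line can separate, with XOR, which no single line can:

```python
# Hypothetical illustration: AND is linearly separable, XOR is not.
# The line x1 + x2 - 1.5 = 0 separates the AND labels perfectly,
# but no single straight line can separate the XOR labels.
points = [(0, 0), (0, 1), (1, 0), (1, 1)]
and_labels = [-1, -1, -1, +1]   # linearly separable
xor_labels = [-1, +1, +1, -1]   # not linearly separable

for (x1, x2), y in zip(points, and_labels):
    score = x1 + x2 - 1.5        # a line that works for AND
    print((x1, x2), y, "correct" if (score > 0) == (y > 0) else "wrong")
```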

Data

Suppose we have a data set of samples like this:


$$D = \{(x_1, y_1),(x_2, y_2),\cdots,(x_i, y_i) \}_N$$

where $x_i \in \mathbb{R}^m$, i.e. each sample is an m-dimensional vector, and $y_i \in \{-1,+1\}$ is the binary label, with -1 and +1 each representing one class. The whole reason I chose -1 and +1 is that they make the calculations easy.
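For the examples later on, here is a small sketch of building such a data set with NumPy (the names `make_toy_data`, `N`, `m` are my own, just for illustration), with m-dimensional vectors $x_i$ and labels in $\{-1, +1\}$:

```python
import numpy as np

# A toy linearly separable data set D = {(x_i, y_i)}_N with labels in {-1, +1}.
def make_toy_data(N=100, m=2, seed=0):
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(N, m))           # each x_i is an m-dimensional vector
    w_true = rng.normal(size=m)           # a hidden separating direction
    y = np.where(X @ w_true > 0, 1, -1)   # labels are -1 or +1
    return X, y

X, y = make_toy_data()
print(X.shape, y[:10])
```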

Perceptron model

First of all, the perceptron is an error-driven model. The model simply tries to find a dividing line or dividing hyperplane of the form $w^T x$, where $w \in \mathbb{R}^m$ is also an m-dimensional vector; in other words, it looks for a line or plane that can separate the two classes of data.
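As a sketch of that decision rule (`predict` is a hypothetical helper of mine, not something defined in the post), the predicted class is just the sign of $w^T x$:

```python
import numpy as np

# Sketch of the perceptron decision rule: the sign of w^T x decides the class.
def predict(w, x):
    return 1 if w @ x > 0 else -1   # points with w^T x > 0 fall on the +1 side
```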

Perceptron objective function


$$L(w) = \sum_{x_i \in D} -y_iw^Tx_i$$
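A rough sketch of computing this objective is below; note that in the usual formulation the sum only runs over the misclassified points (for those, $-y_i w^T x_i$ is positive), so a classifier that gets everything right reaches $L(w) = 0$:

```python
import numpy as np

# Sketch of the objective above, restricted to misclassified points
# as is conventional for the perceptron loss.
def perceptron_loss(w, X, y):
    scores = X @ w                 # w^T x_i for every sample
    wrong = y * scores <= 0        # misclassified (or on the boundary)
    return np.sum(-y[wrong] * scores[wrong])
```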

First, we randomly initialize $w$, and then check each point $(x_i, y_i)$ one by one. If either of the two error cases below occurs, i.e. the point is misclassified, we update $w$. The first case is


$$w^Tx_i > 0 \\ y_i = -1$$

In this case we update, because when $w^Tx_i > 0$ the model places the sample on the positive side of the dividing hyperplane and predicts +1, while the true label $y_i$ is -1. That means $w$ cannot classify this sample correctly.


$$w_{new} = w_{old} - \eta x_i$$

Plugging the updated weight back into the score (taking $\eta = 1$ for simplicity):

$$(w - x_i)^Tx_i = w^Tx_i - ||x_i||^2$$

So after the update the score $w^Tx_i$ decreases by $||x_i||^2$, which pushes the sample back toward the correct side of the dividing hyperplane.
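A quick numeric check of that identity (my own toy numbers, with $\eta = 1$):

```python
import numpy as np

# After the update w_new = w - x_i, the score drops by exactly ||x_i||^2.
w = np.array([0.5, -1.0])
x_i = np.array([2.0, 1.0])

before = w @ x_i                    # w^T x_i = 0.0
after = (w - x_i) @ x_i             # (w - x_i)^T x_i = -5.0
print(before - after, x_i @ x_i)    # both print 5.0, i.e. ||x_i||^2
```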

The second case is symmetric, so I won’t explain it in as much detail here.


$$w^Tx_i < 0 \\ y_i = +1$$

In this case the update is


$$w_{new} = w_{old} + \eta x_i$$

We keep looping until all of the sample points are correctly classified, and then exit the loop, so the perceptron’s idea is pretty simple. The algorithm does not look at all the samples at once; it looks at the samples one by one. In fact, this is also how we usually learn.
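Putting the pieces above together, here is a minimal sketch of the whole loop (`train_perceptron` and its arguments are my own naming; it assumes the toy `(X, y)` data with labels in $\{-1, +1\}$ from earlier). The two update rules combine into the single form $w \leftarrow w + \eta\, y_i x_i$:

```python
import numpy as np

# Minimal sketch of the error-driven training loop described above:
# go through the samples one by one, and whenever a point is misclassified,
# nudge w. With y_i = -1 the step is w - eta*x_i, with y_i = +1 it is w + eta*x_i.
def train_perceptron(X, y, eta=1.0, max_epochs=100):
    w = np.random.default_rng(0).normal(size=X.shape[1])  # random initialization
    for _ in range(max_epochs):
        errors = 0
        for x_i, y_i in zip(X, y):
            if y_i * (w @ x_i) <= 0:       # the two error cases, combined
                w = w + eta * y_i * x_i    # covers both update rules
                errors += 1
        if errors == 0:                    # every sample classified correctly
            break
    return w

w = train_perceptron(X, y)
```

For linearly separable data this loop eventually stops making mistakes, which is exactly why the linear separability assumption at the start matters.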