The first problem

Linear regression approach

So we can do it by linear regression first

Let’s set up some data, train it, get a line

And then we can make a judgment

Can also be solved

But there are problems

When we retrain a model with this data

He’s going to have a problem, the complexity of the sample increases, and this method might not be good

So we need to use a new method, which is logistic regression

Logistic regression

So what do we do with logistic regression

We need to prepare a logistic regression equation

As the sample

explain

Since we are in this classification problem, we reduce the classification problem to a probability that he is in a certain category, then we judge him to be 1 when the probability is greater than 0.5, and 0 when the probability is less than 0.5.

How to use it

And we’re going to use these three numbers to figure out what we saw before, the drain of the pool

When x is equal to minus 20, 1 plus e to the 20 is very, very big, so 1 over 1 plus e to the 20 is very close to 0 and the rest of the data, if you take it in, you'll see that it fits the reality.Copy the code

That pool water is also a simple problem, and then look at the complex problem how to do

A complex problem

Let’s look at this case

So all of these are simple cases, so what do we do for a little bit more complicated cases

Let’s look at this problem

We can see that this classification, graphically, can be separated by a straight line

If we look at this line, we can see that the red area is above the line, and the blue area is below it

We can take the line, figure it out, and plug it in to the exponent of e

Like this,

And just to do that, let’s take this line

g((x,y)) = y -x - 1

If you plug the data into the logistic regression equation, you can figure out that if this point is above the line then e to the power will be positive, and if this point is below the line e to the power will be negative.

Is not found with the previous problem of water release

Now look at this situation

Let’s do the same thing. Let’s create a line that separates the two types of data

Let’s analyze that this is a circle with center 0,0 and radius 2

So g of (x,y) is equal to x squared plus y squared minus 4

Again, plug this g of x into the original logistic regression equation e to the exponent

And you get the logistic regression equation for the classification

Unified the

Let’s call that line that distinguishes the value or the classification of the point the decision boundary of this model. Okay

To summarize

Solve logistic regression model

Let me remind you of linear regression

First of all, in linear regression, we find a loss function.

Then, with the aim of finding the minimum point of the loss function, we adjust the loss function, give a step size, and perform gradient descent to know the convergence of the function.

Logistic regression

So we do the same thing with logistic regression, we find a loss function

But for our logistic regression, the actual numbers are 0 and 1. We would expect that if the prediction doesn’t match the reality, the loss function should be very large. And similarly, if they’re right, we want this value to be infinitely close to 0

Loss function solution

So let’s sort out our loss function

So with that function, we’re going to apply gradient descent

And for this task, let’s do it, so we’re essentially doing gradient descent for the boundary function. According to the shape of different boundary lines, it is determined to be a function of several times and several elements. Then, the data of a random point is given and adjusted according to the direction and step diameter of the gradient. The data is stored in a temporary data, and then the next point is found and compared with the temporary data until the two data are equal. I know the solution to this regression model.