Resources

  • Python3 introduction to machine learning

Article outline

The linear regression algorithm will be covered over the next three articles. This is the general outline of the series.

1. Simple linear regression

Take the housing price in Boston as an example. On this graph, the X-axis is the area of the house and the Y-axis is the price of the house. We assume that there is a linear relationship between the area of the house and the price of the house, so we can predict the housing price through the linear regression algorithm.
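Since the original figure is not reproduced here, the following is a minimal sketch (with made-up house areas and prices, purely for illustration) of what such a scatter plot might look like:

import numpy as np
import matplotlib.pyplot as plt

# made-up data: house area vs. price, roughly linear with some noise
np.random.seed(0)
area = np.random.uniform(30, 200, 50)
price = 3.0 * area + 50 + np.random.normal(0, 30, 50)

plt.scatter(area, price)
plt.xlabel("area of the house")
plt.ylabel("price of the house")
plt.show()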

Note: the graphs here are different from those of the previous KNN algorithms.

We learned in high school that a line can be expressed as y = ax + b.

The linear regression algorithm assumes that there is a line that fits our data as well as possible, and uses it to make predictions.

Now suppose we have found the equation of the line that best fits the data:

$$y = ax + b$$

For each sample point $x^{(i)}$, according to our line equation, the predicted value is:

$$\hat{y}^{(i)} = ax^{(i)} + b$$

while the true value is $y^{(i)}$.

If this equation really is the best-fitting one, then the total gap between the true values and the predicted values,

$$\sum_{i=1}^{m} \left| y^{(i)} - \hat{y}^{(i)} \right|$$

has to be the smallest.

But the absolute value is not differentiable everywhere, so we usually use the square instead, and the formula above can be written as:

$$\sum_{i=1}^{m} \left( y^{(i)} - \hat{y}^{(i)} \right)^2$$

So our current goal is to make

$$\sum_{i=1}^{m} \left( y^{(i)} - \hat{y}^{(i)} \right)^2$$

as small as possible. And because $\hat{y}^{(i)} = ax^{(i)} + b$, the goal of linear regression now becomes: find $a$ and $b$ such that

$$\sum_{i=1}^{m} \left( y^{(i)} - ax^{(i)} - b \right)^2$$

is as small as possible.
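To make this objective concrete, here is a minimal sketch (the data and the particular values of a and b are made up purely for illustration) that computes this sum of squared errors:

import numpy as np

def squared_error(a, b, x, y):
    # sum of squared differences between the true values y and the predictions y_hat = a*x + b
    y_hat = a * x + b
    return np.sum((y - y_hat) ** 2)

x = np.array([1., 2., 3., 4., 5.])
y = np.array([1., 3., 2., 3., 5.])
print(squared_error(0.8, 0.4, x, y))  # the loss for one particular choice of a and b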

Here we can summarize the basic idea behind this kind of learning algorithm, which is essentially optimization: find parameters that make the loss (that is, the error) as small as possible, so that the model fits all the data points as well as possible.

There is a whole discipline devoted to such ideas, called optimization, and a closely related field called convex optimization. Readers who want to dig deeper are encouraged to look into them.

So before we solve for a and b, let's first look at the least squares method.

1.1 Least squares method

We now want to find the minimum value of a function, which is essentially an extreme value of that function. Those of you who have studied calculus know that the easiest way to find an extreme value is to take the derivative with respect to each variable; the points where the derivatives are 0 are the extreme points.

Key point (knock on the blackboard!): that is the core of the whole derivation.

So now let's abbreviate

$$\sum_{i=1}^{m} \left( y^{(i)} - ax^{(i)} - b \right)^2$$

as

$$J(a, b)$$

and find its extreme value, namely:

$$\frac{\partial J(a, b)}{\partial a} = 0, \qquad \frac{\partial J(a, b)}{\partial b} = 0$$

First, let's take the derivative with respect to $b$:

Step 1: Take the derivative:

$$\frac{\partial J(a, b)}{\partial b} = \sum_{i=1}^{m} 2\left( y^{(i)} - ax^{(i)} - b \right)(-1) = 0$$

Step 2: Divide by $-2$ and split the sum:

$$\sum_{i=1}^{m} y^{(i)} - a\sum_{i=1}^{m} x^{(i)} - mb = 0$$

Step 3: Solve for $b$:

$$b = \bar{y} - a\bar{x}$$

Having solved for $b$, let's take the derivative with respect to $a$:

$$\frac{\partial J(a, b)}{\partial a} = \sum_{i=1}^{m} 2\left( y^{(i)} - ax^{(i)} - b \right)\left(-x^{(i)}\right) = 0$$

Substituting $b = \bar{y} - a\bar{x}$ gives:

$$\sum_{i=1}^{m} \left( y^{(i)} - ax^{(i)} - \bar{y} + a\bar{x} \right) x^{(i)} = 0$$

Now we rearrange this so that $a$ can be isolated:

$$\sum_{i=1}^{m} \left( x^{(i)} y^{(i)} - x^{(i)}\bar{y} \right) - a \sum_{i=1}^{m} \left( \left(x^{(i)}\right)^2 - \bar{x} x^{(i)} \right) = 0$$

After extracting $a$, we get its expression:

$$a = \frac{\sum_{i=1}^{m} \left( x^{(i)} y^{(i)} - x^{(i)}\bar{y} \right)}{\sum_{i=1}^{m} \left( \left(x^{(i)}\right)^2 - \bar{x} x^{(i)} \right)}$$

We have solved for $a$, but this form is not convenient enough, so we transform it once more. Using the fact that

$$\sum_{i=1}^{m} x^{(i)}\bar{y} = m\bar{x}\bar{y} = \sum_{i=1}^{m} \bar{x} y^{(i)} = \sum_{i=1}^{m} \bar{x}\bar{y}$$

we can transform the formula and end up with a slightly better form of $a$:

$$a = \frac{\sum_{i=1}^{m} \left( x^{(i)} - \bar{x} \right)\left( y^{(i)} - \bar{y} \right)}{\sum_{i=1}^{m} \left( x^{(i)} - \bar{x} \right)^2}$$

The final equations for $a$ and $b$ are as follows:

$$a = \frac{\sum_{i=1}^{m} \left( x^{(i)} - \bar{x} \right)\left( y^{(i)} - \bar{y} \right)}{\sum_{i=1}^{m} \left( x^{(i)} - \bar{x} \right)^2}, \qquad b = \bar{y} - a\bar{x}$$
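As a quick numerical sanity check (just a sketch, using np.polyfit as an independent reference), we can verify these closed-form expressions on a tiny data set:

import numpy as np

x = np.array([1., 2., 3., 4., 5.])
y = np.array([1., 3., 2., 3., 5.])

x_mean, y_mean = np.mean(x), np.mean(y)

# closed-form least squares solution derived above
a = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
b = y_mean - a * x_mean

print(a, b)
print(np.polyfit(x, y, deg=1))  # should give (approximately) the same slope and intercept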

1.2 Implementation of simple linear regression

Next, we’ll implement a simple linear regression algorithm in code.

This is defined as SimpleLinearRegression1 for a reason, because the algorithm will be optimized later. This is just the first version.

import numpy as np


class SimpleLinearRegression1:

    def __init__(self):
        # a_ is the slope and b_ is the intercept; both are learned in fit()
        self.a_ = None
        self.b_ = None

    def fit(self, x_train, y_train):
        assert x_train.ndim == 1, \
            "Simple Linear Regressor can only solve single feature data"
        assert len(x_train) == len(y_train), \
            "the size of x_train must be equal to the size of y_train"

        x_mean = np.mean(x_train)
        y_mean = np.mean(y_train)

        # accumulate the numerator and denominator of a with a for loop
        num = 0.0
        d = 0.0
        for x_i, y_i in zip(x_train, y_train):
            num += (x_i - x_mean) * (y_i - y_mean)
            d += (x_i - x_mean) ** 2

        self.a_ = num / d
        self.b_ = y_mean - self.a_ * x_mean

        return self

    def predict(self, x_predict):
        assert x_predict.ndim == 1, \
            "Simple Linear Regressor can only solve single feature data"
        assert self.a_ is not None and self.b_ is not None, \
            "must fit before predict"
        return np.array([self._predict(x) for x in x_predict])

    def _predict(self, x_single):
        # prediction for a single sample: y = a * x + b
        return self.a_ * x_single + self.b_

Once you understand the above, this code is not too difficult. Now let’s test it out:

  1. Fake a small test data set.

  2. Call the algorithm we've just written to fit the model and make predictions, as in the sketch below.
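Since the original notebook screenshots are not reproduced here, the following is a minimal sketch (with a small made-up data set) of what those two steps might look like:

import numpy as np
import matplotlib.pyplot as plt

# 1. fake a small test data set
x = np.array([1., 2., 3., 4., 5.])
y = np.array([1., 3., 2., 3., 5.])

# 2. fit our simple linear regression and predict
reg1 = SimpleLinearRegression1()
reg1.fit(x, y)
print(reg1.a_, reg1.b_)

x_predict = np.array([6.])
print(reg1.predict(x_predict))

# visualize the fitted line against the data
plt.scatter(x, y)
plt.plot(x, reg1.predict(x), color='r')
plt.show()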

1.3 Vectorization

Since the for loop is inefficient, we need to convert it into a vector calculation.

The numerator and denominator of $a$,

$$\sum_{i=1}^{m} \left( x^{(i)} - \bar{x} \right)\left( y^{(i)} - \bar{y} \right) \quad \text{and} \quad \sum_{i=1}^{m} \left( x^{(i)} - \bar{x} \right)^2,$$

can each be viewed as the product of two vectors. If we define the two vectors $w$ and $v$ as:

$$w^{(i)} = x^{(i)} - \bar{x}, \qquad v^{(i)} = y^{(i)} - \bar{y}$$

then the formula can be transformed into:

$$a = \frac{w \cdot v}{w \cdot w}$$

So we can use vector (dot product) operations in NumPy.

Now we can also see why we transformed $a$ into that form: it converts easily into vector operations.

1.4 Implementation of the vectorized algorithm

Next, we modify the linear regression algorithm previously written with the for loop: we just change the for loop into a dot product, as in the sketch below.
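The original code screenshot is not reproduced here; the following is a sketch of what the vectorized version (named SimpleLinearRegression2 here, a hypothetical name) might look like, differing from the first version only in how a is computed:

import numpy as np


class SimpleLinearRegression2:
    """Vectorized version: the for loop is replaced by dot products."""

    def __init__(self):
        self.a_ = None
        self.b_ = None

    def fit(self, x_train, y_train):
        assert x_train.ndim == 1, \
            "Simple Linear Regressor can only solve single feature data"
        assert len(x_train) == len(y_train), \
            "the size of x_train must be equal to the size of y_train"

        x_mean = np.mean(x_train)
        y_mean = np.mean(y_train)

        # a = (w . v) / (w . w), where w = x - x_mean and v = y - y_mean
        self.a_ = (x_train - x_mean).dot(y_train - y_mean) / \
                  (x_train - x_mean).dot(x_train - x_mean)
        self.b_ = y_mean - self.a_ * x_mean

        return self

    def predict(self, x_predict):
        assert x_predict.ndim == 1, \
            "Simple Linear Regressor can only solve single feature data"
        assert self.a_ is not None and self.b_ is not None, \
            "must fit before predict"
        return self.a_ * x_predict + self.b_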

1.5 Performance Test

Next, we test the performance of two linear regression algorithms.

Fitting 100,000 samples takes about 1.1 s with the for-loop version, while the vectorized version takes only about 20 ms. That is a big gap.
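A sketch of how such a timing comparison could be run (the exact numbers will of course depend on the machine; SimpleLinearRegression2 is the hypothetical vectorized class from the sketch above):

import numpy as np
import timeit

m = 100000
big_x = np.random.random(size=m)
big_y = big_x * 2.0 + 3.0 + np.random.normal(size=m)

reg1 = SimpleLinearRegression1()
reg2 = SimpleLinearRegression2()

# time one fit of each implementation on the same data
t1 = timeit.timeit(lambda: reg1.fit(big_x, big_y), number=1)
t2 = timeit.timeit(lambda: reg2.fit(big_x, big_y), number=1)
print("for loop: %.3f s, vectorized: %.3f s" % (t1, t2))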


In the next article, we will cover the evaluation metrics for regression algorithms. Keep it up!

There is still a long way to go.