5.3 Unary linear regression practice

5.3.1 Simple example of unary linear regression

In the previous article, we introduced the example of housing price prediction, whose final model has the form

$$h_\theta(x) = \theta_0 + \theta_1 x$$

We first implement the simplest approach, solving for the coefficients with the least squares method derived there:

$$\theta_1 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2}, \qquad \theta_0 = \bar{y} - \theta_1\bar{x}$$

For coding purposes we use the formulas above. Here is the code.

# -*- coding: utf-8 -*-
import numpy as np
import matplotlib.pyplot as plt
import matplotlib

matplotlib.rcParams['font.sans-serif'] = ['SimHei']
matplotlib.rcParams['font.family'] = 'sans-serif'
matplotlib.rcParams['axes.unicode_minus'] = False # is used to display the minus sign normally

"""
Parameters: x - x data set; y - y data set
Returns: ws - regression coefficients
"""
def fitSLR(x, y):
    # length of the data set
    n = len(x)
    denominator = 0
    numerator = 0

    for i in range(0, n):
        numerator += (x[i] - np.mean(x)) * (y[i] - np.mean(y))
        denominator += (x[i] - np.mean(x)) ** 2

    #print("numerator:" + str(numerator))
    #print("denominator:" + str(denominator))

    b1 = numerator / float(denominator)
    # intercept: b0 = mean(y) - b1 * mean(x)
    b0 = np.mean(y) - b1 * np.mean(x)

    ws = np.mat([[b0], [b1]])
    # return the coefficients
    return ws


"""
Parameters: xArr - x data set; yArr - y data set
Returns: ws - regression coefficients
"""
def MatfitSLR(xArr, yArr):
    # convert to mat matrices
    xMat = np.mat(xArr); yMat = np.mat(yArr)

    xTx = xMat.T * xMat     # compute the regression coefficients according to the formula derived earlier
    if np.linalg.det(xTx) == 0.0:
        print("The matrix is singular and cannot be inverted.")
        return
    ws = xTx.I * (xMat.T * yMat)

    print(ws)

    return ws
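# For reference, MatfitSLR above implements the matrix (normal-equation) solution
# derived in the previous article:  ws = (X^T X)^{-1} X^T y,
# which is exactly xTx.I * (xMat.T * yMat) in the code.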

"""
Parameters: x - input value; ws - regression coefficients
Returns: the predicted value
"""
# y = b0 + x * b1
def predict(x, ws):
    return ws[0] + x * ws[1]

"""
Parameters: xArr - x data set; yArr - y data set; ws - regression coefficients
Returns: None
"""
def plotRegression(xArr, yArr, ws):
    # create the xMat matrix
    xMat = np.mat(xArr)
    # create the yMat matrix
    yMat = np.mat(yArr)
    # deep copy the xMat matrix
    xCopy = xMat.copy()

    xCopy.sort(0)  # sort

    # compute the corresponding y values
    yHat = xCopy * ws

    fig = plt.figure()

    # add subplot
    ax = fig.add_subplot(111)

    # draw the regression line
    ax.plot(xCopy[:, 1], yHat, c = 'red')

    # draw the sample points
    ax.scatter(xMat[:, 1].flatten().A[0], yMat.flatten().A[0], s = 20, c = 'blue', alpha = .5)
    plt.title('DataSet')     # plot title
    plt.xlabel('X')
    plt.ylabel('Y')
    plt.show()

# test
if __name__ == '__main__':
    ## Step 1: load data...
    print("Step 1: load data...")
    x = [1, 3, 2, 1, 3]
    y = [14, 24, 18, 17, 27]
    
    xx = np.mat(x).T    
    m, n = xx.shape
    # connect the first column of the matrix 1 to the original X
    X = np.concatenate((np.ones((m,1)), xx), axis=1)   
    #print(X) 
    
    # create the xMat matrix
    xMat = np.mat(X)                                                    
    # create yMat matrix
    yMat = np.mat(y).T
    #print(xMat)
    #print(yMat)
    
    ## Step 2: training...
    print("Step 2: training...")
    ws = fitSLR(x, y)
    # Matrix solution parameters
    #ws = MatfitSLR(xMat,yMat)
    
    ## Step 3: testing...
    print("Step 3: testing...")
    y_predict = predict(6, ws)
    
    ## Step 4: show the plot...
    print("Step 4: show the plot...")
    plotRegression(X, y ,ws)
    
    ## Step 5: show the result...
    print("Step 5: show the result...")
    print("y_predict:"+str(y_predict))

The results are shown below.

The complete code is available in LR_simple\lr_simple.py.

This example is fairly simple, so I will not go into more detail here; next, the same problem is worked through with gradient descent.

5.3.2 Unary linear regression example details

First we need to load the data set. Before loading it, let's take a look at its contents.



The first column is all 1.0, i.e. x0. The second column is x1, the x-axis data. The third column is the y-axis data, i.e. the target value y. Let's plot the data first to see its distribution. The code is as follows.

# -*- coding: utf-8 -*-
import numpy as np
import matplotlib.pyplot as plt
import matplotlib

matplotlib.rcParams['font.sans-serif'] = ['SimHei']
matplotlib.rcParams['font.family'] = 'sans-serif'
matplotlib.rcParams['axes.unicode_minus'] = False # is used to display the minus sign normally

"""
Parameters: filename - file name
Returns: xArr - x data set; yArr - y data set
"""
def loadDataSet(filename) :
    X = []
    Y = []
    with open(filename, 'rb') as f:
        for idx, line in enumerate(f):
            line = line.decode('utf-8').strip()
            if not line:
                continue
                
            eles = line.split()
            if idx == 0:
                numFeature = len(eles)
            # convert data to float
            eles = list(map(float, eles)) 
            
            # feature, append(list)
            X.append(eles[:-1])   
            # The last column is the actual value, ditto
            Y.append([eles[-1]])    
     
    # Convert X,Y lists into matrices
    xArr = np.array(X)
    yArr = np.array(Y)
    
    return xArr,yArr

"""
Parameters: None
Returns: None
"""
def plotDataSet():
    # load the data set
    xArr, yArr = loadDataSet('data.txt')
   
    n = len(xArr)	# number of data
    xcord = []; ycord = []		# sample points
    
    for i in range(n):													
        xcord.append(xArr[i][1]); ycord.append(yArr[i])	# sample points

    fig = plt.figure()
    ax = fig.add_subplot(111)	# add subplot
    # Draw sample points
    ax.scatter(xcord, ycord, s = 20, c = 'blue', alpha = .5)
			
    plt.title('DataSet')		# drawing title
    plt.xlabel('X')
    plt.ylabel('Y')
    plt.show()
# test
if __name__ == '__main__':
    plotDataSet()

The results are shown below.

Figure 1

By visualizing the data, we can see how it is distributed. Next, define the cost function as follows:

$$J(\theta_0, \theta_1) = \frac{1}{2m}\sum_{i=1}^{m}\left(h_{\theta}(x^{(i)}) - y^{(i)}\right)^2$$

while

$$\left(h_{\theta}(x^{(i)}) - y^{(i)}\right)^2 = \left(h_{\theta}(x^{(i)}) - y^{(i)}\right)^T \cdot \left(h_{\theta}(x^{(i)}) - y^{(i)}\right)$$

that is,

$$\begin{bmatrix} (\hat{y}^{(1)}-y^{(1)}) & (\hat{y}^{(2)}-y^{(2)}) & \cdots & (\hat{y}^{(m)}-y^{(m)}) \end{bmatrix} \cdot \begin{bmatrix} (\hat{y}^{(1)}-y^{(1)}) \\ (\hat{y}^{(2)}-y^{(2)}) \\ \vdots \\ (\hat{y}^{(m)}-y^{(m)}) \end{bmatrix}$$

The Python code is as follows.

"""
Parameters: theta, X, Y
Returns: the cost J(theta)
"""
def J(theta, X, Y):
    # define the cost function
    m = len(X)
    return np.sum(np.dot((predict(theta, X) - Y).T, (predict(theta, X) - Y)) / (2 * m))
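Note that J calls a predict helper, which is defined a little further below in this section; for the snippet above to run on its own, that one-line function is needed as well:

def predict(theta, X):
    return np.dot(X, theta)  # X already contains the column of ones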

Next define the gradient descent (BGD) formula:

$$\theta_0 := \theta_0 - \alpha\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)$$

$$\theta_1 := \theta_1 - \alpha\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)\cdot x^{(i)}$$
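In vectorized form, which is what the code below actually implements, the two updates combine into a single matrix expression:

$$\theta := \theta - \frac{\alpha}{m}X^T\left(X\theta - Y\right)$$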

The Python code is shown below.

# -*- coding: utf-8 -*-
import numpy as np

"""
Parameters: filename - file name
Returns: xArr - x data set; yArr - y data set
"""
def loadDataSet(filename) :
    X = []
    Y = []
    with open(filename, 'rb') as f:
        for idx, line in enumerate(f):
            line = line.decode('utf-8').strip()
            if not line:
                continue
                
            eles = line.split()
            if idx == 0:
                numFeature = len(eles)
            # convert data to float
            eles = list(map(float, eles)) 
            
            # feature, append(list)
            X.append(eles[:-1])   
            # The last column is the actual value, ditto
            Y.append([eles[-1]])    
     
    # Convert X,Y lists into matrices
    xArr = np.array(X)
    yArr = np.array(Y)
    
    return xArr,yArr

"""
Parameters: theta, X, Y
Returns: the cost J(theta)
"""
def J(theta, X, Y):
    # define the cost function
    m = len(X)
    return np.sum(np.dot((predict(theta, X) - Y).T, (predict(theta, X) - Y)) / (2 * m))

"""
Parameters: alpha - learning rate (step size); maxloop - maximum number of iterations;
            epsilon - threshold used to judge convergence; X, Y - data set
Returns: theta, costs, thetas
"""
def bgd(alpha, maxloop, epsilon, X, Y):
    m, n = X.shape  # m is the number of samples, n the number of features, which is 2 here

    # initialize the parameters to zero
    theta = np.zeros((2, 1))

    count = 0  # number of iterations
    converged = False  # convergence flag
    cost = np.inf  # initialize the cost to infinity
    costs = []  # record the cost of each iteration
    thetas = {0: [theta[0, 0]], 1: [theta[1, 0]]}  # record each round of theta updates
    
    while count <= maxloop:
        if converged:
            break
        # update theta
        count = count + 1

        # separate (per-parameter) calculation
        #theta0 = theta[0, 0] - alpha / m * (predict(theta, X) - Y).sum()
        #theta1 = theta[1, 0] - alpha / m * (np.dot(X[:, 1][:, np.newaxis].T, (predict(theta, X) - Y))).sum()  # key note
        # sync update
        #theta[0, 0] = theta0
        #theta[1, 0] = theta1
        #thetas[0].append(theta0)
        #thetas[1].append(theta1)

        # vectorized calculation
        theta = theta - alpha / (1.0 * m) * np.dot(X.T, (predict(theta, X) - Y))
        # X.T : n*m , (predict(theta, X) - Y) : m*1 , np.dot(X.T, (predict(theta, X) - Y)) : n*1
        # sync update
        thetas[0].append(theta[0])
        thetas[1].append(theta[1])

        # update the current cost
        cost = J(theta, X, Y)
        costs.append(cost)

        # if the cost has dropped below the threshold, treat it as converged and stop iterating
        if cost < epsilon:
            converged = True
    return theta, costs, thetas 


# test
if __name__ == '__main__':
    X, Y = loadDataSet('data.txt')
    alpha = 0.02  # learning rate (step size)
    maxloop = 1500  # maximum number of iterations
    epsilon = 0.01  # convergence threshold

    result = bgd(alpha, maxloop, epsilon, X, Y)
    # theta holds the optimal parameters, costs the cost of each iteration, thetas every theta update
    theta, costs, thetas = result
    print(theta)

The results are shown below.

We can also compute the regression coefficient vector with the calculation method derived in the previous article. The next step is to define the model function used for prediction:

$$h_\theta(x) = \theta_0 + \theta_1 x$$

where $h_\theta(x)$ is the model function used for prediction, $\theta_0$ and $\theta_1$ are the model parameters, and $x$ is the actual feature value. The prediction can be written as $\theta_0 \times 1 + \theta_1 \times x^{(i)},\ i = 1, 2, \dots, m$, where $m$ is the total number of samples, namely:

$$\begin{bmatrix} 1 & x^{(1)} \\ 1 & x^{(2)} \\ \vdots & \vdots \\ 1 & x^{(m)} \end{bmatrix} \cdot \begin{bmatrix} \theta_0 \\ \theta_1 \end{bmatrix} = \begin{bmatrix} y^{(1)} \\ y^{(2)} \\ \vdots \\ y^{(m)} \end{bmatrix}$$

The Python code is as follows.

"""
Parameters: theta, X
Returns: the predicted values
"""
def predict(theta, X):
    return np.dot(X, theta)  # here X is the processed X, which already includes the column of ones

Draw the regression line from the regression coefficient vector.

"""
Parameters: xArr - x data set; yArr - y data set; ws - regression coefficients
Returns: None
"""
def plotRegression(xArr, yArr, ws):

    xMat = np.mat(xArr)    # create the xMat matrix
    yMat = np.mat(yArr)    # create the yMat matrix
    xCopy = xMat.copy()    # deep copy the xMat matrix
    xCopy.sort(0)          # sort
    yHat = xCopy * ws      # compute the corresponding y values
    fig = plt.figure()
    ax = fig.add_subplot(111)    # add subplot
    ax.plot(xCopy[:, 1], yHat, c = 'red')   # draw the regression line
    ax.scatter(xMat[:, 1].flatten().A[0], yMat.flatten().A[0], s = 20, c = 'blue', alpha = .5)  # draw the sample points
    plt.title('DataSet')     # plot title
    plt.xlabel('X')
    plt.show()

The results are shown below.

Figure 2

Plot the cost curve.

"""
Parameters: X, Y, costs, theta
Returns: None
"""
def plotloss(X, Y, costs, theta):

    # below are the predicted values on the training set
    XCopy = X.copy()
    XCopy.sort(0)  # axis=0 means sort within columns
    #print(XCopy[:,1].shape, yHat.shape, theta.shape)

    # find the maximum and minimum cost values to help set the y-axis range
    print(np.array(costs).max(), np.array(costs).min())

    plt.xlim(-1, 1600)  # maxloop is 1500
    plt.ylim(4, 20)
    plt.xlabel(u'Number of iterations')
    plt.ylabel(u'Cost function J')
    plt.plot(range(len(costs)), costs)

The results are as follows.

Figure 3

Plot the gradient descent process as follows.

"""
Parameters: X, Y, theta
Returns: None
Note: uses the global thetas recorded by bgd() in the main program.
"""
def plotbgd(X, Y, theta):

    # view the range of theta0
    print(np.array(thetas[0]).min(), np.array(thetas[0]).max())

    # view the range of theta1
    print(np.array(thetas[1]).min(), np.array(thetas[1]).max())

    # prepare grid data for visualizing gradient descent
    # matplotlib
    size = 100
    theta0Vals = np.linspace(-10, 10, size)
    theta1Vals = np.linspace(-2, 4, size)
    JVals = np.zeros((size, size))   # JVals holds the cost for every (theta0Vals, theta1Vals) pair
    for i in range(size):
        for j in range(size):
            col = np.matrix([[theta0Vals[i]], [theta1Vals[j]]])
            JVals[i, j] = J(col, X, Y)

    theta0Vals, theta1Vals = np.meshgrid(theta0Vals, theta1Vals)
    JVals = JVals.T

    # draw the 3D surface of the cost function
    contourSurf = plt.figure()
    ax = contourSurf.gca(projection='3d')

    ax.plot_surface(theta0Vals, theta1Vals, JVals,  rstride=2, cstride=2, alpha=0.3,
                    cmap=matplotlib.cm.rainbow, linewidth=0, antialiased=False)
    ax.plot(theta[0], theta[1], 'rx')
    ax.set_xlabel(r'$\theta_0$')
    ax.set_ylabel(r'$\theta_1$')
    ax.set_zlabel(r'$J(\theta)$')
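The function above only builds the figure; to display it, plt.show() still has to be called. Depending on your Matplotlib version, the 3-D projection may also need to be registered explicitly, and on recent releases fig.gca(projection='3d') has been removed in favour of fig.add_subplot(projection='3d'). A minimal usage sketch under those assumptions (not part of the original script):

from mpl_toolkits.mplot3d import Axes3D  # registers the '3d' projection on older Matplotlib versions

plotbgd(X, Y, theta)  # theta returned by bgd() above; thetas must also be in scope
plt.show()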

The results are shown below.

Figure 4.

To draw the contour diagram of the cost function, the Python code is as follows.

"""
Parameters: X, Y
Returns: None
Note: uses the global theta and thetas computed by bgd() in the main program.
"""
def plotinline(X, Y):

    # prepare grid data for visualizing gradient descent
    size = 100
    theta0Vals = np.linspace(-10, 10, size)
    theta1Vals = np.linspace(-2, 4, size)
    JVals = np.zeros((size, size))   # JVals holds the cost for every (theta0Vals, theta1Vals) pair
    for i in range(size):
        for j in range(size):
            col = np.matrix([[theta0Vals[i]], [theta1Vals[j]]])
            JVals[i, j] = J(col, X, Y)

    theta0Vals, theta1Vals = np.meshgrid(theta0Vals, theta1Vals)
    JVals = JVals.T

    # draw the contour map of the cost function
    #matplotlib inline
    plt.figure(figsize=(12, 6))
    CS = plt.contour(theta0Vals, theta1Vals, JVals, np.logspace(-2, 3, 30), alpha=.75)
    plt.clabel(CS, inline=1, fontsize=10)

    # draw the optimal solution
    plt.plot(theta[0, 0], theta[1, 0], 'rx', markersize=10, linewidth=3)

    # plot the gradient descent path
    plt.plot(thetas[0], thetas[1], 'rx', markersize=3, linewidth=1)  # the theta chosen at each step
    plt.plot(thetas[0], thetas[1], 'r-', markersize=3, linewidth=1)  # connect the steps with a line

The results are shown below.

Figure 5

The above completes the test on the regression data. So how do we judge how well the fitted curve fits the data? We can of course judge by eye from experience; in addition, we can compute the correlation between the predicted values and the true values. The code is as follows.

"""
Parameters: X, Y
Returns: the Pearson correlation coefficient between X and Y
"""
def computeCorrelation(X, Y) :
    xBar = np.mean(X)
    yBar = np.mean(Y)
    SSR = 0
    varX = 0
    varY = 0
    for i in range(0 , len(X)):
        diffXXBar = X[i] - xBar
        diffYYBar = Y[i] - yBar
        SSR += (diffXXBar * diffYYBar)
        varX +=  diffXXBar**2
        varY += diffYYBar**2
    
    SST = math.sqrt(varX * varY)
    return SSR / SST

The correlation can also be computed with NumPy's corrcoef method. The results are as follows.
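A minimal sketch of the NumPy alternative, assuming yHat and yMat hold the predicted and actual values as column matrices, as in the code above:

# np.corrcoef expects 1-D (or row) vectors, so flatten the column matrices first
r = np.corrcoef(np.asarray(yHat).ravel(), np.asarray(yMat).ravel())
print(r)  # the off-diagonal entries are the Pearson correlation coefficient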

The complete code is in LR\lr_v1.0\lr_v1.0.py. The best-fit line method models the data as a straight line and performs quite well, but the fit to the data in Figure 6 does not look ideal. We can adjust the prediction locally according to the data; this method is introduced below.

Locally weighted linear regression

One problem with linear regression is that it may underfit the data, because it is an unbiased estimate with minimum mean squared error. Obviously, if the model underfits, it cannot achieve the best prediction performance. Some methods therefore allow a little bias to be introduced into the estimate in order to reduce the mean squared error of the prediction.

One of these methods is Locally Weighted Linear Regression (LWLR). In this algorithm, we assign a weight to each point near the point to be predicted, and then perform ordinary least-squares regression on this weighted subset. Like kNN, this algorithm has to select the corresponding data subset anew for each prediction. The algorithm solves for the regression coefficient $\theta$ in the form:

$$\hat{\theta} = (X^TWX)^{-1}X^TWY$$

where $W$ is a matrix used to assign a weight to each data point. LWLR uses a "kernel" (similar to the kernel in support vector machines) to give higher weight to nearby points. The type of kernel can be chosen freely; the most commonly used is the Gaussian kernel, whose weights are

$$w(j, j) = \exp\left(-\frac{\left|x^{(j)} - x\right|^2}{2k^2}\right)$$

In this way, we can implement locally weighted linear regression from the formulas above. By changing the value of k, we can adjust the regression effect. The code is as follows:

"""
Function description: plot several locally weighted regression curves
Parameters: xArr - x data set; yArr - y data set
Returns: None
"""
def plotlwlrRegression(xArr, yArr):

    yHat_1 = lwlrTest(xArr, xArr, yArr, 1.0)       # compute yHat with locally weighted linear regression
    yHat_2 = lwlrTest(xArr, xArr, yArr, 0.01)      # compute yHat with locally weighted linear regression
    yHat_3 = lwlrTest(xArr, xArr, yArr, 0.003)     # compute yHat with locally weighted linear regression
    xMat = np.mat(xArr)                            # create the xMat matrix
    yMat = np.mat(yArr)                            # create the yMat matrix
    srtInd = xMat[:, 1].argsort(0)                 # sort and return the index values
    xSort = xMat[srtInd][:, 0, :]
    fig, axs = plt.subplots(nrows=3, ncols=1, sharex=False, sharey=False, figsize=(10, 8))

    axs[0].plot(xSort[:, 1], yHat_1[srtInd], c = 'red')    # draw the regression curve
    axs[1].plot(xSort[:, 1], yHat_2[srtInd], c = 'red')    # draw the regression curve
    axs[2].plot(xSort[:, 1], yHat_3[srtInd], c = 'red')    # draw the regression curve
    axs[0].scatter(xMat[:,1].flatten().A[0], yMat.flatten().A[0], s = 20, c = 'blue', alpha = .5)   # draw the sample points
    axs[1].scatter(xMat[:,1].flatten().A[0], yMat.flatten().A[0], s = 20, c = 'blue', alpha = .5)   # draw the sample points
    axs[2].scatter(xMat[:,1].flatten().A[0], yMat.flatten().A[0], s = 20, c = 'blue', alpha = .5)   # draw the sample points

    # set the titles, x label, y label
    axs0_title_text = axs[0].set_title(u'locally weighted regression curve, k=1.0')
    axs1_title_text = axs[1].set_title(u'locally weighted regression curve, k=0.01')
    axs2_title_text = axs[2].set_title(u'locally weighted regression curve, k=0.003')

    plt.setp(axs0_title_text, size=8, weight='bold', color='red')
    plt.setp(axs1_title_text, size=8, weight='bold', color='red')
    plt.setp(axs2_title_text, size=8, weight='bold', color='red')

    plt.xlabel('X')
    plt.show()

"""
Function description: compute a locally weighted linear regression prediction at a test point
Parameters: testPoint - the point to predict; xArr - x data set; yArr - y data set;
            k - Gaussian kernel bandwidth, user-defined
Returns: the predicted value testPoint * ws, where ws are the regression coefficients
"""
def lwlr(testPoint, xArr, yArr, k = 1.0):

    xMat = np.mat(xArr); yMat = np.mat(yArr)
    m = np.shape(xMat)[0]
    weights = np.mat(np.eye((m)))    # create a diagonal weight matrix
    for j in range(m):       # traverse the data set and compute each sample's weight
        diffMat = testPoint - xMat[j, :]
        weights[j, j] = np.exp(diffMat * diffMat.T / (-2.0 * k**2))
    xTx = xMat.T * (weights * xMat)
    if np.linalg.det(xTx) == 0.0:
        print("The matrix is singular and cannot be inverted.")
        return
    ws = xTx.I * (xMat.T * (weights * yMat))    # compute the regression coefficients
    return testPoint * ws

"""
Function description: locally weighted linear regression test
Parameters: testArr - test data set; xArr - x data set; yArr - y data set;
            k - Gaussian kernel bandwidth, user-defined
Returns: yHat - the predicted values
"""
def lwlrTest(testArr, xArr, yArr, k=1.0):
    m = np.shape(testArr)[0]    # size of the test data set
    yHat = np.zeros(m)

    for i in range(m):          # make a prediction for each sample point
        yHat[i] = lwlr(testArr[i], xArr, yArr, k)
    return yHat

The results are shown below.

Figure 6.

As you can see, the smaller k is, the closer the fit. But when k is too small, overfitting appears, for example when k = 0.003. The complete code is in LR\lr_v1.1\lr_v1.py.
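One way to make that comparison quantitative rather than purely visual is to compute the squared prediction error for each k on points that were held out of the fit. The following is only a sketch under assumptions: it reuses loadDataSet and lwlrTest from above, while the rss_error helper and the half-and-half split are illustrative choices, not part of the original code.

def rss_error(y_true, y_hat):
    # sum of squared differences between actual and predicted values
    return ((y_true - y_hat) ** 2).sum()

xArr, yArr = loadDataSet('data.txt')
split = len(xArr) // 2                      # hypothetical half/half split
train_x, test_x = xArr[:split], xArr[split:]
train_y, test_y = yArr[:split], yArr[split:]

for k in (1.0, 0.01, 0.003):
    y_hat = lwlrTest(test_x, train_x, train_y, k)
    print(k, rss_error(np.asarray(test_y).ravel(), y_hat))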

Calling the sklearn library to implement linear regression, the code is as follows.

# -*- coding: utf-8 -*-
import numpy as np
import matplotlib.pyplot as plt
import matplotlib
import math
from sklearn import linear_model

matplotlib.rcParams['font.sans-serif'] = ['SimHei']
matplotlib.rcParams['font.family'] = 'sans-serif'
matplotlib.rcParams['axes.unicode_minus'] = False # is used to display the minus sign normally

"""
Parameters: filename - file name
Returns: xArr - x data set; yArr - y data set
"""
def loadDataSet(filename) :
    X = []
    Y = []
    with open(filename, 'rb') as f:
        for idx, line in enumerate(f):
            line = line.decode('utf-8').strip()
            if not line:
                continue
                
            eles = line.split()
            if idx == 0:
                numFeature = len(eles)
            # convert data to float
            eles = list(map(float, eles)) 
            
            # feature, append(list)
            X.append(eles[:-1])   
            # The last column is the actual value, ditto
            Y.append([eles[-1]])    
     
    # Convert X,Y lists into matrices
    xArr = np.array(X)
    yArr = np.array(Y)
    
    return xArr,yArr

"""
Parameters: xArr - x data set; yArr - y data set
Returns: None
"""
def plotDataSet(xArr, yArr):
    n = len(xArr)	# number of data
    xcord = []; ycord = []		# sample points
    
    for i in range(n):													
        xcord.append(xArr[i][1]); ycord.append(yArr[i])	# sample points

    fig = plt.figure()
    ax = fig.add_subplot(111)	# add subplot
    # Draw sample points
    ax.scatter(xcord, ycord, s = 20, c = 'blue', alpha = .5)
			
    plt.title('DataSet')		# drawing title
    plt.xlabel('X')
    plt.ylabel('Y')
    plt.show()

"""
Parameters: theta, X
Returns: the predicted values
"""
def predict(theta, X):
    return np.dot(X, theta)  # here X is the processed X, which already includes the column of ones

"""
Parameters: X, Y, theta
Returns: None
"""
def plotRegression(X, Y, theta):

    xMat = np.mat(X)    # create the xMat matrix
    yMat = np.mat(Y)    # create yMat matrix
    xCopy = xMat.copy()          # Deep copy xMat matrix
    xCopy.sort(0)      # sort
    yHat = xCopy * theta      # Compute the corresponding y value
    
    fig = plt.figure()
    ax = fig.add_subplot(111)    # add subplot
    ax.plot(xCopy[:, 1], yHat, c = 'red')   # Draw a regression curve
    ax.scatter(xMat[:,1].flatten().A[0], yMat.flatten().A[0], s = 20, c = 'blue', alpha = .5)  # draw the sample points
    
    plt.title('DataSet')     # drawing title
    plt.xlabel('X')
    plt.ylabel('Y')
    plt.show()

"""
Parameters: X, Y, theta
Returns: None
"""
def plotRegression_new(X, Y, theta) :
    
    # Below are the predicted values of the training set
    XCopy = X.copy()
    XCopy.sort(0)  # axis=0 indicates in-column sorting
    yHat = predict(theta, XCopy)

    #print(XCopy[:,1].shape, yHat.shape, theta.shape)
    
    # Draw a regression line
    plt.title('DataSet')		# drawing title
    plt.xlabel(u'x')
    plt.ylabel(u'y')
    # Draw a regression line
    plt.plot(XCopy[:,1], yHat,color='r')
    # Draw sample points
    plt.scatter(X[:,1].flatten(), Y.T.flatten())
    plt.show()

"""
Parameters: X, Y
Returns: the Pearson correlation coefficient between X and Y
"""
def computeCorrelation(X, Y) :
    xBar = np.mean(X)
    yBar = np.mean(Y)
    SSR = 0
    varX = 0
    varY = 0
    for i in range(0 , len(X)):
        diffXXBar = X[i] - xBar
        diffYYBar = Y[i] - yBar
        SSR += (diffXXBar * diffYYBar)
        varX +=  diffXXBar**2
        varY += diffYYBar**2
    
    SST = math.sqrt(varX * varY)
    return SSR / SST

# test
if __name__ == '__main__':
    ## Step 1: load data...
    print("Step 1: load data...")
    X, Y = loadDataSet('data.txt')
    #print(X.shape)
    #print(Y.shape)
  
    ## Step 2: show plot...
    print("Step 2: show plot...")
    plotDataSet(X, Y)
    
    ## Step 3: Create linear regression object...
    print("Step 3: Create linear regression object...")
    #regr = linear_model.LinearRegression()
    
    # ridge regression
    regr = linear_model.Ridge(alpha=0.5, copy_X=True, fit_intercept=True, max_iter=None,
      normalize=False, random_state=None, solver='auto', tol=0.001)
    
    ## Step 4: training...
    print("Step 4: training...")
    regr.fit(X, Y)
    
    ## Step 5: testing...
    print("Step 5: testing...")
    X_predict = [1, 18.945]
    # At this point, the parameters are learned and the model is set. To predict new instances, the following can be done
    Y_predict = regr.predict([X_predict])
    
    ## Step 6: show the result...
    print("Step 6: show the result...")
    # the vector w = (w_1, ..., w_p) is stored in coef_, and w_0 in intercept_
    print("coef_:"+str(regr.coef_))
    print("intercept_:"+str(regr.intercept_)) 
    print("Y_predict:"+str(Y_predict))
    
    w0 = np.array(regr.intercept_)
    w1 = np.array(regr.coef_.T[1:2])
    
    # merge into a matrix
    theta = np.array([w0,w1])
    print(theta)
    
    ## Step 7: show Regression plot...
    print("Step 7: show Regression plot...")
    #plotRegression(X, Y, theta)
    plotRegression_new(X, Y, theta)
    
    ## Step 8: math Pearson...
    print("Step 8: math Pearson...")
    xMat = np.mat(X)                                                 # create the xMat matrix
    yMat = np.mat(Y)                                                 # create yMat matrix
    yHat = predict(theta, xMat)
    
    Pearson = computeCorrelation(yHat, yMat)
    print(Pearson)

Refer to the attachment for the complete code: LR\LR-sklearn_v2\LR-sklearn_v2.0.py.

The results are shown below.


Finally, let us summarize the general procedure for regression:
(1) Collect data: any method may be used.
(2) Prepare data: regression requires numeric data; nominal data must be converted into binary data.
(3) Analyze data: drawing a two-dimensional visualization of the data helps to understand and analyze it. After a shrinkage method produces new regression coefficients, the new fitted line can be drawn on the same plot for comparison.
(4) Train the algorithm: find the regression coefficients.
(5) Test the algorithm: use R² or the fit between the predicted values and the data to analyze the model's performance.
(6) Use the algorithm: with regression, a value can be predicted for a given input. This is an improvement over classification, because a continuous value rather than a discrete category label is predicted.
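A side note on step (5): for an ordinary least-squares fit, R² is simply the square of the Pearson correlation between the predicted and actual values, so it can be obtained directly from the computeCorrelation function defined above. A minimal sketch, reusing the yHat and yMat from the main program:

r = computeCorrelation(yHat, yMat)  # Pearson correlation between predictions and actual values
r_squared = r ** 2                  # coefficient of determination R^2 for an ordinary least-squares fit
print(r_squared)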


Video: www.youtube.com/watch?v=gNh… www.youtube.com/watch?v=owI…
