
Logistic regression

In the first part of this exercise, we will build a logistic regression model to predict whether a student will be admitted to college. Imagine that you are the administrator of a university department and want to determine each applicant's chance of admission based on their scores on two exams. You have a set of training samples from previous applicants that you can use to train logistic regression: for each training sample, you have the applicant's two exam scores and the final admission decision. To accomplish this prediction task, we will build a classification model that estimates the probability of admission from the two exam scores.

Import the required libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
Import ex2data1 into data
path="ex2data1.txt"
data=pd.read_table(path,header=None,names=["Exam 1"."Exam 2"."Admitted"],sep=', ')
data.head()
Copy the code
.dataframe tbody tr th:only-of-type { vertical-align: middle; } .dataframe tbody tr th { vertical-align: top; } .dataframe thead th { text-align: right; }

Exam 1 Exam 2 Admitted
0 34.623660 78.024693 0
1 30.286711 43.894998 0
2 35.847409 72.902198 0
3 60.182599 86.308552 1
4 79.032736 75.344376 1

Create a scatter plot of the Exam 1 and Exam 2 scores, using color to show whether each sample is positive (admitted) or negative (not admitted).

Extract admitted and unadmitted data
positive=data[data["Admitted"].isin([1]]The #isin function is used to extract the corresponding row
negative=data[data["Admitted"].isin([0]]# drawing
fig,ax=plt.subplots(figsize=(9.6))# Set graphics size
ax.scatter(positive["Exam 1"],positive["Exam 2"],s=50,c="blue",marker="o",label="Admitted")     # Set point coordinates, size, color, graphic, label
ax.scatter(negative["Exam 1"],negative["Exam 2"],s=50,c="red",marker="x",label="Not Admitted")
ax.legend(loc=1)# Legend in the upper right corner
ax.set_xlabel("Exam 1 Score")
ax.set_ylabel("Exam 2 Score")
plt.show()




It can be seen from the figure above that there is a fairly clear decision boundary between the two classes. Next, we implement logistic regression and train a model to predict the outcome.


The sigmoid function

$g$ denotes the sigmoid function: $g(z)=\frac{1}{1+e^{-z}}$

The logistic regression hypothesis is: $h_{\theta}(x)=g(\theta^{T}x)=\frac{1}{1+e^{-\theta^{T}x}}$

Sigmoid: g(z)
def sigmoid(z):
    return 1/(1+np.exp(-z))
Verify the g(z) function by plotting it
fig,ax=plt.subplots(figsize=(8,6))
test=np.arange(-10,10,step=0.5)
ax.plot(test,sigmoid(test),c='red')
plt.show()



The sigmoid function g(z) checks out, so next we write the cost function to evaluate the fit.


$J(\theta)=\frac{1}{m}\sum_{i=1}^{m}[-y^{(i)}\log(h_{\theta}(x^{(i)}))-(1-y^{(i)})\log(1-h_{\theta}(x^{(i)}))]$

ps:

np.multiply(): element-wise multiplication of arrays or matrices; the output has the same shape as the inputs

@: Performs matrix multiplication on matrices

*: element-wise multiplication for arrays; matrix multiplication for np.matrix objects
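A tiny demonstration of the difference between the three (an illustrative addition, not part of the original exercise code):

a=np.array([[1,2],[3,4]])
b=np.array([[10,20],[30,40]])
print(np.multiply(a,b))              # element-wise: [[10 40] [90 160]]
print(a@b)                           # matrix product: [[70 100] [150 220]]
print(a*b)                           # for arrays, element-wise, same as np.multiply
print(np.matrix(a)*np.matrix(b))     # for np.matrix, matrix product, same as @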

def cost(Theta,X,Y):  # X and Y are passed as DataFrames, Theta as an array
    # Convert X,Y from DataFrame to matrix and Theta from array to matrix
    Theta=np.matrix(Theta)
    X=np.matrix(X.values)
    Y=np.matrix(Y.values)
    first=np.multiply(-Y,np.log(sigmoid(X@Theta.T)))
    second=np.multiply(1-Y,np.log(1-sigmoid(X@Theta.T)))
    return np.sum(first-second)/len(X)
# Add the x0 column (x0 is always equal to 1)
data.insert(0,"Ones",1)
Extract data X,Y,Theta
cols=data.shape[1]
X=data.iloc[:,0:cols-1]
Y=data.iloc[:,cols-1:cols]
Theta=np.zeros(3)
Calculate the cost at the initial parameters (Theta = 0)
cost(Theta, X, Y)
0.6931471805599453

Next, design a function that computes the gradient for given training data, labels, and parameters theta:


Gradient descent

  • Batch gradient descent
  • Vectorized implementation (a vectorized sketch follows the gradient check below)


$\frac{\partial J(\theta)}{\partial \theta_{j}}=\frac{1}{m}\sum_{i=1}^{m}(h_{\theta}(x^{(i)})-y^{(i)})x_{j}^{(i)}$

def gradient(Theta,X,Y):  # X and Y are passed as DataFrames, Theta as an array

    # Convert X,Y from DataFrame to matrix and Theta from array to matrix
    Theta=np.matrix(Theta)
    X=np.matrix(X.values)
    Y=np.matrix(Y.values)
    # grad records the partial derivative for each element of the θ vector
    Theta_cnt=Theta.shape[1]
    grad=np.zeros(Theta.shape[1])
    # Calculate the error vector
    error=sigmoid(X*Theta.T)-Y
    for i in range(Theta_cnt):
        tmp=np.multiply(error,X[:,i])
        grad[i]=np.sum(tmp)/len(X)
    return grad

The above function does not actually perform gradient descent; it only computes one gradient evaluation. Here is the gradient at the initial parameters (all zeros):

gradient(Theta, X, Y)
array([ -0.1       , -12.00921659, -11.26284221])
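The vectorization bullet above can be made concrete: the same gradient can be computed without the Python loop. A minimal sketch (an addition, not the implementation used for the results in this article):

def gradient_vectorized(Theta,X,Y):      # X,Y are DataFrames, Theta is a 1-D array
    Theta=np.matrix(Theta)
    X=np.matrix(X.values)
    Y=np.matrix(Y.values)
    error=sigmoid(X*Theta.T)-Y           # (m,1) error vector
    return np.array((X.T*error).T/len(X)).ravel()   # (n,) gradient

gradient_vectorized(Theta, X, Y)         # should match the loop version above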

Since we are using Python, we can use SciPy's optimize module to find the optimal parameters from the cost and gradient functions.

Most commonly used parameters:

  • func: the objective function to be optimized
  • x0: the initial values
  • fprime: the gradient function of func (if the cost function returns only the cost, set fprime=gradient); otherwise func must return both the function value and the gradient, or approx_grad must be set to True
  • approx_grad: if set to True, the gradient is approximated numerically
  • args: a tuple of extra arguments passed to the objective function

Returns:

  • x: array, the solution of the optimization problem
  • nfeval: integer, the number of function evaluations (a function evaluation is counted every time the objective function is called during optimization; one iteration can involve several function evaluations, so this is not the number of iterations and is usually larger than it)
  • rc: int, the return code (see the SciPy documentation for its meaning)
SciPy's truncated Newton (TNC) implementation can find the optimal parameters:
import scipy.optimize as opt 
result=opt.fmin_tnc(func=cost, x0=Theta,fprime=gradient,args=(X,Y))
result
(array([-25.16131863,   0.20623159,   0.20147149]), 36, 0)
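As the parameter list notes, if only the cost is available, approx_grad=True lets fmin_tnc approximate the gradient numerically instead of using fprime. A quick sketch (slower, and an addition to the original code):

result_approx=opt.fmin_tnc(func=cost, x0=Theta, approx_grad=True, args=(X,Y))
result_approx    # should converge to roughly the same θ as above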

The cost function evaluated at this solution:

cost(result[0],X,Y)
0.20349770158947458

Draw the decision boundary

plot_x=np.linspace(30,100,100)
plot_y=(-result[0][0]-result[0][1]*plot_x)/result[0][2]
# drawing
fig,ax=plt.subplots(figsize=(9,6))  # set figure size
ax.plot(plot_x,plot_y,c='y',label='Prediction')
ax.scatter(positive["Exam 1"],positive["Exam 2"],s=50,c="blue",marker="o",label="Admitted")     # Set point coordinates, size, color, graphic, label
ax.scatter(negative["Exam 1"],negative["Exam 2"],s=50,c="red",marker="x",label="Not Admitted")
ax.legend(loc=1)# Legend in the upper right corner
ax.set_xlabel("Exam 1 Score")
ax.set_ylabel("Exam 2 Score")
plt.show()



The fit can also be done with opt.minimize, which lets you choose among different optimization algorithms.

Most commonly used parameters:

  • fun: the objective function to be optimized
  • x0: the initial values, a one-dimensional array of shape (n,)
  • args: tuple, optional, extra arguments passed to the objective function
  • method: the solver to use; choosing TNC gives behavior similar to fmin_tnc()
  • jac: a function that returns the gradient vector

Returns:

  • An optimization result object (OptimizeResult), with fields including:
  • x: the solution array of the optimization problem
  • success: whether the optimizer exited successfully; if not, a failure message is reported
result=opt.minimize(fun=cost, x0=Theta,args=(X,Y),method="TNC",jac=gradient)
result
     fun: 0.20349770158947458
     jac: array([8.95090947e-09, 8.17143290e-08, 4.76542717e-07])
 message: 'Local minimum reached (|pg| ~= 0)'
    nfev: 36
     nit: 17
  status: 0
 success: True
       x: array([-25.16131863,   0.20623159,   0.20147149])

The cost function evaluated at this solution:

cost(result["x"],X,Y)
0.20349770158947458

After obtaining the parameter θ, the model is used to predict whether a student will be admitted.

Next, write a function that outputs predictions for a dataset X using the learned parameters θ; we can then use it to measure the classifier's accuracy on the training set. Recall the logistic regression hypothesis: $h_{\theta}(x)=g(\theta^{T}x)=\frac{1}{1+e^{-\theta^{T}x}}$

When $h_{\theta}(x)\geq 0.5$, predict $y=1$;

When $h_{\theta}(x)<0.5$, predict $y=0$.

Let’s start building the prediction function predict

def predict(Theta,X):
    p=sigmoid(X@Theta.T)
    return [1 if x>=0.5 else 0 for x in p]

result=opt.fmin_tnc(func=cost, x0=Theta,fprime=gradient,args=(X,Y))
Y=np.matrix(Y.values)
# Note: X and Y start out as DataFrames and need to be converted to matrices or arrays where required

# result[0] is the learned θ
Theta_min = result[0]  # if opt.minimize had been used, result[0] would not be the θ array, so fmin_tnc is executed again above to get the required result[0]
predictions = predict(Theta_min, X)

correct = [1 if a==b else 0 for (a, b) in zip(predictions,Y)]
accuracy = float((sum(correct) / len(correct))*100)
print ("accuracy = {:.2f}%".format(accuracy))
accuracy = 89.00%

The logistic regression classifier predicts whether a student is admitted with 89 percent accuracy on the training set. Keep in mind that we did not hold out a test set or use cross-validation, so this number may be higher than the true generalization accuracy.
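As an extra sanity check (an addition to the original code), we can plug a hypothetical student straight into the hypothesis; with the θ found above, exam scores of 45 and 85 should give an admission probability of roughly 0.78:

prob=sigmoid(np.array([1,45,85])@result[0])    # x0=1, Exam 1 score 45, Exam 2 score 85
print("admission probability = {:.3f}".format(prob))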

Regularized logistic regression

In the second part of the exercise, we will improve the logistic regression algorithm by adding a regularization term. Regularization is a term in the cost function that makes the algorithm prefer "simpler" models (in this case, models with smaller coefficients). This helps reduce overfitting and improves the model's ability to generalize.

Suppose you are a product manager at a factory and you have test results for some microchips in two different tests. From these two tests, you want to determine whether the microchip should be accepted or rejected. To help you decide, you have a data set of past microchip test results from which you can build a logistic regression model.

First, data extraction:

# Data extraction
path="ex2data2.txt"
data2=pd.read_table(path,sep=",",header=None,names=["Test 1","Test 2","Accepted"])
data2.head()

Test 1 Test 2 Accepted
0 0.051267 0.69956 1
1 0.092742 0.68494 1
2 0.213710 0.69225 1
3 0.375000 0.50219 1
4 0.513250 0.46564 1

Plot the data points

Extract the accepted (Accepted=1) and rejected (Accepted=0) samples
positive=data2[data2["Accepted"].isin([1])]
negative=data2[data2["Accepted"].isin([0])]
# plot Accepted and Rejected
fig,ax=plt.subplots(figsize=(8,6))
ax.scatter(positive["Test 1"],positive["Test 2"],s=20,color="blue",marker="o",label="Accepted")
ax.scatter(negative["Test 1"],negative["Test 2"],s=20,color="red",marker="x",label="Rejected")
ax.legend(loc=1)
ax.set_xlabel("Test 1 Score")
ax.set_ylabel("Test 2 Score")
plt.show()



This data seems too complex to be properly separated by a straight line.

However, a linear technique such as logistic regression can still be used if we construct new features from polynomials of the original features.

Following the exercise PDF, we map the features onto all polynomial terms of x1 and x2 up to the sixth power:


$mapFeature(x)= \begin{bmatrix} 1 \\ x_{1}\\ x_{2}\\ x_{1}^{2}\\ x_2^{2}\\ x_{1}x_{2} \\ x_1^3\\ \vdots \\x_1x_2^5\\ x_2^6 \end{bmatrix}$

# create the polynomial features

# set the highest power to 6
degree=6

# extract the vectors x1,x2
x1=data2["Test 1"]
x2=data2["Test 2"]

# insert a column of ones into data2
data2.insert(3,"Ones",1)

for i in range(1,degree+1):
    for j in range(0,i+1):
        data2['F'+str(i)+str(j)]=np.power(x1,i-j)*np.power(x2,j)

# drop the original Test 1 and Test 2 columns
data2.drop("Test 1",axis=1,inplace=True)  # axis=1 drops a column; inplace defaults to False (original unchanged), True modifies data2 in place
data2.drop("Test 2",axis=1,inplace=True)

data2.head()

Accepted Ones F10 F11 F20 F21 F22 F30 F31 F32 ... F53 F54 F55 F60 F61 F62 F63 F64 F65 F66
0 1 1 0.051267 0.69956 0.002628 0.035864 0.489384 0.000135 0.001839 0.025089 ... 0.000900 0.012278 0.167542 1.815630e-08 2.477505e-07 0.000003 0.000046 0.000629 0.008589 0.117206
1 1 1 0.092742 0.68494 0.008601 0.063523 0.469143 0.000798 0.005891 0.043509 ... 0.002764 0.020412 0.150752 6.362953e-07 4.699318e-06 0.000035 0.000256 0.001893 0.013981 0.103256
2 1 1 0.213710 0.69225 0.045672 0.147941 0.479210 0.009761 0.031616 0.102412 ... 0.015151 0.049077 0.158970 9.526844e-05 3.085938e-04 0.001000 0.003238 0.010488 0.033973 0.110047
3 1 1 0.375000 0.50219 0.140625 0.188321 0.252195 0.052734 0.070620 0.094573 ... 0.017810 0.023851 0.031940 2.780914e-03 3.724126e-03 0.004987 0.006679 0.008944 0.011978 0.016040
4 1 1 0.513250 0.46564 0.263426 0.238990 0.216821 0.135203 0.122661 0.111283 ... 0.026596 0.024128 0.021890 1.827990e-02 1.658422e-02 0.015046 0.013650 0.012384 0.011235 0.010193

5 rows × 29 columns
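For comparison (not part of the original exercise), scikit-learn can build the same kind of polynomial expansion with sklearn.preprocessing.PolynomialFeatures, although its column ordering differs from the manual loop above; a sketch:

from sklearn.preprocessing import PolynomialFeatures

poly=PolynomialFeatures(degree=6)          # includes the bias column of ones
X_poly=poly.fit_transform(np.c_[x1,x2])    # 28 columns: 1, x1, x2, x1^2, x1*x2, x2^2, ...
print(X_poly.shape)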

Next, modify the original cost function and gradient function to include regularization.


Regularized cost function

$J(\theta)=\frac{1}{m}\sum_{i=1}^{m}[-y^{(i)}\log(h_{\theta}(x^{(i)}))-(1-y^{(i)})\log(1-h_{\theta}(x^{(i)}))]+\frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^2$

def costReg(Theta,X,Y,LearningRate):  # X, Y and Theta are passed as arrays
    # Convert to matrices
    Theta=np.matrix(Theta)
    X=np.matrix(X)
    Y=np.matrix(Y)
    first=np.multiply(-Y,np.log(sigmoid(X@Theta.T)))
    second=np.multiply(1-Y,np.log(1-sigmoid(X@Theta.T)))
    # regularization term (θ0 is not penalized):
    reg=(LearningRate/(2*len(X)))*np.sum(np.power(Theta[:,1:Theta.shape[1]],2))
    return np.sum(first-second)/len(X)+reg

If gradient descent is used to minimize the cost function, then since $\theta_0$ is not regularized, the update rule splits into two cases:


  • $\theta_0:=\theta_0-\alpha\frac{1}{m}\sum_{i=1}^{m}[h_{\theta}(x^{(i)})-y^{(i)}]x_0^{(i)}$

  • $\theta_j:=\theta_j-\alpha\left[\frac{1}{m}\sum_{i=1}^{m}[h_{\theta}(x^{(i)})-y^{(i)}]x_j^{(i)}+\frac{\lambda}{m}\theta_j\right]=\theta_j(1-\alpha\frac{\lambda}{m})-\alpha\frac{1}{m}\sum_{i=1}^{m}[h_{\theta}(x^{(i)})-y^{(i)}]x_j^{(i)}$   (the expression in brackets after $\alpha$ is the partial derivative that gradientReg computes)
def gradientReg(Theta,X,Y,LearningRate):  # X, Y and Theta are passed as arrays
    # Convert X,Y,Theta from arrays to matrices
    Theta=np.matrix(Theta)
    X=np.matrix(X)
    Y=np.matrix(Y)

    # grad records the partial derivative for each element of the θ vector
    Theta_cnt=Theta.shape[1]
    grad=np.zeros(Theta.shape[1])

    error=sigmoid(X*Theta.T)-Y
    for i in range(Theta_cnt):
        tmp=np.multiply(error,X[:,i])
        grad[i]=np.sum(tmp)/len(X)
    reg=(LearningRate/len(X))*Theta
    reg[0]=0  # no regularization, no penalty for the 0th term
    return grad+reg

    # I wonder why the accuracy of the version below is only 83.05% while the version above gives 84.75%
    # Calculate the error vector
    # error=sigmoid(X*Theta.T)-Y
    # for i in range(Theta_cnt):
    #     tmp=np.multiply(error,X[:,i])
    #     if i==0:
    #         grad[i]=np.sum(tmp)/len(X)
    #     else:
    #         grad[i]=np.sum(tmp)/len(X)+(LearningRate/len(X))*Theta[:,i]
    # return grad
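A note on the question in the comment above: because Theta was converted to a 1×n np.matrix, reg is also a 1×n matrix, so reg[0]=0 zeroes the entire first row (that is, every entry), and the returned gradient carries no regularization term at all. The commented-out version does penalize every θj except θ0, which is why the two give different accuracies (84.75% vs 83.05%). For reference, a vectorized sketch that exempts only θ0 (an addition, not the code used for the results below):

def gradientReg_vec(Theta,X,Y,LearningRate):     # hypothetical alternative; arrays in, 1-D array out
    Theta=np.matrix(Theta)
    X=np.matrix(X)
    Y=np.matrix(Y)
    error=sigmoid(X*Theta.T)-Y                   # (m,1) error vector
    grad=np.array((X.T*error).T/len(X)).ravel()  # unregularized part
    reg=(LearningRate/len(X))*np.array(Theta).ravel()   # (λ/m)·θ
    reg[0]=0                                     # reg is 1-D here, so only θ0's penalty is zeroed
    return grad+reg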

Initialize a variable:

cols=data2.shape[1]
# data2 column order: Accepted, Ones (x0), F10, F11, ...
Y2=data2.iloc[:,0:1]
X2=data2.iloc[:,1:cols]

# convert the DataFrames to arrays
X2=np.array(X2.values)
Y2=np.array(Y2.values)
Theta2=np.zeros(X2.shape[1])

Set LearningRate (which plays the role of the regularization parameter λ here) to a reasonable initial value

LearningRate=1

Call the new regularized functions with θ initialized to all zeros to make sure they compute correctly.

costReg(Theta2, X2, Y2, LearningRate)
0.6931471805599454
gradientReg(Theta2, X2, Y2, LearningRate)
matrix([[8.47457627e-03, 1.87880932e-02, 7.77711864e-05, 5.03446395e-02, 1.15013308e-02, 3.76648474e-02, 1.83559872e-02, 7.32393391e-03, 8.19244468e-03, 2.34764889e-02, 3.93486234e-02, 2.23923907e-03, 1.28600503e-02, 3.09593720e-03, 3.93028171e-02, 1.99707467e-02, 4.32983232e-03, 3.38643902e-03, 5.83822078e-03, 4.47629067e-03, 3.10079849e-02, 3.10312442e-02, 1.09740238e-03, 6.31570797e-03, 4.08503006e-04, 7.26504316e-03, 1.37646175e-03, 3.87936363e-02]])

Use the same optimization function as in Part 1 to calculate the optimized result:

result2=opt.fmin_tnc(func=costReg, x0=Theta2,fprime=gradientReg,args=(X2,Y2,LearningRate))
result2
(array([1.60695456, 1.1560186, 1.96230284, -3.0506508, -1.65702971, -1.91905201, 0.57020964, -0.68153388, -0.71446988, 0.04581342, -2.05403849, -0.19543701, -1.06002879, -0.50146813, -1.49394535, 0.08870346, -0.37553871, -0.1621286, -0.47670397, -0.49928213, -0.25753424, -1.25322562, 0.00804809, -0.51945916, -0.03978315, -0.54273819, 0.21843762, 0.93050987]), 86, 4)

Finally, the prediction function in Part 1 was used to check the accuracy of the scheme on the training data:

#result2[0] is the learned θ
Theta_min = result2[0]
predictions = predict(Theta_min, X2)
correct = [1 if a==b else 0 for (a, b) in zip(predictions,Y2)]
accuracy = float((sum(correct) / len(correct))*100)
print ("accuracy = {:.2f}%".format(accuracy))
accuracy = 84.75%

You can also use the advanced Python library Scikit-learn to solve this problem:

from sklearn import linear_model  # use sklearn's linear model package
model = linear_model.LogisticRegression(penalty='l2', C=1.0)  # C: regularization coefficient, float, default 1.0; it is the inverse of the regularization strength and must be a positive float - the smaller it is, the stronger the regularization
model.fit(X2, Y2.ravel())
LogisticRegression()
model.score(X2, Y2)
0.8305084745762712

The accuracy here is not ideal; perhaps the parameters need adjusting. The highest degree used when creating the polynomial features also affects the result, and I suspect a problem in my feature-mapping code (x2 appearing in terms where x1 should stand alone) contributes to the low accuracy.
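Since C is the inverse of the regularization strength, one easy adjustment is to sweep it and compare training scores; a quick sketch (an addition, scores depend on the data and the solver):

# hypothetical check: smaller C means stronger regularization, larger C means weaker
for c in [0.01, 1.0, 100.0]:
    m=linear_model.LogisticRegression(penalty='l2', C=c, max_iter=1000)
    m.fit(X2, Y2.ravel())
    print("C = {}: training score = {:.4f}".format(c, m.score(X2, Y2.ravel())))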

Draw the decision boundary

def hfun2(theta,x1,x2,degree):
    temp=theta[0][0]
    place=0
    for i in range(1,degree+1):
        for j in range(0,i+1):
            temp+=np.power(x1,i-j)*np.power(x2,j)*theta[0][place+1]
            place+=1
    return temp
def find_decision_boundary(theta,degree):
    t1 = np.linspace(-1,1.5,1000)
    t2 = np.linspace(-1,1.5,1000)
    cord=[(x,y) for x in t1 for y in t2]
    # print(cord)
    x_cord,y_cord=zip(*cord)
    h_val=pd.DataFrame({'x1':x_cord,'x2':y_cord})
    h_val['hval']=hfun2(theta, h_val['x1'], h_val['x2'], degree)
    decision=h_val[np.abs(h_val['hval'])<2*10**-3]
    return decision.x1,decision.x2
fig,ax=plt.subplots(figsize=(8,6))
x,y=find_decision_boundary(result2,6)
ax.scatter(x, y, c='y', s=10, label='Prediction')
ax.scatter(positive["Test 1"],positive["Test 2"],s=20,color="blue",marker="o",label="Accepted")
ax.scatter(negative["Test 1"],negative["Test 2"],s=20,color="red",marker="x",label="Rejected")
ax.set_xlabel("Test 1 Score")
ax.set_ylabel("Test 2 Score")
ax.legend(loc=1)

plt.show()



Change $\lambda$ and observe the decision boundary

$\lambda=0$ (overfitting)

LearningRate=0
result3=opt.fmin_tnc(func=costReg, x0=Theta2,fprime=gradientReg,args=(X2,Y2,LearningRate))
result3
(array([9.11192364e+00, 1.18840465e+01, 6.30828094e+00, -8.397064e+01, -4.48639810e+01, -3.81221435e+01, -9.42525756e+01, -8.14257602e+01, -4.22413355e+01, -3.52968361e+00, 2.95734207e+02, 2.51308760e+02, 3.64155830e+02, 1.61036970e+02, 1.70100234e+01, 1.71716716e+02, 2.72109672e+02, 3.12447535e+02, 1.41764016e+02, 3.22495698e+01, -1.75836912e-01, -3.58663811e+02, -4.82161916e+02, -7.49974915e+02, -5.03764307e+02, -4.80978435e+02, -1.85566236e+02, 3.83936243e+01]), 280, 3)
fig,ax=plt.subplots(figsize=(8,6))
x,y=find_decision_boundary(result3,6)
ax.scatter(x, y, c='y', s=10, label='Prediction')
ax.scatter(positive["Test 1"],positive["Test 2"],s=20,color="blue",marker="o",label="Accepted")
ax.scatter(negative["Test 1"],negative["Test 2"],s=20,color="red",marker="x",label="Rejected")
ax.set_xlabel("Test 1 Score")
ax.set_ylabel("Test 2 Score")
ax.legend(loc=1)

plt.show()



$\lambda=100$ (underfitting)

LearningRate=100
result4=opt.fmin_tnc(func=costReg, x0=Theta2,fprime=gradientReg,args=(X2,Y2,LearningRate))
result4
(array([0.05021733, 0.03612558, 0.06132196, -0.09533284, -0.05178218, -0.05997038, 0.01781905, -0.02129793, -0.02232718, 0.00143167, -0.0641887, -0.00610741, -0.0331259, -0.01567088, -0.04668579, 0.00277198, -0.01173558, -0.00506652, -0.014897, -0.01560257, -0.00804795, -0.0391633, 0.0002515, -0.0162331, -0.00124322, -0.01696057, 0.00682618, 0.02907843]), 93, 4)
fig,ax=plt.subplots(figsize=(8,6))
x,y=find_decision_boundary(result4,6)
ax.scatter(x, y, c='y', s=10, label='Prediction')
ax.scatter(positive["Test 1"],positive["Test 2"],s=20,color="blue",marker="o",label="Accepted")
ax.scatter(negative["Test 1"],negative["Test 2"],s=20,color="red",marker="x",label="Rejected")
ax.set_xlabel("Test 1 Score")
ax.set_ylabel("Test 2 Score")
ax.legend(loc=1)

plt.show()