
1. Introduction

A decision stump, also called a single-layer decision tree, is the simplest possible decision tree. In the second installment of this series, we covered the principle of decision trees. Now we will build a single-layer decision tree, which makes its decision based on a single feature. Since the tree splits only once, it is essentially a stump.
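Concretely, a stump is nothing more than a single threshold rule on one feature. A minimal sketch of the idea (the feature index, threshold, and label assignment here are hypothetical illustrations, not part of the code we build below):

```python
# Hypothetical one-split rule: everything at or below the threshold
# on one feature goes to class -1, everything above it to class +1
def stump_rule(sample, feature=0, threshold=1.5):
    return -1 if sample[feature] <= threshold else 1
```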

2. Build a simple data set

Let’s start by building a simple data set to make sure our function works.

```python
import numpy as np
import pandas as pd

# Read the data set and split it into a feature matrix and a label matrix
def get_Mat(path):
    dataSet = pd.read_table(path, header=None)
    xMat = np.mat(dataSet.iloc[:, :-1].values)
    yMat = np.mat(dataSet.iloc[:, -1].values).T
    return xMat, yMat
```
```python
xMat, yMat = get_Mat('simpdata.txt')
xMat
```
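To follow along you need a small tab-separated file with numeric features in the leading columns and a ±1 label in the last column. The classic five-point sample from Machine Learning in Action fits this code; an assumed simpdata.txt might look like:

```
1.0	2.1	1.0
2.0	1.1	1.0
1.3	1.0	-1.0
1.0	1.0	-1.0
2.0	1.0	1.0
```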

```python
yMat
```

```python
import matplotlib.pyplot as plt
plt.rcParams['font.sans-serif'] = ['SimHei']  # display Chinese labels correctly
%matplotlib inline

# Visualize the data set, coloring each point by its class label
def showPlot(xMat, yMat):
    x = np.array(xMat[:, 0])
    y = np.array(xMat[:, 1])
    label = np.array(yMat)
    plt.scatter(x, y, c=label)
    plt.title('Single-layer decision tree test data')
    plt.show()
```
```python
showPlot(xMat, yMat)
```

3. Construct a single-layer decision tree

We will build two functions to implement our single-layer decision tree. The first tests whether a value is below or above the threshold we are testing. The second is slightly more complex: it loops over a weighted data set and finds the stump with the lowest weighted error rate. The pseudocode is as follows:

  • Set the minimum error rate minE to +∞
  • For each feature in the data set (first loop):
    • For each step size (second loop):
      • For each inequality sign (third loop):
        • Build a single-layer decision tree and make predictions on the weighted data set
        • If the error rate is lower than minE, set the current tree as the best single-layer decision tree
  • Return the best single-layer decision tree
```python
# Classify the samples by comparing feature column i against threshold Q,
# using inequality sign S ('lt' or 'gt')
def Classify0(xMat, i, Q, S):
    re = np.ones((xMat.shape[0], 1))  # initialize all predictions to 1
    if S == 'lt':
        re[xMat[:, i] <= Q] = -1      # at or below the threshold: assign -1
    else:
        re[xMat[:, i] > Q] = -1       # above the threshold: assign -1
    return re
```
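A quick sanity check (the feature index and threshold here are arbitrary picks for illustration):

```python
# Label every sample by thresholding feature 0 at 1.5:
# values <= 1.5 are assigned -1, the rest stay +1
Classify0(xMat, 0, 1.5, 'lt')
```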
```python
# Find the single-layer decision tree with the lowest weighted error rate
def get_Stump(xMat, yMat, D):
    m, n = xMat.shape                           # m: number of samples, n: number of features
    Steps = 10                                  # number of steps across each feature's range
    bestStump = {}                              # store the best stump's information as a dictionary
    bestClas = np.mat(np.zeros((m, 1)))         # initialize the best classification result
    minE = np.inf                               # initialize the minimum error rate to +inf
    for i in range(n):                          # first loop: over every feature
        Min = xMat[:, i].min()                  # minimum value of this feature
        Max = xMat[:, i].max()                  # maximum value of this feature
        stepSize = (Max - Min) / Steps          # step size for this feature
        for j in range(-1, int(Steps) + 1):     # second loop: over every step
            for S in ['lt', 'gt']:              # third loop: lt (less than), gt (greater than)
                Q = Min + j * stepSize          # compute the threshold
                re = Classify0(xMat, i, Q, S)   # predict with this candidate stump
                err = np.mat(np.ones((m, 1)))   # initialize the error vector to 1
                err[re == yMat] = 0             # correct predictions get error 0
                eca = D.T * err                 # weighted error rate
                print(f'split: feature {i}, threshold: {np.round(Q, 2)}, sign: {S}, weighted error: {np.round(eca, 3)}')
                if eca < minE:                  # keep the stump with the lowest weighted error
                    minE = eca
                    bestClas = re.copy()
                    bestStump['feature'] = i
                    bestStump['threshold'] = Q
                    bestStump['sign'] = S
    return bestStump, minE, bestClas
```
```python
m = xMat.shape[0]
D = np.mat(np.ones((m, 1)) / m)  # initialize the sample weights: each sample gets weight 1/m
```
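With the uniform weight vector in place, we can run the search (a usage sketch; the result variable names are just illustrative):

```python
bestStump, minE, bestClas = get_Stump(xMat, yMat, D)
bestStump  # the chosen feature column, threshold, and inequality sign
```

Since the function prints every candidate split, running it on a small data set lets you trace exactly how the weighted error changes as the threshold sweeps across each feature.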