Let’s start by looking at an example of a decision tree model.

This is clearly a decision tree model for screening job resumes during hiring.

First and foremost, we can see that it is a tree.

Second, it is a decision tree made up of several layers of judgments.

To analyze it, let’s assume it is a resume screening model for front-end development engineers.

First, suppose the applicant is a back-end developer who has taught himself front-end development and is strong at front-end work. His skills list covers both front end and back end, but he has no actual front-end work experience, and he owns a flashy, cool-looking blog that doesn’t work.

Following the tree, the model says yes on professional skills, no on experience, and then rejects the resume. In practice, however, if the person’s expected salary and educational background both fall within the company’s required range, the company might well give him an interview.

That brings us to the first question: should the decision tree be constrained so that a single "no" cannot veto a resume on its own?

Next, suppose a really good AI engineer’s resume is fed into the front-end engineer model.

His skills are strong across the board: front end, back end, AI, crawlers. But he doesn’t want to be a front-end engineer at all; his job intention is to design clothes. The model, however, works its way down the tree, decides he is a good fit, and sends him an interview invitation that he will never accept.

This is the second question: should the tree add a feature that checks job intention, and if so, where should it go? It should go first, because it is the most important one: if the job intention doesn’t match, nothing else can make the candidate a fit.
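To make the "put it first" argument concrete, here is a minimal sketch of a hand-ordered screen with the job intention check on top. The field names (`desired_role`, `has_frontend_skills`, `has_frontend_experience`) are hypothetical and chosen only for illustration; they do not come from the model pictured above.

```python
def screen_resume(resume):
    """Return True if the candidate should be invited to an interview."""
    # Job intention goes first: if the candidate does not want this role,
    # none of the later checks can change the outcome.
    if resume["desired_role"] != "front-end engineer":
        return False
    # Only candidates who actually want the job reach the remaining checks.
    if not resume["has_frontend_skills"]:
        return False
    return resume["has_frontend_experience"]


# Hypothetical AI engineer: strong skills, but wants to design clothes.
ai_engineer = {
    "desired_role": "fashion designer",
    "has_frontend_skills": True,
    "has_frontend_experience": True,
}
invite = screen_resume(ai_engineer)
```

With the intention check on top, the AI engineer is filtered out before his skills are ever looked at, so no interview invitation is wasted on him.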

That leaves us with two questions:

Does the decision tree need multiple branches?

How important is the order of decisions in a decision tree?

Let’s keep those two questions in mind as we move on.

ID3

What is ID3? It is an algorithm based on information entropy: at each node it selects the attribute with the largest information gain as the splitting attribute, creates one branch per value of that attribute, and repeats until the decision tree is complete.
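The procedure can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation: it assumes categorical attributes stored in plain dictionaries, and the attribute names and toy rows are made up. Entropy and information gain are explained in the sections that follow.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy of the target minus the weighted entropy after the split."""
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([row[target] for row in rows]) - remainder


def id3(rows, attributes, target):
    """Return a leaf label or a nested dict {attribute: {value: subtree}}."""
    labels = [row[target] for row in rows]
    if len(set(labels)) == 1:   # all samples agree: make a leaf
        return labels[0]
    if not attributes:          # nothing left to split on: majority vote
        return Counter(labels).most_common(1)[0][0]
    # Pick the attribute with the largest information gain.
    best = max(attributes, key=lambda a: information_gain(rows, a, target))
    remaining = [a for a in attributes if a != best]
    tree = {best: {}}
    for value in sorted({row[best] for row in rows}):
        subset = [row for row in rows if row[best] == value]
        tree[best][value] = id3(subset, remaining, target)
    return tree


# Hypothetical toy data: does the resume match on salary and on skills?
toy_rows = [
    {"salary": "yes", "skills": "yes", "result": "pass"},
    {"salary": "yes", "skills": "no", "result": "fail"},
    {"salary": "no", "skills": "yes", "result": "fail"},
    {"salary": "no", "skills": "no", "result": "fail"},
    {"salary": "no", "skills": "yes", "result": "fail"},
]
toy_tree = id3(toy_rows, ["salary", "skills"], "result")
```

On this toy data the root of `toy_tree` is `salary`, and only the resumes that match on salary go on to be checked for skills.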

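Information entropy

Hmm, that’s not exactly an everyday word.

Let’s start with entropy. Isn’t entropy used to describe the degree of chaos?

Information entropy, then, measures how uncertain the outcome is when a sample is picked at random from the data: the more evenly the classes are mixed, the higher the entropy, and the harder it is for the model to predict the result.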

The definition and formula are given below.
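For reference, this is the standard definition of information (Shannon) entropy. The symbols are my own notation: $D$ is the data set, $K$ is the number of classes, and $p_k$ is the proportion of samples in $D$ that belong to class $k$.

```latex
H(D) = -\sum_{k=1}^{K} p_k \log_2 p_k
```

When every sample belongs to the same class, $H(D) = 0$. With two classes split half and half, $H(D) = 1$ bit, the maximum for two classes.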

Information gain
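Stated as a formula, the information gain of an attribute $A$ is the entropy of the data set before the split minus the weighted entropy of the subsets after the split. The notation is again my own: $H$ is the information entropy, $D$ is the data set, and $D_v$ is the subset of $D$ in which attribute $A$ takes the value $v$.

```latex
\mathrm{Gain}(D, A) = H(D) - \sum_{v \in \mathrm{Values}(A)} \frac{|D_v|}{|D|}\, H(D_v)
```

The larger the gain, the more the split on $A$ reduces the uncertainty about the final result.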

For example:

Let’s say we have a table of resume match results for job applicants.

First, let’s calculate the entropy.
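As a rough sketch of this step, the entropy of the final results can be computed as follows. The labels are made up for illustration (two resumes pass, six do not) and are not the values from the table above.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * log2(n / total) for n in counts.values())


# Hypothetical screening results: 2 passes and 6 rejections.
results = ["pass"] * 2 + ["fail"] * 6
base_entropy = entropy(results)  # about 0.811 bits
```

A list in which every resume has the same result has an entropy of 0, and an even pass/fail split has an entropy of exactly 1 bit.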

Then calculate the information gain of one feature from the data.

Then calculate the corresponding information gain for each of the other features.

Once all the calculations are done, the result is that it is best to judge the salary first.
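The comparison can be sketched like this. Every row in `RESUMES` is hypothetical: I made the data up so that most applicants match on skills and education but not on salary, which is the situation described in this example. The column names are assumptions as well.

```python
from collections import Counter, defaultdict
from math import log2

# Made-up screening records: "yes"/"no" says whether the resume matches
# the job on that feature; "result" is the final screening outcome.
RESUMES = [
    {"skills": "yes", "education": "yes", "salary": "yes", "result": "pass"},
    {"skills": "yes", "education": "yes", "salary": "no", "result": "fail"},
    {"skills": "yes", "education": "yes", "salary": "no", "result": "fail"},
    {"skills": "yes", "education": "yes", "salary": "no", "result": "fail"},
    {"skills": "yes", "education": "no", "salary": "yes", "result": "fail"},
    {"skills": "no", "education": "yes", "salary": "no", "result": "fail"},
    {"skills": "yes", "education": "yes", "salary": "yes", "result": "pass"},
    {"skills": "yes", "education": "yes", "salary": "no", "result": "fail"},
]


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target="result"):
    """Entropy before the split minus the weighted entropy after it."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[attribute]].append(row[target])
    weighted = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([row[target] for row in rows]) - weighted


# Information gain of every candidate feature, then the one to test first.
gains = {a: information_gain(RESUMES, a) for a in ("skills", "education", "salary")}
first_check = max(gains, key=gains.get)
```

On this made-up data, `salary` has by far the largest gain (about 0.47 bits against about 0.06 for the other two features), so `first_check` is `"salary"`.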

My own summary

The question behind the model is which judgment to make first when building the decision tree. Programmatically speaking, you want the tree to make as few judgments as possible. So first look at how many different outcomes there are; in this model, the final result is either pass or no pass. From the data we know that most people who send in resumes match on level, education, and skills but not on salary, so it is optimal for the model to check salary first and reject the resumes that don’t fit.