Currently, there are three common ensemble learning frameworks: Bagging, Boosting, and Stacking.

1/ The Bagging ensemble learning idea

Bagging draws samples from the full training set to build a separate sub-training set (a bootstrap sample) for each base model, then trains each base model independently. The predictions of all the base models are combined into a comprehensive judgment (the voting idea: the minority follows the majority), which produces the final prediction. The Random Forest algorithm is a typical algorithm of the Bagging ensemble learning framework. Each decision tree in the random forest is an expert in a narrow field; each expert gives a prediction, and the results of all the experts are combined into a final result. Any single expert can be wrong, but the probability that all the experts are wrong at the same time is very small, and this is the core idea of a random forest. The KNN algorithm relies on the same voting idea: the labels of the k nearest neighbors are combined by majority vote.
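As a concrete illustration, here is a minimal sketch of the Bagging/voting idea using scikit-learn's RandomForestClassifier (the synthetic dataset from make_classification and all hyperparameter values are illustrative assumptions, not part of the original text):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic binary classification data (illustrative only).
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Each tree is trained on a bootstrap sample of the training set;
# the forest predicts by majority vote over all the trees.
forest = RandomForestClassifier(n_estimators=100, random_state=42)
forest.fit(X_train, y_train)
print("test accuracy:", forest.score(X_test, y_test))
```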

2/ Boosting

The training process is staged: the base models are trained one by one (sequentially; unlike Bagging, they cannot be trained in parallel, because each model depends on the ones before it), and the training set of each base model is transformed according to a certain strategy at every step. A linear combination of the predictions of all the base models produces the final prediction. In other words, the data is fitted multiple times, each round fitting the error left by the previous rounds, and all the fitted results are finally added up. GBDT, XGBoost and AdaBoost are all typical Boosting ensemble learning algorithms.
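To make the "fit the previous error" step concrete, here is a minimal hand-rolled sketch of the boosting idea for regression (a simplified gradient boosting with squared loss; the synthetic sine data, tree depth, number of rounds and learning rate are all illustrative assumptions):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.RandomState(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=200)

n_rounds, learning_rate = 50, 0.1
prediction = np.zeros_like(y)  # the ensemble starts from a zero prediction
trees = []

for _ in range(n_rounds):
    residual = y - prediction                  # error left by previous rounds
    tree = DecisionTreeRegressor(max_depth=2)
    tree.fit(X, residual)                      # each round fits the previous error
    prediction += learning_rate * tree.predict(X)  # add up the fitted results
    trees.append(tree)

print("final training MSE:", np.mean((y - prediction) ** 2))
```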

3/ Stacking

All the trained base models are used to predict on the training set: the prediction of the j-th base model on the i-th training sample becomes the j-th feature of the i-th sample in a new training set. A meta-model is then trained on this new training set. Likewise, during prediction, all the base models first predict on the test data to form a new test set, and the meta-model then makes the final prediction on that new test set.
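Here is a minimal sketch of this two-level scheme using scikit-learn's StackingClassifier (the choice of base models, meta-model and dataset is an illustrative assumption; with cv=5 the meta-features are out-of-fold predictions rather than in-sample ones):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# The predictions of each base model become the features of the new
# training set; the final_estimator is then trained on those features.
stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(random_state=0)),
                ("knn", KNeighborsClassifier())],
    final_estimator=LogisticRegression(),
    cv=5,  # out-of-fold predictions avoid leaking the training labels
)
stack.fit(X_train, y_train)
print("test accuracy:", stack.score(X_test, y_test))
```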

With these basic concepts in mind, intuition tells us that since prediction no longer depends on a single model, the ensemble can "pool the wisdom" of its members, which makes it less likely to overfit and more accurate in its predictions.