This is the 24th day of my participation in the First Challenge 2022

What is ensemble learning? How is it done? How does it differ from other learning methods? What are its typical application scenarios? Before going further, I want to answer these questions about ensemble learning.

Ensemble learning

What is ensemble learning?

Ensemble learning combines multiple trained models to reach a final conclusion. Each model can learn different aspects of the data, so combining several models usually gives better results than any single one.

How is it done?

Ensemble learning falls into two categories:

  1. Sequential ensembles – learners are generated one after another – each depends on the previous ones – misclassified samples are given higher weights so the overall prediction improves
  2. Parallel ensembles – learners are generated in parallel – they are independent of each other – averaging their outputs reduces the error

Common methods

Bagging

Bagging principle

Sampling with replacement (bootstrap) – K models vote to produce the classification result – the models can be trained in parallel
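A minimal sketch of this idea using scikit-learn's BaggingClassifier (the toy dataset and parameter values are placeholders, not from the original post):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split

# Toy dataset standing in for real data
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# K models, each trained on a bootstrap sample (sampling with replacement);
# training runs in parallel and predictions are combined by voting.
# The default base estimator here is a decision tree.
bagging = BaggingClassifier(
    n_estimators=10,   # K
    bootstrap=True,    # sample with replacement
    n_jobs=-1,         # train the K models in parallel
    random_state=42,
)
bagging.fit(X_train, y_train)
print("bagging accuracy:", bagging.score(X_test, y_test))
```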

Bagging application

Random forest = Bagging + decision tree
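As a quick illustration (toy data, placeholder parameters), random forest is Bagging over decision trees plus a random subset of features considered at each split:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# Bagging over decision trees, with random feature selection at each split
rf = RandomForestClassifier(n_estimators=100, max_features="sqrt", random_state=0)
print("random forest CV accuracy:", cross_val_score(rf, X, y, cv=5).mean())
```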

Boosting

Weak classifiers are combined into a strong classifier – the learners can only be generated sequentially – two questions have to be answered:

  1. How are the sample weights (the data distribution) adjusted in each round?
  2. How are the weak learners combined? – an additive model [AdaBoost (equal weights at first, larger weights given to misclassified samples after each round), GBDT (each new learner fits the residual left by the previous ones)] – a minimal sketch of the residual-fitting idea follows this list
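Here is a hand-rolled sketch of the GBDT idea on a toy regression problem (squared loss only; an illustration of residual fitting, not a full GBDT implementation):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Toy regression data
rng = np.random.RandomState(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.randn(200)

learning_rate = 0.1
trees = []
pred = np.zeros_like(y)
for _ in range(100):
    residual = y - pred                      # what the ensemble still gets wrong
    tree = DecisionTreeRegressor(max_depth=2)
    tree.fit(X, residual)                    # each new tree fits the residual
    pred += learning_rate * tree.predict(X)  # add its (shrunk) prediction
    trees.append(tree)

print("final training MSE:", np.mean((y - pred) ** 2))
```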
Boosting application

AdaBoost + decision tree = boosting tree; Gradient Boosting + decision tree = GBDT; XGBoost is an improved version of GBDT
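For reference, a minimal scikit-learn sketch of both applications (toy data, placeholder parameters):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, n_features=20, random_state=1)

# AdaBoost over shallow decision trees (a boosting tree)
ada = AdaBoostClassifier(n_estimators=100, random_state=1)
# Gradient boosting over decision trees (GBDT)
gbdt = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1, random_state=1)

for name, model in [("AdaBoost", ada), ("GBDT", gbdt)]:
    print(name, "CV accuracy:", cross_val_score(model, X, y, cv=5).mean())
```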

Stacking

Stacking trains multiple models and uses their outputs as the input of another model. It mainly involves two layers: the first layer trains several (possibly complex) models on the data; the second layer takes the outputs of the first layer as features and trains a model on them to produce the final result.
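A minimal sketch of the two layers (the base models, meta-model, and parameters are illustrative choices, not from the original post); out-of-fold predictions are used so the second layer does not train on leaked labels:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=1000, n_features=20, random_state=2)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=2)

# Layer 1: train (possibly complex) models; collect out-of-fold predictions
base_models = [RandomForestClassifier(random_state=2), SVC(probability=True, random_state=2)]
train_meta = np.column_stack([
    cross_val_predict(m, X_train, y_train, cv=5, method="predict_proba")[:, 1]
    for m in base_models
])

# Layer 2: a simple meta-model trained on the first layer's outputs
meta = LogisticRegression().fit(train_meta, y_train)

# At prediction time, refit the base models on all training data
test_meta = np.column_stack([
    m.fit(X_train, y_train).predict_proba(X_test)[:, 1] for m in base_models
])
print("stacking accuracy:", meta.score(test_meta, y_test))
```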

Ensemble learning interview questions

  1. What is an ensemble learning algorithm?

Combine multiple weak classifiers into a strong classifier.

  2. What are the main frameworks of ensemble learning?

Bagging, Boosting, Stacking

  3. Briefly introduce Bagging. What are the common Bagging algorithms?

Repeated sampling with replacement – each model gets an equal weight – the group votes on the result – a typical algorithm is random forest

  4. Briefly introduce Boosting. What are the common Boosting algorithms?

Classifiers are trained sequentially (not in parallel) – the input of each classifier depends on the residual of the previous one – common algorithms are AdaBoost, GBDT, and XGBoost
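If the xgboost package is available, a quick sketch of using it (toy data, placeholder parameters):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier  # assumes the xgboost package is installed

X, y = make_classification(n_samples=1000, n_features=20, random_state=3)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=3)

# XGBoost: a regularized, engineering-optimized take on GBDT
model = XGBClassifier(n_estimators=200, learning_rate=0.1, max_depth=3)
model.fit(X_train, y_train)
print("XGBoost accuracy:", model.score(X_test, y_test))
```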

  5. What is the mathematical expression of the Boosting idea?

A linear combination of basis functions (an additive model)
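Written out, the additive model is

$$
f(x) = \sum_{m=1}^{M} \beta_m \, b(x;\, \gamma_m)
$$

where $b(x;\gamma_m)$ is the $m$-th basis function (weak learner) with parameters $\gamma_m$ and $\beta_m$ is its coefficient; Boosting fits these terms one at a time (forward stagewise additive modeling).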

  6. What are the commonly used Stacking algorithms?

Resample the data and train several base learners such as KNN, random forest, and Naive Bayes – use their outputs as the input features of the next layer – combine the predictions with logistic regression
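A sketch of that combination using scikit-learn's StackingClassifier (toy data, placeholder parameters):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=4)

# First layer: KNN, random forest, Naive Bayes;
# second layer: logistic regression combines their predictions.
stack = StackingClassifier(
    estimators=[
        ("knn", KNeighborsClassifier()),
        ("rf", RandomForestClassifier(random_state=4)),
        ("nb", GaussianNB()),
    ],
    final_estimator=LogisticRegression(),
    cv=5,
)
print("stacking CV accuracy:", cross_val_score(stack, X, y, cv=5).mean())
```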

  7. You find that your model suffers from low bias and high variance. Which algorithm should you use to solve the problem, and why?

Low bias – the predicted values are close to the actual values – the model is flexible enough but generalizes poorly – Bagging tackles the high variance – alternatively, use regularization or keep only the top N features (based on a variable-importance chart)

  8. What are the common base classifiers?

Decision trees – sample weights can be applied by adjusting them directly rather than by re-sampling – randomness can be introduced easily, which suits ensembles well

  9. Can the decision trees used as base classifiers in a random forest be replaced by a linear classifier or K-nearest neighbors? Please explain why.

No – classifiers that are sensitive to the training samples (unstable, high-variance learners) are better suited to Bagging – linear classifiers and K-nearest neighbors are more stable classifiers with small variance, so Bagging brings little benefit – because of Bagging's sampling they may even be harder to fit during training, which increases the bias