In the daily workflow of developing machine learning models at my company, I often see people who can't wait to start calling APIs as soon as they get the data, writing code in TensorFlow or PyTorch, only to find, after hundreds of lines have been written and run, that the accuracy is less than half of the target.

It's very common for developers to assume that implementing a model is just a matter of writing the model code first and then wiring up the data handling afterwards.

Sometimes they even get it right, which gives them confidence; sometimes I tell them I don't like that way of working. The problem is that with so little mining and analysis of the data itself, model development mostly turns into metaphysics. (There is some metaphysics in it, but most of it follows logic you can trace.)

If we want to improve a model's accuracy, the options generally fall into the following categories:

1. Increase the model scale

This effect is most obvious in ensemble learning, where weak models are combined into a strong one. Decision trees and random forests are the classic example: combining many individually less accurate classifiers produces a more accurate result, on the principle that two heads are better than one. Neural networks show the same behavior, because a sufficiently large network can approximate almost any function of the data, so adding neurons lets it fit the training set more closely. The price is higher variance, but that increase can be offset with regularization.
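A minimal sketch of the "bigger model, better fit" idea, using scikit-learn on a synthetic data set (the data set, split, and forest sizes here are purely illustrative): a single decision tree against random forests of growing size.

```python
# Compare one decision tree to random forests of increasing size on the same split.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=20, n_informative=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
print("single tree:", tree.score(X_test, y_test))

for n in (10, 100, 500):  # scale the ensemble up and watch test accuracy
    forest = RandomForestClassifier(n_estimators=n, random_state=0).fit(X_train, y_train)
    print(f"forest with {n} trees:", forest.score(X_test, y_test))
```

The gain usually flattens out at some ensemble size, which is exactly the point where scaling alone stops paying off.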

2. Modify the model architecture

For ensemble learning, the main strategies are Bagging, Boosting, and Stacking. When tree models are combined with these methods, different frameworks treat the data set in different ways. Broadly speaking, Bagging mainly reduces variance and only slightly reduces bias, while Boosting targets bias directly and reduces it much more aggressively, which is why Boosting usually reaches higher accuracy.
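A rough comparison sketch in scikit-learn: the same kind of base learner wrapped by Bagging, Boosting, and Stacking. The data set and hyperparameters are made up for illustration; the point is only how the three wrappers are set up.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (BaggingClassifier, GradientBoostingClassifier,
                              StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=20, n_informative=8, random_state=0)

models = {
    # Bagging: many trees trained on bootstrap samples, votes averaged
    "bagging": BaggingClassifier(DecisionTreeClassifier(), n_estimators=100, random_state=0),
    # Boosting: trees fitted sequentially, each correcting the previous errors
    "boosting": GradientBoostingClassifier(n_estimators=100, random_state=0),
    # Stacking: heterogeneous base models combined by a meta-learner
    "stacking": StackingClassifier(
        estimators=[("dt", DecisionTreeClassifier(max_depth=5)),
                    ("lr", LogisticRegression(max_iter=1000))],
        final_estimator=LogisticRegression(max_iter=1000)),
}

for name, model in models.items():
    print(name, cross_val_score(model, X, y, cv=5).mean())
```

Which wrapper wins depends on the data; the cross-validation loop is there precisely so the choice is measured rather than assumed.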

In neural networks, the choice of architecture has an even bigger impact on the results, so it should be adjusted and adapted to the specific data set.
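A PyTorch sketch of what "adapting the architecture" means in practice: two hypothetical networks for the same tabular task, one shallow and one deeper with normalization. The layer sizes are placeholders; which variant works better can only be decided against a validation set.

```python
import torch
from torch import nn

n_features, n_classes = 20, 2

# Variant A: a small, shallow baseline
shallow = nn.Sequential(
    nn.Linear(n_features, 64), nn.ReLU(),
    nn.Linear(64, n_classes),
)

# Variant B: deeper, wider, with batch normalization
deeper = nn.Sequential(
    nn.Linear(n_features, 128), nn.BatchNorm1d(128), nn.ReLU(),
    nn.Linear(128, 128), nn.BatchNorm1d(128), nn.ReLU(),
    nn.Linear(128, n_classes),
)

x = torch.randn(32, n_features)           # a fake batch just to check shapes
print(shallow(x).shape, deeper(x).shape)  # both -> torch.Size([32, 2])
```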

3. Reduce features and add more training data

Adding more training data reduces variance, because more data gives a more stable picture of the underlying distribution, but it does not reduce bias. Reducing features matters because strongly correlated or meaningless features mostly add noise; dropping them noticeably lowers the overall variance of the model.
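A sketch of the feature-reduction side with scikit-learn: the synthetic data set deliberately contains many uninformative columns, and the value of k is an arbitrary choice for illustration, not a recommendation.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# 50 features, only 8 of which actually carry signal
X, y = make_classification(n_samples=1000, n_features=50, n_informative=8,
                           n_redundant=10, random_state=0)

full = LogisticRegression(max_iter=1000)
reduced = make_pipeline(SelectKBest(f_classif, k=15),  # keep the 15 strongest features
                        LogisticRegression(max_iter=1000))

print("all 50 features:", cross_val_score(full, X, y, cv=5).mean())
print("top 15 features:", cross_val_score(reduced, X, y, cv=5).mean())
```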

4. Reduce or remove regularization

Regularization (L1, L2, Dropout) exists to keep the model from overfitting the training set. However, if the training set is too small or the model is not trained for enough iterations, it never reaches a good local minimum. In that case, some of the regularization should be relaxed to improve the fit: bias goes down, at the cost of some added variance.
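A PyTorch sketch of where those knobs live when you decide to dial them back. The dropout rates and weight-decay values below are placeholders; tune them against a validation set rather than copying them.

```python
import torch
from torch import nn

def make_model(dropout: float) -> nn.Sequential:
    # A hypothetical small classifier whose only regularizer is one Dropout layer.
    return nn.Sequential(
        nn.Linear(20, 128), nn.ReLU(), nn.Dropout(dropout),
        nn.Linear(128, 2),
    )

heavy = make_model(dropout=0.5)   # stronger regularization: use when overfitting
light = make_model(dropout=0.1)   # weaker regularization: use when underfitting

# L2 regularization lives in the optimizer as weight_decay; lower it the same way.
opt_heavy = torch.optim.Adam(heavy.parameters(), lr=1e-3, weight_decay=1e-3)
opt_light = torch.optim.Adam(light.parameters(), lr=1e-3, weight_decay=1e-5)
```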