What is zero mean?

Zero mean

In deep learning, we generally preprocess the training images before feeding them to the network, and the most commonly used method is zero-mean centering: shifting the pixel values so that they are centered on 0, for example mapping the range [0, 255] to [-128, 127].


The advantage of this is that it speeds up the convergence of each layer's weight parameters during back propagation.

It avoids zig-zag (z-shaped) weight updates and thereby accelerates the convergence of the neural network. The classic sigmoid activation function is used to illustrate this:


It can be seen that ==sigmoid and its derivative are always greater than 0==. Recall the weight update formula w' = w − r * dw, where r is the learning rate.
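The positivity claim above can be checked numerically. This is a small sketch (the function names `sigmoid` and `sigmoid_grad` are my own, not from the article):

```python
import numpy as np

def sigmoid(x):
    """Classic sigmoid activation: output always lies in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    """Derivative of sigmoid: s * (1 - s), always in (0, 0.25]."""
    s = sigmoid(x)
    return s * (1.0 - s)

x = np.linspace(-10.0, 10.0, 1001)
print(sigmoid(x).min() > 0)       # True: sigmoid output is strictly positive
print(sigmoid_grad(x).min() > 0)  # True: its derivative is strictly positive
```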

For example, in a CV problem the input pixel values lie in (0, 255), so x is all positive. Since sigmoid's output is also always positive, every component of the gradient dw returned on each update shares the same sign: the weights can only all increase together or all decrease together. The update path therefore zig-zags in the direction indicated by the red arrows in the figure, overshooting up one moment and down the next, and the weights converge very slowly.
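This same-sign effect can be demonstrated with a single sigmoid neuron fed all-positive inputs. A minimal sketch (the variable names and the choice of a single neuron are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# One sigmoid neuron y = sigmoid(w . x + b), with all-positive inputs
# standing in for raw pixel values in (0, 255).
x = rng.uniform(0.0, 255.0, size=5)   # all components of x are positive
w = rng.normal(size=5)
b = 0.0

z = w @ x + b
y = 1.0 / (1.0 + np.exp(-z))

# Backprop: dL/dw_i = (dL/dy) * y * (1 - y) * x_i.
# The scalar factor (dL/dy) * y * (1 - y) is shared by every weight,
# and every x_i > 0, so all components of dw carry the SAME sign.
upstream = 1.0                        # some dL/dy arriving from the loss
dw = upstream * y * (1.0 - y) * x

signs = np.sign(dw)
print(np.all(signs == signs[0]))      # True: every weight moves the same way
```

With zero-mean inputs, the x_i would have mixed signs, so the gradient components could differ in sign and the update need not zig-zag.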

Zero mean

Processing method: subtract the mean of all pixels in the training set from each pixel value. For example, if the computed mean of all pixels is 128, then after subtracting 128 the pixel range becomes [-128, 127], i.e. the data now has zero mean.
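The processing method above can be sketched in a few lines of NumPy (the random `train_images` array is a stand-in for a real training set):

```python
import numpy as np

# Hypothetical training set: 100 8-bit grayscale images, pixel values in [0, 255].
rng = np.random.default_rng(0)
train_images = rng.integers(0, 256, size=(100, 28, 28)).astype(np.float64)

# Zero-mean centering: subtract the mean of ALL training-set pixels.
pixel_mean = train_images.mean()          # would be roughly 128 for uniform data
centered = train_images - pixel_mean

print(abs(centered.mean()) < 1e-6)        # True: the centered data has mean 0
```

In practice the mean is computed once on the training set and the same value is subtracted from validation and test images.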

The difference between zero mean and normalization can be visualized by a graph:
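The distinction can also be seen numerically: zero-mean only shifts the data, while normalization (here taken to mean standardization, an assumption on my part) also rescales it. A small sketch:

```python
import numpy as np

data = np.array([0.0, 64.0, 128.0, 192.0, 255.0])

# Zero-mean: only shifts the data; its spread (std) is unchanged.
zero_mean = data - data.mean()

# Standardization: shifts AND rescales to unit standard deviation.
normalized = (data - data.mean()) / data.std()

print(abs(zero_mean.mean()) < 1e-9)       # True: shifted to mean 0
print(zero_mean.std() == data.std())      # True: spread is unchanged
print(abs(normalized.std() - 1.0) < 1e-9) # True: rescaled to unit std
```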

Reference article: blog.csdn.net/qq_41452267…