What is zero mean?

Zero mean

In deep learning, the training images fed to a network are usually preprocessed, and the most commonly used method is zero-mean centering: the pixel value range is shifted from [0, 255] to [-128, 127], so that it is centered on 0.

Role

The advantage of zero-mean centering is that it speeds up the convergence of the weight parameters of each layer during backpropagation.

It avoids zig-zag ("z-type") updates, which accelerates the convergence of the neural network. Sigmoid, the most classic activation function, is used to illustrate this below:

Sigmoid

It can be seen that the output of Sigmoid and its derivative are both always greater than 0. Recall the weight update formula $w' = w - r \cdot dw$, where $r$ is the learning rate and $dw$ is the gradient of the loss with respect to $w$.
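The positivity claim and the update rule can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names (`sigmoid`, `sigmoid_derivative`, `sgd_step`) and the sample points are illustrative assumptions, not taken from the original article.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic sigmoid: 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_derivative(z: float) -> float:
    """Derivative of the sigmoid, written as s(z) * (1 - s(z))."""
    s = sigmoid(z)
    return s * (1.0 - s)


def sgd_step(w: float, dw: float, r: float) -> float:
    """One gradient-descent update: w' = w - r * dw."""
    return w - r * dw


# Sample points chosen for illustration only.
zs = [-6.0, -2.0, 0.0, 2.0, 6.0]
values = [sigmoid(z) for z in zs]
slopes = [sigmoid_derivative(z) for z in zs]

for z, v, s in zip(zs, values, slopes):
    print(f"z={z:+.1f}  sigmoid={v:.4f}  derivative={s:.4f}")
```

At every sample point both the value and the derivative are strictly positive, and the derivative peaks at 0.25 when z = 0.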

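For example, in a computer vision (CV) problem the inputs are pixel values in [0, 255], which are all positive. When the inputs x to a neuron are all positive (or all negative), the gradients of all of that neuron's weights share the same sign, so in a single step the weights can only all increase or all decrease. If the optimum requires some weights to increase while others decrease, the update can only approach it along a zig-zag path (the direction indicated by the red arrow), overshooting upward at one step and downward at the next. As a result, the weights converge very inefficiently.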
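The sign argument can be illustrated with a single sigmoid neuron. This is a sketch under stated assumptions: the squared-error loss, the three pixel values, the weights, and the target are all hypothetical, and 128 is used as the mean only because it matches the article's example.

```python
import math


def neuron_gradient(x, w, target):
    """Gradient of 0.5 * (a - target)^2 w.r.t. each weight, with a = sigmoid(w . x)."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    a = 1.0 / (1.0 + math.exp(-z))
    # Factor shared by every weight: dL/dz = (a - target) * sigmoid'(z).
    upstream = (a - target) * a * (1.0 - a)
    # dL/dw_i = dL/dz * x_i, so the sign of each gradient follows the sign of x_i.
    return [upstream * xi for xi in x]


def signs(values):
    """Map each number to -1, 0, or +1."""
    return [(v > 0) - (v < 0) for v in values]


raw_pixels = [200.0, 50.0, 120.0]                   # hypothetical all-positive inputs
centered_pixels = [p - 128.0 for p in raw_pixels]   # assumed mean of 128
weights = [0.001, -0.002, 0.0015]                   # hypothetical small weights

raw_signs = signs(neuron_gradient(raw_pixels, weights, target=0.0))
centered_signs = signs(neuron_gradient(centered_pixels, weights, target=0.0))

print(raw_signs)       # [1, 1, 1]
print(centered_signs)  # [1, -1, -1]
```

With raw pixels every weight gradient has the same sign, so all weights must move in the same direction in that step. After centering, the inputs have mixed signs and so do the gradients, which lets individual weights move in different directions.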

Zero mean

Processing method: subtract the mean value of all pixels in the training set from the value of each pixel. For example, if the calculated mean of all pixels is 128, the pixel range becomes [-128, 127] after subtracting 128; that is, the mean is zero.
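The processing step can be sketched in a few lines of plain Python. The tiny 2x2 "images" and the helper names `pixel_mean` and `zero_center` are made up for illustration; a real pipeline would more likely use an array library, and the mean would be computed on the training set only and reused for validation and test data.

```python
def pixel_mean(images):
    """Mean over every pixel of every image in the set."""
    total = sum(sum(row) for img in images for row in img)
    count = sum(len(row) for img in images for row in img)
    return total / count


def zero_center(images, mean):
    """Subtract the given mean from each pixel of each image."""
    return [[[p - mean for p in row] for row in img] for img in images]


# Two hypothetical 2x2 grayscale images with values in [0, 255].
train_images = [
    [[0, 64], [128, 255]],
    [[255, 192], [128, 0]],
]

mean = pixel_mean(train_images)            # 127.75 for this toy data
centered = zero_center(train_images, mean)

print(mean)
print(pixel_mean(centered))                # 0.0: the centered data has zero mean
```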

The difference between zero mean and normalization can be visualized by a graph:

Reference article: blog.csdn.net/qq_41452267…
