This article comes from my official account and blog of the same name. Welcome to follow!

Definition


Based on the intuitive example in the last article, we now have a basic feel for convolution, but what is a more rigorous definition?

Mathematically, convolution is just an operation. If you have never studied signal processing or automatic control, you do not need to know all the terminology; just follow along. Essentially, convolution rolls a function of two variables into a function of one variable, affectionately known as a "dimensionality reduction strike".


1. How to roll it?

Think of f and g as functions of two variables x and y, and hold the sum of their arguments fixed, x + y = n. Summing f(x)·g(y) along this line, and then rolling the line across every value of n, gives the convolution

$$(f * g)(n) = \sum_{x + y = n} f(x)\,g(y) = \sum_{x} f(x)\,g(n - x),$$

or, in the continuous case, $(f * g)(n) = \int f(x)\,g(n - x)\,dx$.


2. What’s the use of it?

It can be used for multi-digit multiplication, for example:

$$24 \times 731 = (2 \times 10 + 4)(7 \times 10^2 + 3 \times 10 + 1) = 14 \times 10^3 + 34 \times 10^2 + 14 \times 10 + 4 = 17544$$

Notice that the sequence (14, 34, 14, 4) of coefficients to the right of the second equals sign is exactly the convolution of the sequences (2, 4) and (7, 3, 1). For small numbers this is more trouble than it is worth, but if you want to compute the product of two very, very long numbers, this approach comes in handy, because the convolution can be computed with the fast Fourier transform (FFT), which is much faster than long multiplication.
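As a quick sanity check of that claim, here is a minimal NumPy sketch (the `carry` helper is my own, not from the original article): it convolves the digit sequences of 24 and 731 and then applies carries to recover the product.

```python
import numpy as np

# Digits of 24 and 731, most significant first.
a = [2, 4]
b = [7, 3, 1]

# np.convolve gives the coefficient sequence of the product polynomial:
# (2x + 4)(7x^2 + 3x + 1) -> 14x^3 + 34x^2 + 14x + 4
coeffs = np.convolve(a, b)
print(coeffs)          # [14 34 14  4]

def carry(coeffs, base=10):
    """Turn polynomial coefficients back into an ordinary base-10 number."""
    digits = []
    rem = 0
    for c in reversed(coeffs):           # least significant coefficient first
        rem, d = divmod(c + rem, base)
        digits.append(d)
    while rem:                           # flush any remaining carry
        rem, d = divmod(rem, base)
        digits.append(d)
    return int("".join(map(str, reversed(digits))))

print(carry(coeffs))   # 17544, i.e. 24 * 731
```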

There is a less precise way to see this: write the number in base 10 as a polynomial $\sum_k a_k x^k$ with $x = 10$. The coefficients $a_k$ then play the role of amplitudes of components of different "frequencies", much like the $\exp(ikt)$ terms in a Fourier expansion, and multiplying two numbers convolves their coefficient sequences.


The essence of convolution (involves some derivation; feel free to skip)


The first thing we need to understand is that the inner product, the integral, and the projection are in some sense the same thing.

Define a vector $A = (a_1, a_2, \dots, a_n)$ and another vector $b = (b_1, b_2, \dots, b_n)$.

Then the inner product can be expressed as

$$\langle A, b \rangle = \sum_{i=1}^{n} a_i b_i \;\longrightarrow\; \int a(x)\,b(x)\,dx \quad \text{(in the continuous limit).}$$

See, this is both an inner product and an integral. Projection can be understood the same way: projecting the vector $A$ onto a unit basis vector $b$ gives the coordinate $\langle A, b \rangle$.

This is the same thing as the three-axis projection coordinates of a point in 3D Euclidean space.
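For readers who like to see numbers, here is a small sketch of the "inner product = projection = integral" idea in NumPy; the particular vectors and functions are just illustrative choices of mine, and the integral is a crude Riemann sum.

```python
import numpy as np

A = np.array([3.0, 4.0])
b = np.array([1.0, 0.0])            # a unit basis vector

# The inner product of A with the unit vector b is the coordinate of A
# along b, i.e. the length of A's projection onto b.
coord = np.dot(A, b)
print(coord)                        # 3.0

# In the continuous analogue the sum becomes an integral: "project"
# f(x) = x onto the constant function 1 on [0, 1] with a Riemann sum.
x = np.linspace(0.0, 1.0, 1001)
f = x
g = np.ones_like(x)
dx = x[1] - x[0]
print(np.sum(f * g) * dx)           # ~0.5, i.e. the integral of x on [0, 1]
```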

By the way, we can see what the Fourier transform is doing:


Let me introduce a perfect formula, Euler's formula:

$$e^{j\theta} = \cos\theta + j\sin\theta$$

The Fourier transform is the integral of $f(t)$ times $e^{-j\omega t}$ over an infinite interval,

$$F(\omega) = \int_{-\infty}^{\infty} f(t)\,e^{-j\omega t}\,dt,$$

so $f(t)$ is being projected onto $e^{-j\omega t}$, which, by Euler's formula, splits into two orthogonal trigonometric functions.

So now it is clear: the Fourier transform projects $f(t)$ onto two mutually orthogonal spaces spanned by sine and cosine. This projection relation is even easier to see from the Fourier series expansion of a periodic signal.
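As a rough illustration of this projection picture, the following sketch numerically projects a square wave onto cosines and sines of different frequencies. The signal and the variable names are my own choices, and the sums are crude approximations of the Fourier-series inner products.

```python
import numpy as np

# Project a periodic signal onto cos(k*t) and sin(k*t) by numerical inner
# products; up to the 2/T factor these are its Fourier series coefficients.
T = 2 * np.pi                       # period
t = np.linspace(0.0, T, 10000, endpoint=False)
dt = t[1] - t[0]
f = np.sign(np.sin(t))              # a square wave

for k in (1, 2, 3):
    a_k = (2 / T) * np.sum(f * np.cos(k * t)) * dt
    b_k = (2 / T) * np.sum(f * np.sin(k * t)) * dt
    print(k, round(a_k, 4), round(b_k, 4))
# The square wave has only sine components: b_1 ~ 4/pi, b_2 ~ 0, b_3 ~ 4/(3*pi)
```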

This projection is a little strange, though: in the convolution integral

$$y(t) = \int_{-\infty}^{\infty} f(\tau)\,r(t - \tau)\,d\tau,$$

the input appears as $r(t - \tau)$ rather than $r(\tau)$. It is equivalent to flipping the input signal $r(t)$ by 180° on the time axis before projecting it onto the system $f(t)$. The projection itself is easy to picture: whether it is the inner product of two vectors as a linear projection, or the shadow a polyhedron casts onto a plane in three-dimensional space, the projection operation amounts to an overlapping area.

If we look at the relationship between input, system, and output from this perspective, we can understand graphically why the step response of a first-order system is a monotonically rising curve. Wikipedia has a nice graphical explanation of convolution; if you want to learn more, check a textbook.
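Here is a small numerical sketch of that claim, assuming a standard first-order impulse response $h(t) = (1/\tau)\,e^{-t/\tau}$ (my choice, not from the original text): convolving it with a unit step yields a curve that rises monotonically toward 1.

```python
import numpy as np

# Step response of a first-order system, obtained by convolving the step
# input with the impulse response h(t) = (1/tau) * exp(-t/tau).
tau = 1.0
dt = 0.001
t = np.arange(0.0, 5.0, dt)

h = (1.0 / tau) * np.exp(-t / tau)    # impulse response
u = np.ones_like(t)                   # unit step input

y = np.convolve(u, h)[: len(t)] * dt  # discrete approximation of the integral
print(y[0], y[len(t) // 2], y[-1])    # ~0.001, ~0.92, ~0.99
print(np.all(np.diff(y) > 0))         # True: the curve rises monotonically
```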

(Thanks to Wang Nimo, an excellent Zhihu contributor, for help with this section.)



Application of convolution



To convolve a template with an image: for each point on the image, align the origin of the template with that point, multiply every point of the template by the corresponding point of the image, and add up the products; the sum is the convolution value at that point. Do this for every point on the image.


Because most templates are symmetric, they are not rotated in practice. Convolution is an integral operation that measures the overlapping area of two curves. It can be regarded as a weighted summation, and can be used to remove noise or enhance features: the pixel value at a point is replaced by a weighted average of the pixel values of the points around it.
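A minimal 1-D sketch of this weighted-average denoising, with a made-up noisy sine wave and a 5-point equal-weight kernel (both are my own illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)

# A smooth signal plus noise; the weighted-sum (moving average) kernel
# replaces each sample with the mean of itself and its neighbours.
t = np.linspace(0.0, 2 * np.pi, 200)
noisy = np.sin(t) + 0.3 * rng.standard_normal(t.size)

kernel = np.ones(5) / 5.0                    # equal weights summing to 1
smoothed = np.convolve(noisy, kernel, mode="same")

# The smoothed signal sits much closer to the clean sine wave.
print(np.mean((noisy - np.sin(t)) ** 2))     # larger error
print(np.mean((smoothed - np.sin(t)) ** 2))  # smaller error
```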


Convolution can also be understood as a linear operation. The mask operation commonly used in image processing is a convolution, and it is widely used in image filtering. The most important result about convolution is the convolution theorem from signals and linear systems (or digital signal processing).

Using this theorem, a convolution in the time or spatial domain becomes an ordinary multiplication in the frequency domain, so fast algorithms such as the FFT can carry out the computation efficiently and save a great deal of work.
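A short sketch of the convolution theorem in action, using NumPy's FFT on made-up random sequences: the FFT-based product matches the direct convolution.

```python
import numpy as np

# Convolution theorem: convolution in the time domain equals point-wise
# multiplication in the frequency domain.
rng = np.random.default_rng(1)
f = rng.standard_normal(64)
g = rng.standard_normal(16)

direct = np.convolve(f, g)                       # direct O(n*m) convolution

n = len(f) + len(g) - 1                          # zero-pad to the full length
via_fft = np.fft.irfft(np.fft.rfft(f, n) * np.fft.rfft(g, n), n)

print(np.allclose(direct, via_fft))              # True
```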

Here is an example from the Sselssbh blog that illustrates the power of convolution in the image domain.


Here’s an image, and as you can see, there’s a lot of noise:

The noise is a high-frequency signal, like a mountain peak rising out of flat ground:


It looks very conspicuous. One way to flatten the mountain is to shave some earth off the top and spread it around the base. In mathematical terms, average the peak's height with the heights around it.


Convolution can implement this smoothing algorithm. The original image with noise can be converted into a matrix:

Then smooth the image with the following averaging matrix (to be accurate, the original post smooths with a Gaussian, i.e. normal-distribution, matrix, but an arithmetic-mean matrix is used here for simplicity):

$$g = \frac{1}{9}\begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}$$

Remember the idea: averaging the high-frequency point with its surrounding values flattens the peak. Say we want to smooth $a_{1,1}$: take the points around $a_{1,1}$ in the matrix to form a matrix $f$, convolve it with $g$, and write the result back in place of $a_{1,1}$.

Note that in order to apply the convolution formula, $g$ has the same dimensions as $f$, but its subscripts run slightly differently:

The convolution formula is

$$(f * g)(m, n) = \sum_{k}\sum_{h} f(h, k)\,g(m - h,\, n - k),$$

and the smoothed value that replaces $a_{1,1}$ is $(f * g)(1, 1)$.

This is equivalent to sliding the matrix $g$ over the original image (to be precise, the 2-D convolution described below first rotates the sliding matrix by 180°).
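Here is a sketch of this smoothing step using `scipy.signal.convolve2d` (my choice of tool; the original post does the arithmetic by hand) on a toy 5×5 "image" with one spike. Since the averaging kernel is symmetric, rotating it by 180° changes nothing.

```python
import numpy as np
from scipy.signal import convolve2d

# A toy "image" with one noisy spike, and the 3x3 arithmetic-mean kernel g.
image = np.array([[1, 1, 1, 1, 1],
                  [1, 1, 9, 1, 1],   # the spike to be smoothed
                  [1, 1, 1, 1, 1],
                  [1, 1, 1, 1, 1],
                  [1, 1, 1, 1, 1]], dtype=float)
g = np.ones((3, 3)) / 9.0

smoothed = convolve2d(image, g, mode="same", boundary="symm")
print(smoothed[1, 2])          # the spike 9 becomes (9 + 8*1) / 9 ~ 1.89

# Convolution slides g rotated by 180 degrees; because the averaging kernel
# is symmetric, the flipped kernel gives exactly the same result.
flipped = np.flip(g)           # 180-degree rotation (flip both axes)
print(np.allclose(convolve2d(image, flipped, mode="same", boundary="symm"),
                  smoothed))   # True
```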


2-D convolution


In image recognition, 2-D convolution is one of the most common and fairly simple operations. It starts with the convolution kernel, a small matrix of weights. The kernel "slides" over the 2-D input data, multiplies element-wise with the part of the input it currently covers, sums the products, and writes the result as a single output pixel.

Viewed in three dimensions, 2-D convolution is a process of sliding and mapping.


Flattened onto the plane, the process looks like this: each block of nine blue numbers maps to one green number, and this mapping is exactly the convolution computation.
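A minimal sketch of this sliding-and-mapping process written as explicit loops; the function name `conv2d_valid` and the example kernel are my own, and note that many deep-learning libraries actually skip the 180° flip and compute cross-correlation instead.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image; each covered patch is multiplied
    element-wise with the (180-degree-rotated) kernel and summed, giving
    one output pixel."""
    kernel = np.flip(kernel)                 # flip for a true convolution
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = image[i:i + kh, j:j + kw]
            out[i, j] = np.sum(patch * kernel)
    return out

# A 5x5 input and a 3x3 kernel give a 3x3 output
# (each "blue" 3x3 patch maps to one "green" number).
image = np.arange(25, dtype=float).reshape(5, 5)
kernel = np.array([[0,  1, 0],
                   [1, -4, 1],
                   [0,  1, 0]], dtype=float)   # a Laplacian-style kernel
print(conv2d_valid(image, kernel))
```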


Another example is making steamed buns


The breakfast shop downstairs is doing such good business that demand outstrips supply, so the owner buys a machine that produces steamed buns continuously.

Assume the production rate of steamed buns is $f(t)$. The total number of buns produced over one day is then

$$\int_{0}^{24} f(t)\,dt.$$

Once produced, a bun slowly goes bad. Suppose the spoilage is described by a function $g(t)$; for example, a batch of 10 buns gradually spoils over 24 hours:

Think about it: a bun produced in the first hour will have been spoiling for 24 hours by the end of the day, a bun produced in the second hour will have been spoiling for 23 hours, and so on. So the total spoilage after one day is

$$\int_{0}^{24} f(t)\,g(24 - t)\,dt,$$

which is exactly a convolution evaluated at $t = 24$.
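A numeric sketch of the bun example, with made-up production and spoilage functions (constant production, linear spoilage; both are my own assumptions), approximating the integral with a Riemann sum:

```python
import numpy as np

# Hypothetical production rate f(t) (buns per hour) and spoilage fraction
# g(t) (how spoiled a bun is after sitting for t hours).
dt = 0.01
t = np.arange(0.0, 24.0 + dt, dt)

f = np.full_like(t, 100.0)          # produce 100 buns per hour, all day
g = np.minimum(t / 24.0, 1.0)       # spoilage grows linearly to 1 over 24 hours

# A bun made at time t has spoiled by g(24 - t) at the end of the day, so the
# total spoilage is the convolution-style integral of f(t) * g(24 - t).
total_spoiled = np.sum(f * g[::-1]) * dt
print(total_spoiled)                # ~1200, i.e. half of the 2400 buns made
```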