In daily life, Gaussian blur is very common; compared with mosaic, it gives a more natural feeling. Gaussian blur, as its name implies, is filtering whose convolution window (mask) follows a Gaussian distribution. This article mainly discusses how to apply Gaussian blur to a picture and how to reduce the computation of the Gaussian blur algorithm. In addition, for front-end image rendering, WebGL can make full use of the GPU for calculation and rendering, which accelerates the effect. This article will also introduce how to achieve fast Gaussian blur with WebGL on the basis of the fast Gaussian blur algorithm.

  • Filtering and convolution (mask)
  • Gaussian blur
  • Fast Gaussian blur principle

Filtering and convolution

Firstly, I will introduce filtering and convolution. Filtering is a concept from signal processing: a signal can be regarded as being composed of many waves of different frequencies.

In simple terms, filtering means extracting the waves of a specified frequency from the signal.

(1) Image time domain and frequency domain

So what is filtering in an image? First of all, we need to understand that filtering is actually a frequency-domain concept, and the frequency domain simply concerns frequency. So what is the frequency of an image?

  • The time domain

To understand the frequency domain of an image, let's first introduce its time domain.

A digital image can be defined as a two-dimensional function f(x,y), where x and y are spatial (plane) coordinates, and the amplitude f at any pair of coordinates (x,y) is called the grayscale or intensity of the image at that point.

The grayscale or color that our eyes see in the picture above is the time domain; in other words, the color of a pixel, as we normally call it, is the image observed from the time domain.

  • Frequency domain

The frequency of an image is an indicator of how strongly the gray level changes, i.e. the gradient of the gray level in plane space. For example, a large area of desert in an image is a region where the gray level changes slowly, so the corresponding frequency is very low; by contrast, an edge region where surface attributes change drastically is a region where the gray level changes sharply, so the corresponding frequency is high. In simple terms: the frequency domain is the trend (gradient) of grayscale or color change.

Similarly, if we plot the RGB color of each point of the picture above, the following figure is obtained:

The points with large color fluctuation are the high-frequency places, and the places with little color fluctuation are low-frequency.

(2) Convolution (mask)

We can transform the time domain into the frequency domain by the Fourier transform, and convolution in the time domain equals multiplication in the frequency domain. If filtering could be done in the frequency domain, it would be just a product. However, the conversion from time domain to frequency domain is not very intuitive and relatively complex, so in engineering most image filtering is done in the time domain, that is, as a convolution of color values.
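The relationship mentioned above is the convolution theorem. Writing F{·} for the Fourier transform:

    F{f * g} = F{f} · F{g}

so convolving an image with a mask in the time domain corresponds to a pointwise product in the frequency domain.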

We convolve the original image with a window, and the resulting new image is the filtered image; this window is called the mask. The value of each element of the window is generally determined by the filtering function. Let us illustrate the filtering process with an example:

The above is a simple raw image, where the values over the image represent gray levels. We now assume a mask.

The gray levels of the image are now convolved with the window. For convenience of calculation, only one highlighted pixel is computed here; its original value is 98. The filtered image obtained after the convolution is shown below:

We see that the pixel with gray level 98 has become 82. That is one pixel; convolving all pixels of the picture with the window yields the new gray levels, i.e. the filtered picture.

Note: of course, the window values must be normalized, to ensure that the resulting grayscale or color values do not exceed the valid range.
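In general form, the discrete convolution computed for each pixel is:

    g(x, y) = Σ(i = -r..r) Σ(j = -r..r) h(i, j) · f(x + i, y + j)

where f is the original image, h is the (2r + 1) × (2r + 1) mask and g is the filtered image; the 98 → 82 calculation above is exactly one evaluation of this sum.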

Gaussian blur

(1) What is Gaussian blur

Blurring is a special kind of filtering through which the image becomes unclear. We know that filtering = the convolution of the original image and the mask. When the mask (window) obeys a Gaussian distribution, we call this filtering Gaussian filtering, also known as Gaussian blur.

First, let’s look at the one-dimensional Gaussian distribution:
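Its probability density function has the standard form:

    f(x) = 1 / (σ · √(2π)) · e^( -(x - μ)² / (2σ²) )

where μ is the mean (the center of the curve) and σ is the standard deviation, which controls how wide the bell is.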

Its probability density curve is:

In an image we are in a two-dimensional coordinate system, so the window should obey a two-dimensional probability density function:
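A standard form of this two-dimensional Gaussian density is:

    G(X) = 1 / (2πσ²) · e^( -|X - μ|² / (2σ²) )

which, with μ = (0,0), reduces to G(x, y) = 1 / (2πσ²) · e^( -(x² + y²) / (2σ²) ).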

In the two-dimensional Gaussian distribution function:

  • X represents a two-dimensional coordinate (x, y).
  • μ is the center point, generally assumed to be (0,0).

When calculating the mask, since the essence of Gaussian blur is to estimate the center point from the surrounding points, we set the coordinates of the center point to be estimated to (0,0). The values of the N points around it are then calculated from the Gaussian distribution, giving the window function for filtering. Take the calculation of the 3×3 filtering window as an example: we take a pixel as the origin and sample the 8 points around it, 9 points in total.

By normalizing these 9 points, a complete filtering window (mask) can be obtained:
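As a concrete illustration (assuming σ = 1.5, a value chosen only for this example), evaluating the two-dimensional Gaussian at the nine coordinates gives, before normalization:

    0.0453  0.0566  0.0453
    0.0566  0.0707  0.0566
    0.0453  0.0566  0.0453

These nine values sum to about 0.479; dividing each by the sum yields the normalized window:

    0.0947  0.1183  0.0947
    0.1183  0.1478  0.1183
    0.0947  0.1183  0.0947

whose entries now sum to 1.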

This is the final mask, or filtering window. If you convolve every pixel of the original image with this mask, you get a blurred image; this process is called Gaussian blur or Gaussian filtering.

To sum up, Gaussian blur and the Gaussian distribution are not the same thing: when the filtering window obeys a Gaussian distribution, the filtering is called Gaussian blur. That is all the mathematical background for now; next we look at how to implement and optimize Gaussian blur in engineering.

(2) The implementation of Gaussian blur

It can be seen from the two-dimensional Gaussian formula that, when calculating the window mask, the only parameters are the blur radius and the variance. We set the variance to 1. The blur radius R determines how many points are sampled; in the previous example the blur radius was 3, so we sampled 9 points. In two dimensions, the blur radius (that is, how many points are taken around each pixel) gives the following numbers of samples:

Sampling radius    Number of sampling points
3                  9
5                  25
7                  49
9                  81
11                 121
13                 169

Let's take 9-point sampling with radius 3 as an example and look at the implementation of Gaussian blur in a WebGL fragment shader:

    // Fragment shader excerpt: 3x3 (9-sample) Gaussian blur.
    // u_image (source texture), v_texCoord (varying texture coordinate),
    // tex_offset (the size of one texel) and degree (controls blur strength)
    // are assumed to be declared elsewhere in the shader, as in the original.
    vec4 orColor = texture2D(u_image, v_texCoord);
    const float PI = 3.1415926535897932384626433832795;
    float w = 1. / degree;

    // Unnormalized Gaussian weights for the center point (distance 0),
    // the 4 edge neighbors (squared distance 1) and the 4 corners (squared distance 2).
    float temWeight[3];
    temWeight[0] = 1.0 / (2.0 * PI * w * w);
    temWeight[1] = (1.0 / (2.0 * PI * w * w)) * exp(-1.0 * w * w);
    temWeight[2] = (1.0 / (2.0 * PI * w * w)) * exp(-2.0 * w * w);

    // Normalize so that the 9 weights sum to 1.
    float total = temWeight[0] + temWeight[1] * 4.0 + temWeight[2] * 4.0;
    float orAlpha = orColor.a;
    float weight[3];
    weight[0] = temWeight[0] / total;
    weight[1] = temWeight[1] / total;
    weight[2] = temWeight[2] / total;

    // Convolve: 9 weighted texture taps around the current pixel.
    vec4 color =
        texture2D(u_image, v_texCoord + tex_offset * vec2( 0.0,  0.0)) * weight[0] +
        texture2D(u_image, v_texCoord + tex_offset * vec2( 0.0, -1.0)) * weight[1] +
        texture2D(u_image, v_texCoord + tex_offset * vec2( 0.0,  1.0)) * weight[1] +
        texture2D(u_image, v_texCoord + tex_offset * vec2(-1.0,  0.0)) * weight[1] +
        texture2D(u_image, v_texCoord + tex_offset * vec2( 1.0,  0.0)) * weight[1] +
        texture2D(u_image, v_texCoord + tex_offset * vec2( 1.0,  1.0)) * weight[2] +
        texture2D(u_image, v_texCoord + tex_offset * vec2(-1.0,  1.0)) * weight[2] +
        texture2D(u_image, v_texCoord + tex_offset * vec2( 1.0, -1.0)) * weight[2] +
        texture2D(u_image, v_texCoord + tex_offset * vec2(-1.0, -1.0)) * weight[2];

    gl_FragColor = vec4(color.rgb, orAlpha);

The above is the WebGL version of the 9-sample Gaussian blur implementation; with the shader written above, you can apply Gaussian blur to a picture. The effect is as follows:

With 9-point sampling, each pixel of the picture requires 9 multiplications and 8 additions (excluding the shared work of computing the total and the normalized weights). So for a 400 × 300 pixel image, a total of 400 × 300 × (9 + 8), about 2 million operations, is needed. Although we use WebGL to raise computing efficiency with the GPU, such a large amount of computation is unacceptable in some scenarios, especially in animations with high real-time or FPS requirements, so we need to reduce it by certain methods.

Fast Gaussian blur principle

Reducing the computation of Gaussian blur basically revolves around the following two properties:

  1. A two-dimensional Gaussian blur can be decomposed into the product of two one-dimensional Gaussian blurs.
  2. Applying two Gaussian blurs with standard deviation σ one after another is equivalent to a single Gaussian blur with standard deviation √2·σ (the precise identity is given below).
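Stated precisely, the second property is: blurring with standard deviation σ1 and then with σ2 is equivalent to a single Gaussian blur with standard deviation

    σ = √(σ1² + σ2²)

which is why several cheap small-radius passes can stand in for one expensive large-radius pass.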

(1) Yang Hui's triangle (binomial coefficients) instead of evaluating the weights

We previously used the two-dimensional Gaussian distribution to calculate the weights; this calculation can be simplified to binomial coefficients. The principle is:

The discrete Gaussian distribution can be approximated by the value of the binomial coefficient of the binomial distribution.

The picture above shows Yang Hui's triangle (the binomial coefficients). Similarly, for 9-point sampling with radius 3 we take the row N = R + 1 = 4, so the binomial coefficients are 1, 4, 6, 4, 1. The weights are thus obtained without evaluating the Gaussian distribution at all.
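For example, the row 1, 4, 6, 4, 1 sums to 16, so the normalized one-dimensional weights are:

    1/16, 4/16, 6/16, 4/16, 1/16 = 0.0625, 0.25, 0.375, 0.25, 0.0625

and the outer product of this row with itself approximates the corresponding two-dimensional mask.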

(2) The two-dimensional Gaussian filter is divided into two one-dimensional Gaussian filters

We can split the two-dimensional Gaussian filter into two one-dimensional Gaussian filters. For the same Gaussian blur of radius 3, instead of sampling 9 points we then only need to sample 5 points.

We compute a one-dimensional Gaussian filter in the x direction and another in the y direction, so that 9 multiplications and 8 additions become 5 multiplications and 4 additions, almost cutting the computation in half. The identity below shows why the split is exact.
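The two-dimensional Gaussian factors into two one-dimensional Gaussians:

    G(x, y) = 1/(2πσ²) · e^( -(x² + y²) / (2σ²) )
            = [1/(√(2π)σ) · e^( -x² / (2σ²) )] · [1/(√(2π)σ) · e^( -y² / (2σ²) )]
            = G(x) · G(y)

so blurring every row with the one-dimensional kernel and then every column of the result gives the same image as the full two-dimensional convolution.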

(3) Linear filtering further reduces the calculation

Taking 25-point sampling with radius 5 as an example, we split the two-dimensional Gaussian blur into two one-dimensional Gaussian blurs; the code for one (vertical) pass is as follows:

    // One-dimensional (vertical) Gaussian pass with 9 taps (offsets 0 to 4).
    // Note: initializing uniforms with array constructors requires desktop GLSL;
    // in WebGL the offsets and weights would be uploaded from JavaScript instead.
    // The division by 1024.0 assumes a 1024-pixel texture, as in the original.
    uniform sampler2D image;
    out vec4 FragmentColor;

    uniform float offset[5] = float[](0.0, 1.0, 2.0, 3.0, 4.0);
    uniform float weight[5] = float[](0.2270270270, 0.1945945946, 0.1216216216,
                                      0.0540540541, 0.0162162162);

    void main(void) {
        // Center tap, then symmetric pairs above and below the current pixel.
        FragmentColor = texture2D(image, vec2(gl_FragCoord) / 1024.0) * weight[0];
        for (int i = 1; i < 5; i++) {
            FragmentColor += texture2D(image, (vec2(gl_FragCoord) + vec2(0.0, offset[i])) / 1024.0) * weight[i];
            FragmentColor += texture2D(image, (vec2(gl_FragCoord) - vec2(0.0, offset[i])) / 1024.0) * weight[i];
        }
    }

Because this is 25-point sampling with radius 5, after dimensionality reduction each one-dimensional pass still performs 9 multiplications and 8 additions. Now we can use linear filtering to represent two points with one sample.

Linear filtering and linear interpolation are common methods in mathematics. Let’s briefly introduce linear interpolation:

In the figure above, we know the points (x0, y0) and (x1, y1). Given x, how do we solve for y? We use linear interpolation:

    y = y0 + [(y1 - y0) / (x1 - x0)] * (x - x0)

In other words, we can get a new point by linear interpolation between two points. Linear interpolation is widely used in images; for example:

For the multicolor square above, except for the four vertex colors, the colors of all other points are obtained by bilinear interpolation.

We get a new point from two points by linear interpolation, and this new point can represent the original two points.

In Gaussian blur, we can use this idea to express the weight and offset of two sample points with a single sample. With it, our 25-point sampling with radius R = 5 can be simplified further: each one-dimensional pass goes from five different weights to three.
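This is the standard linear-sampling trick: for two adjacent taps with weights w1 and w2 at texel offsets t1 and t2, a single bilinear sample taken at

    offset = (t1·w1 + t2·w2) / (w1 + w2),   with weight = w1 + w2

returns the same weighted sum, because the GPU's bilinear filter does the interpolation for free. Merging taps 1 and 2 of the previous pass gives weight = 0.1945945946 + 0.1216216216 = 0.3162162162 and offset = (1 × 0.1945945946 + 2 × 0.1216216216) / 0.3162162162 ≈ 1.3846153846; merging taps 3 and 4 gives weight ≈ 0.0702702703 and offset ≈ 3.2307692308. These are exactly the constants in the shader below.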

    // The same vertical pass reduced to 5 taps: offsets and weights merge
    // adjacent texels via the GPU's bilinear filtering, as derived above.
    uniform sampler2D image;
    out vec4 FragmentColor;

    uniform float offset[3] = float[](0.0, 1.3846153846, 3.2307692308);
    uniform float weight[3] = float[](0.2270270270, 0.3162162162, 0.0702702703);

    void main(void) {
        FragmentColor = texture2D(image, vec2(gl_FragCoord) / 1024.0) * weight[0];
        for (int i = 1; i < 3; i++) {
            FragmentColor += texture2D(image, (vec2(gl_FragCoord) + vec2(0.0, offset[i])) / 1024.0) * weight[i];
            FragmentColor += texture2D(image, (vec2(gl_FragCoord) - vec2(0.0, offset[i])) / 1024.0) * weight[i];
        }
    }

This simplifies each one-dimensional pass to 5 multiplications and 4 additions.

(4) Trade space for time

In addition, we can further reduce the number of multiplications. For radius 3, after splitting into two one-dimensional Gaussian filters, the multiplications and additions went from 9 and 8 down to 5 and 4. We can eliminate the remaining multiplications by looking up a table:

  • For a Gaussian blur of radius 3, the coefficients of the window mask are fixed, and they reduce to only three distinct values (1/16, 4/16 and 6/16 for the kernel 1, 4, 6, 4, 1).
  • For pixels, the range of values is also fixed: 0 to 255.

This allows us to create a 256 × 3 lookup table, storing in an array the product of every possible pixel value with every distinct weight, so that we no longer have to multiply at all: each weighted value is read straight from the table. Instead of five multiplications per pixel, only the four additions remain. A sketch of the idea follows.
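A minimal CPU-side sketch in TypeScript (buildLUT is a hypothetical helper; the 5-tap binomial kernel and 8-bit channels are assumptions carried over from above):

    // Precompute weight * value for every 8-bit channel value, one table per
    // distinct kernel weight, so the blur loop needs no multiplications.
    function buildLUT(weights: number[]): Float32Array[] {
      return weights.map((w) => {
        const table = new Float32Array(256);
        for (let v = 0; v < 256; v++) table[v] = w * v;
        return table;
      });
    }

    // The binomial kernel (1, 4, 6, 4, 1) / 16 has three distinct weights,
    // hence the 256 x 3 table described above.
    const lut = buildLUT([1 / 16, 4 / 16, 6 / 16]);

    // One 5-tap sum for pixel values p0..p4 then uses additions only:
    //   out = lut[0][p0] + lut[1][p1] + lut[2][p2] + lut[1][p3] + lut[0][p4];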

(5) Finally, filter several times with FBOs

The blur obtained by a single pass of the fast Gaussian filter may not look very good, so we can improve the effect by filtering the result of the first pass again, repeatedly, rendering each intermediate result to a framebuffer object (FBO) and feeding it back in as the input texture. A sketch of this ping-pong pattern follows.
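A minimal host-side sketch in TypeScript; createFBO and drawBlurPass are hypothetical helpers standing in for the usual WebGL framebuffer, texture and draw-call setup:

    // Two render targets are alternated ("ping-pong"): each pass blurs the
    // previous output, so n passes approximate one larger-radius blur.
    declare function createFBO(gl: WebGLRenderingContext): {
      framebuffer: WebGLFramebuffer;
      texture: WebGLTexture;
    };
    declare function drawBlurPass(gl: WebGLRenderingContext, input: WebGLTexture): void;

    function blurRepeated(
      gl: WebGLRenderingContext,
      source: WebGLTexture,
      passes: number
    ): WebGLTexture {
      const targets = [createFBO(gl), createFBO(gl)];
      let input = source;
      for (let i = 0; i < passes; i++) {
        const target = targets[i % 2];
        gl.bindFramebuffer(gl.FRAMEBUFFER, target.framebuffer);
        drawBlurPass(gl, input);  // run the blur shader with `input` as source
        input = target.texture;   // this pass's output feeds the next pass
      }
      gl.bindFramebuffer(gl.FRAMEBUFFER, null); // the final result is in `input`
      return input;
    }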