Computer Vision – Traditional image processing

Image filtering

Image filtering: suppressing the noise in the target image while preserving image detail as much as possible.

Smoothing, also known as blurring, is a simple and frequently used image processing operation. One function of smoothing is to attenuate noise.

1. List common linear filters

A low-pass filter allows low frequencies through

A high-pass filter allows high frequencies to pass through

Bandpass filters allow a certain range of frequencies to pass through

A band-stop filter blocks a certain range of frequencies and allows frequencies outside that range to pass through

An all-pass filter allows all frequencies to pass through, changing only the phase

Notch filters prevent a narrow frequency range from passing through

2. Linear filtering and nonlinear filtering

Linear filtering: box filtering, mean filtering, Gaussian filtering. Nonlinear filtering: median filtering, bilateral filtering.

① Box filtering (the boxFilter function): each output pixel is the mean of the pixel values in its kernel neighborhood.

② Mean filtering (the blur function): mean filtering replaces each pixel value in the original image with the mean of its neighborhood.

For a 3×3 kernel, each pixel is averaged together with its 8 surrounding pixels.

Principle and method: for the pixel currently being processed, select a template consisting of that pixel and several of its nearest neighbors, and replace the original pixel value with the mean of all the pixels in the template.

Advantages: the algorithm is simple and smoothing is fast.

Disadvantages: 1. While reducing noise, it also blurs the image, especially edges and details; the larger the kernel, the stronger the blurring. 2. Its smoothing effect on salt-and-pepper noise is poor: it cannot remove the noise points well, and it destroys image detail while denoising, so the image becomes blurred.

③ Gaussian filtering

The Gaussian filter is a linear smoothing filter suited to eliminating Gaussian noise and is widely used in image denoising.

Gaussian filtering is a weighted averaging of the whole image: the value of each pixel is a weighted average of its own value and the values of the other pixels in its neighborhood.

The concrete operation is to scan every pixel in the image with a template (convolution kernel, or mask) and replace the value of the pixel at the template's center with the weighted average gray value of the pixels in the neighborhood the template covers.

The Gaussian smoothing filter is very effective at suppressing noise that follows a normal distribution.

④ Median filtering

Median filtering is a typical nonlinear filtering technique. The basic idea is to replace a pixel's gray value with the median of the gray values in its neighborhood. This method removes impulse noise and salt-and-pepper noise while preserving image edge detail.

⑤ Bilateral filtering

The bilateral filter is a nonlinear filtering method. It is a compromise that combines the spatial proximity and the pixel-value similarity of the image, considering spatial information and gray-level similarity at the same time, thereby achieving edge-preserving denoising. It is simple, non-iterative, and local.

It is used for denoising images whose edge information must be preserved. Its disadvantage is that, precisely because it preserves edges, it retains too much high-frequency information: it cannot cleanly filter the high-frequency noise in color images, though it filters low-frequency information well.
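For reference, here is a minimal OpenCV (C++) sketch of the five smoothing filters discussed above; the input path, kernel sizes, and bilateral parameters are example choices, not prescribed values:

```cpp
#include <opencv2/opencv.hpp>

int main() {
    cv::Mat src = cv::imread("input.jpg");  // placeholder path
    if (src.empty()) return 1;
    cv::Mat box, mean, gauss, median, bilateral;

    cv::boxFilter(src, box, -1, cv::Size(5, 5));        // box filtering
    cv::blur(src, mean, cv::Size(5, 5));                // mean filtering
    cv::GaussianBlur(src, gauss, cv::Size(5, 5), 1.5);  // Gaussian filtering
    cv::medianBlur(src, median, 5);                     // median filtering
    cv::bilateralFilter(src, bilateral, 9, 75, 75);     // bilateral filtering
    return 0;
}
```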

⑥ Wiener filtering

The Wiener filter is an adaptive minimum-mean-square-error filter. Wiener filtering is a statistical method whose optimality criterion is based on the correlation matrices of the image and of the noise, and it adjusts the filter's output according to the local variance of the image.

**Which filter for salt-and-pepper noise?** Median filtering.

Edge detection

The purpose of edge detection is to find the set of pixels in the image where brightness changes sharply; these pixels often form contours. If the edges of an image can be accurately measured and located, then the actual object can be located and measured, including its area, diameter, and shape.

What are the edge detection operators:

First order: Roberts Cross operator, Prewitt operator, Sobel operator, Canny operator, compass operator

Second order: the Laplacian operator, Marr-Hildreth (LoG), and zero-crossings of the second derivative along the gradient direction.

1. Introduce Canny edge detection

Canny edge detection is a very popular edge detection algorithm, proposed by John Canny in 1986. It is a multi-stage algorithm, i.e., composed of multiple steps.

1. Image denoising 2. Image gradient calculation 3. Non-maximum suppression 4. Double-threshold screening

First, the image is denoised. We know that gradient operators can be used to enhance an image, essentially by enhancing edge contours, which means edges can be detected. However, gradient operators are strongly affected by noise. So the first step is to remove noise: since noise is where the gray level changes greatly, it is easily misidentified as a false edge.

The second step is to compute the image gradient and obtain the candidate edges. As introduced in the earlier article on image gradients, the edges of an image can be obtained by computing its gradient, because the gradient is large wherever the gray level changes markedly, and edges are also places where the gray level changes markedly. Of course, this step only finds candidate edges: a place where the gray level changes markedly may or may not be a true edge. So this yields the set of all candidate edges.

The third step is non-maximum suppression. The places where the gray level changes are usually concentrated. Within a local range, along the gradient direction, only the pixel with the largest gray-level change is retained and the others are suppressed, which removes a large number of points and turns an edge several pixels wide into a single-pixel-wide edge: the "fat edge" becomes a "thin edge."

The fourth step is double-threshold screening. After non-maximum suppression there are still many candidate edge points, so a double threshold is set: a low threshold (low) and a high threshold (high). If the gradient magnitude is greater than high, the pixel is marked as a strong edge pixel; if it is lower than low, it is deleted. Pixels between low and high are weak edges and are judged further: if there is a strong edge pixel in the neighborhood, the pixel is retained; if not, it is removed.

The purpose of this is to keep only the strong edge contours. Some edges may not be closed, so points between low and high are used to supplement them and make the edges as closed as possible.

What does the Canny operator do? Describe the calculation procedure of the Canny operator.

① Convert the color image to a grayscale image; ② smooth the image with a Gaussian filter; ③ compute the magnitude and direction of the image gradient; ④ apply non-maximum suppression to the gradient magnitude; ⑤ use double thresholds to detect and connect edges. The Canny operator uses hysteresis thresholding, which requires two thresholds (a high threshold and a low threshold). If the gradient magnitude at a pixel exceeds the high threshold, the pixel is retained as an edge pixel; if it is below the low threshold, the pixel is excluded; if it lies between the two thresholds, the pixel is retained only if it is connected to a pixel above the high threshold.
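To make the procedure concrete, a minimal OpenCV (C++) sketch follows; the image path and the 50/150 low/high thresholds are placeholder choices (cv::Canny performs steps ③-⑤ internally):

```cpp
#include <opencv2/opencv.hpp>

int main() {
    // Step ①: load as grayscale ("input.jpg" is a placeholder path).
    cv::Mat gray = cv::imread("input.jpg", cv::IMREAD_GRAYSCALE);
    if (gray.empty()) return 1;

    // Step ②: Gaussian smoothing to suppress noise.
    cv::Mat blurred, edges;
    cv::GaussianBlur(gray, blurred, cv::Size(5, 5), 1.4);

    // Steps ③-⑤: gradient, non-maximum suppression, hysteresis thresholding.
    cv::Canny(blurred, edges, 50, 150);  // low threshold 50, high threshold 150
    cv::imwrite("edges.png", edges);
    return 0;
}
```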

2. Briefly describe the Sobel operator

The Sobel operator is a discrete differentiation operator used mainly for edge detection. It combines Gaussian smoothing and differentiation to compute an approximation of the gradient of the image intensity function. Applying the operator at any point in the image produces the corresponding gradient vector or its normal vector.

When the kernel size is 3, the Sobel kernel can produce significant errors (after all, the Sobel operator only approximates the derivative). To address this, OpenCV provides the Scharr function, which works only on kernels of size 3. It is just as fast as the Sobel function, but its results are more accurate.
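A hedged sketch of both operators; the path is a placeholder, and blending the absolute x/y responses is only one common way to approximate the gradient magnitude:

```cpp
#include <opencv2/opencv.hpp>

int main() {
    cv::Mat gray = cv::imread("input.jpg", cv::IMREAD_GRAYSCALE);  // placeholder
    cv::Mat gx, gy, ax, ay, grad;
    cv::Sobel(gray, gx, CV_16S, 1, 0, 3);    // d/dx with a 3x3 kernel
    cv::Sobel(gray, gy, CV_16S, 0, 1, 3);    // d/dy
    // cv::Scharr(gray, gx, CV_16S, 1, 0);   // more accurate 3x3 alternative
    cv::convertScaleAbs(gx, ax);
    cv::convertScaleAbs(gy, ay);
    cv::addWeighted(ax, 0.5, ay, 0.5, 0, grad);  // approximate gradient magnitude
    cv::imwrite("sobel.png", grad);
    return 0;
}
```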

3. Briefly describe the general steps of edge detection in traditional algorithms

① Filtering: filter to remove noise; ② enhancement: enhance the edge features; ③ detection: extract the edges by some criterion to complete edge detection.

4. How to detect edges, including 45° edges

Use the Sobel operator for horizontal and vertical edge detection, and diagonal kernels for 45° and 135° edge detection, as in the sketch below.
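A minimal sketch of diagonal edge detection via cv::filter2D; the 3×3 kernel values below are one common 45° choice, not the only valid one:

```cpp
#include <opencv2/opencv.hpp>

int main() {
    cv::Mat gray = cv::imread("input.jpg", cv::IMREAD_GRAYSCALE);  // placeholder
    float k45[9] = { -2, -1,  0,
                     -1,  0,  1,
                      0,  1,  2 };  // responds to edges along the 45-degree diagonal
    cv::Mat kernel(3, 3, CV_32F, k45);
    cv::Mat resp, out;
    cv::filter2D(gray, resp, CV_16S, kernel);
    cv::convertScaleAbs(resp, out);
    cv::imwrite("edge45.png", out);  // mirror the kernel layout for 135 degrees
    return 0;
}
```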

5. SIFT

Scale-Invariant Feature Transform (SIFT) is a computer-vision method for detecting, describing, and matching local feature points in images. By detecting extreme points (corner points, interest points) across different scale spaces, it extracts position, scale, and rotation invariants and generates feature descriptors, which are finally used to match feature points between images.

How does SIFT feature maintain rotation invariance?

SIFT features maintain rotation invariance by rotating the coordinate axes to the key point's dominant orientation, which is obtained as the maximum of the histogram of gradient directions of the pixels in the key point's local neighborhood.

SIFT feature matching

After feature points are detected in the two images, the Euclidean distance between feature vectors is used as the similarity measure. For a key point in image 1, find its two nearest key points in image 2; if the ratio of the nearest distance to the second-nearest distance is below a certain threshold, the nearest pair is accepted as a match. Lowering the ratio threshold reduces the number of SIFT matches but makes them relatively more stable. The ratio threshold typically ranges from 0.4 to 0.6.

SIFT features

SIFT is an algorithm for detecting, describing, and matching local image feature points. It detects extreme points in scale space, extracts position, scale, and rotation invariants, and encodes them as feature vectors, which are finally used to match feature points between images. SIFT features are invariant to gray-level and contrast changes, rotation, and scaling, and have some robustness to viewpoint change, affine change, and noise. However, its real-time performance is poor, and it cannot accurately extract feature points from targets with smooth edges.
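A hedged sketch of SIFT detection plus the nearest/second-nearest ratio test described above, assuming OpenCV ≥ 4.4 (where SIFT lives in the main features2d module); the image paths and the 0.6 ratio are example values:

```cpp
#include <opencv2/opencv.hpp>
#include <vector>

int main() {
    cv::Mat img1 = cv::imread("a.jpg", cv::IMREAD_GRAYSCALE);  // placeholder paths
    cv::Mat img2 = cv::imread("b.jpg", cv::IMREAD_GRAYSCALE);

    auto sift = cv::SIFT::create();
    std::vector<cv::KeyPoint> kp1, kp2;
    cv::Mat des1, des2;
    sift->detectAndCompute(img1, cv::noArray(), kp1, des1);
    sift->detectAndCompute(img2, cv::noArray(), kp2, des2);

    cv::BFMatcher matcher(cv::NORM_L2);    // Euclidean distance as similarity
    std::vector<std::vector<cv::DMatch>> knn;
    matcher.knnMatch(des1, des2, knn, 2);  // two nearest neighbors per key point

    std::vector<cv::DMatch> good;
    const float ratio = 0.6f;              // upper end of the 0.4-0.6 range above
    for (const auto& m : knn)
        if (m.size() == 2 && m[0].distance < ratio * m[1].distance)
            good.push_back(m[0]);          // nearest match accepted
    return 0;
}
```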

6. SURF feature matching

Speeded-Up Robust Features (SURF) is similar to SIFT: it is also a feature descriptor used to detect, describe, and match local feature points of images. SIFT is a widely used feature point extraction algorithm, but its real-time performance is poor; without hardware acceleration or dedicated graphics processors (GPUs), it is difficult to meet real-time requirements. For real-time application scenarios, such as target tracking systems based on feature point matching that process tens of frames per second, feature point search and localization, feature vector generation, feature vector matching, and target locking must be completed at the millisecond level, a demand SIFT can hardly satisfy.

SURF borrows SIFT's idea of approximate simplification (DoG approximating LoG) to simplify the Gaussian second-order derivative templates of the Hessian matrix: with the help of an integral image, filtering the image with these templates requires only a few simple additions and subtractions, and these operations are independent of the size of the filter template. SURF is effectively an accelerated version of SIFT that improves computation speed while detecting feature points of comparable quality. Overall, SURF is several times faster than SIFT and has better all-round performance.

7. LBP features

Local Binary Pattern (LBP) is an operator used to describe local texture features of images. LBP features have notable advantages such as gray-level invariance and rotation invariance. The operator compares each pixel in the image with its neighboring pixel values and stores the results as a binary number; the resulting bit string is used as the encoded value of the center pixel, the LBP feature value. LBP provides a feature pattern that measures the neighborhood relationships between pixels, so it can effectively extract local features of images. Moreover, because its computation is simple, LBP can be used in real-time, texture-classification-based application scenarios such as object detection and face recognition.

8. HOG features for image feature extraction

The Histogram of Oriented Gradients (HOG) feature is a feature descriptor used for object detection in computer vision and image processing. It constructs features by computing and accumulating histograms of gradient orientations over local regions of the image. HOG features combined with an SVM classifier have been widely used in image recognition, achieving particular success in pedestrian detection.

9. Briefly describe the similarities and differences between SIFT and SURF algorithms


① Scale space: SIFT convolves a DoG pyramid with the image and downsamples the image; SURF convolves the image with an approximate DoH pyramid (box filters of different sizes). With the help of an integral image, the actual operation involves only a few simple additions and subtractions, and the image size never changes.

② Feature point detection: SIFT first performs non-maximum suppression and removes low-contrast points, then removes edge points using the Hessian matrix; SURF computes the determinant of the Hessian matrix (DoH) and then performs non-maximum suppression.

③ Dominant orientation of feature points: SIFT builds a gradient-orientation histogram over a square neighborhood window, weighted by gradient magnitude, and takes the orientation corresponding to the maximum peak; SURF computes the Haar wavelet responses in the x and y directions within each sector of a circular region and takes the sector direction with the maximum cumulative response.

④ Feature descriptors: SIFT divides the neighborhood around the key point into 4×4 regions, computes the gradient-orientation histogram of each sub-region, and concatenates them into a 4×4×8 = 128-dimensional feature vector; SURF divides a 20s×20s neighborhood into 4×4 sub-blocks, computes the Haar wavelet responses of each sub-block, and takes four statistics per sub-block, yielding a 4×4×4 = 64-dimensional feature vector.

In general, SURF and SIFT achieve similar performance in feature point detection, but SURF turns template convolution into additions and subtractions with the help of the integral image and is therefore faster than SIFT.

10. Compare SIFT, HOG and LBP feature extraction algorithms

(Image: comparison table of SIFT, HOG, and LBP.)

11. Name several feature detection algorithms commonly used in traditional computer vision

① FAST: FAST feature detector

② STAR: STAR feature detector

③ SIFT: Scale-Invariant Feature Transform

④ SURF: Speeded-Up Robust Features, an accelerated version of SIFT

⑤ ORB: short for Oriented FAST and Rotated BRIEF, an improved version of the BRIEF algorithm with relatively good all-round performance

12. Briefly explain the principle of Hough transformation

A line can be represented in polar coordinates by its polar radius and polar angle (r, θ). The Hough transform uses the line form r = x·cosθ + y·sinθ, so each pair (r, θ) represents a line passing through the point (x, y). If, for a given point (x, y), we plot all the lines passing through it in the polar radius-polar angle plane, we obtain a sinusoidal curve (r > 0 and 0 < θ < 2π). Doing this for all points in the image, if the curves of two different points intersect, those points lie on the same straight line.

This shows that, in general, a line can be detected by counting the number of curves that intersect at a point in the θ-r plane. The more curves intersect at one point, the more image points lie on the line that the intersection represents. A threshold on this count defines how many curves must intersect at a point before a line is considered detected.

The Hough transform thus examines the intersections of the curves corresponding to each point in the image: if the number of curves intersecting at a point exceeds the threshold, the parameter pair (r, θ) represented by that intersection is taken to be a straight line in the original image.
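A minimal sketch of the standard (r, θ) Hough line transform on a Canny edge map; the 150-vote threshold and the paths are example values:

```cpp
#include <opencv2/opencv.hpp>
#include <vector>

int main() {
    cv::Mat gray = cv::imread("input.jpg", cv::IMREAD_GRAYSCALE);  // placeholder
    cv::Mat edges;
    cv::Canny(gray, edges, 50, 150);

    std::vector<cv::Vec2f> lines;  // each entry is a detected (r, theta) pair
    cv::HoughLines(edges, lines, 1, CV_PI / 180, 150);  // 1 px, 1 degree, 150 votes
    for (const auto& l : lines) {
        float r = l[0], theta = l[1];  // the line r = x*cos(theta) + y*sin(theta)
        (void)r; (void)theta;          // draw or process each line here
    }
    return 0;
}
```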

13. Briefly describe the hough circle transformation principle

Converting a point on a circle in plane coordinates to polar form yields the three parameters C(x0, y0, r), where (x0, y0) is the circle's center. With r fixed, θ sweeps through 360° while x and y change accordingly. If the three-dimensional curves corresponding to several edge points intersect at one point, the accumulator has a maximum at the circle's center; the same thresholding method determines whether a circle has been detected.
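A hedged sketch with cv::HoughCircles (the Hough gradient variant); every numeric parameter below is an example value to be tuned per image:

```cpp
#include <opencv2/opencv.hpp>
#include <vector>

int main() {
    cv::Mat gray = cv::imread("coins.jpg", cv::IMREAD_GRAYSCALE);  // placeholder
    cv::medianBlur(gray, gray, 5);   // reduce noise before voting

    std::vector<cv::Vec3f> circles;  // each entry is (x0, y0, r)
    cv::HoughCircles(gray, circles, cv::HOUGH_GRADIENT,
                     1,              // accumulator resolution ratio
                     gray.rows / 8,  // minimum distance between centers
                     100, 30,        // Canny high threshold, accumulator threshold
                     10, 100);       // minimum and maximum radius
    return 0;
}
```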

14. Describe the main modules in OpenCV.

1. Core — core component module

As the core component, Core carries a lot of fundamental functionality. It includes basic data structures, dynamic data structures, drawing functions, functions related to array operations, auxiliary functions, system functions and macros, XML/YAML persistence, clustering, and OpenGL interoperation.

2. Imgproc — image processing module

It includes image filtering, geometric image transformations, miscellaneous image transformations, histograms, structural analysis and shape descriptors, motion analysis and object tracking, and feature and object detection.

3. Highgui — top-level GUI and video I/O

Includes the user interface, reading/writing images and video, and Qt features.

4. Video — video analysis

Includes motion analysis and object tracking.

5. Calib3d — camera calibration and 3D reconstruction

Includes camera calibration and 3D reconstruction.

6. Features2d — 2D feature framework

It includes feature detection and description, interfaces for feature detection and descriptor extraction and matching, drawing of key points and matches, and object categorization.

7. Objdetect — object detection

Includes cascade classifiers and SVM.

8. ML — machine learning

It includes statistical models, Bayes classifiers, nearest-neighbor classifiers, support vector machines, decision trees, boosting, gradient boosting trees, random trees, extremely randomized trees, expectation maximization, neural networks, and machine learning data handling.

9. Flann — clustering and multidimensional search

Fast nearest-neighbor search and clustering.

10. Gpu — GPU acceleration for computer vision

GPU modules and data structures, including image processing and analysis modules.

11. Photo — computational photography

Image inpainting and denoising.

12. Stitching — image stitching

Top-level image-stitching functions: rotation estimation, automatic calibration, affine transforms, seam estimation, exposure compensation, and image blending.

15. What does CV_8UC3 stand for in OpenCV?

8 means 8 bits per channel, U means unsigned integer, and C3 means the image has 3 channels.

16. Briefly describe the Scalar class in OpenCV

Scalar() represents a four-element vector and is used extensively in OpenCV to pass pixel values, such as RGB color values. If the fourth argument is not needed, it need not be written out; if only three arguments are given, OpenCV assumes only three are being passed.
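A small sketch tying questions 15 and 16 together; the image size and color are arbitrary examples (note that OpenCV stores channels in BGR order):

```cpp
#include <opencv2/opencv.hpp>

int main() {
    // 240x320 image of type CV_8UC3: 8-bit, Unsigned, 3 Channels.
    cv::Mat img(240, 320, CV_8UC3, cv::Scalar(255, 0, 0));  // solid blue (B, G, R)
    // The fourth Scalar element is simply omitted when it is not needed.
    return 0;
}
```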

17. Briefly describe the difference between .hpp and .h

.hpp essentially mixes the .cpp implementation code into the .h header file: the definition and the implementation live in the same file, so a caller of the class only needs to include the .hpp file, with no need to add a .cpp to the project for compilation. The implementation is compiled directly into the caller's .obj file, and no separate .obj is generated. Using .hpp greatly reduces the number of .cpp files and compilations in the calling project and removes the need to ship .lib and .dll files, which makes it very suitable for writing public open-source libraries.

18. Briefly describe what optical flow is

Optical flow (or optic flow) is the concept of object motion in the field of view; the term describes the apparent motion of an observed object, surface, or edge caused by relative motion between the scene and the observer.

19. Describe common color systems

① RGB is the most common color system; it mimics the working mechanism of the human eye and is also what display devices use. ② HSV and HLS decompose color into hue, saturation, and value/lightness, which describes color more naturally; discarding the last component makes an algorithm insensitive to the lighting conditions of the input image. ③ YCrCb is widely used in the JPEG image format. ④ CIELab is a perceptually uniform color space, well suited to measuring the distance between two colors.

20. Briefly describe three ways to access pixels in an image

① Pointer access with the C operator []; ② iterators; ③ dynamic address computation. A sketch of all three follows.
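A hedged sketch of all three styles on a CV_8UC3 cv::Mat; the pixel values written are arbitrary:

```cpp
#include <opencv2/opencv.hpp>

int main() {
    cv::Mat img(240, 320, CV_8UC3, cv::Scalar(0, 0, 0));  // example image

    // ① Pointer access with operator[] -- generally the fastest
    for (int r = 0; r < img.rows; ++r) {
        uchar* p = img.ptr<uchar>(r);
        for (int c = 0; c < img.cols * img.channels(); ++c)
            p[c] = 128;
    }

    // ② Iterator -- the safest
    for (auto it = img.begin<cv::Vec3b>(); it != img.end<cv::Vec3b>(); ++it)
        (*it)[0] = 255;                  // blue channel

    // ③ Dynamic address computation with at() -- the most readable
    img.at<cv::Vec3b>(10, 20)[2] = 255;  // red channel of pixel (row 10, col 20)
    return 0;
}
```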

21. Briefly describe dilation and erosion operations in image processing

Dilation and erosion operate on the white (highlighted) parts of the image. Dilation expands the highlighted regions of the image, so the result has a larger highlighted area than the original; erosion eats away at the highlighted regions, so the result has a smaller highlighted area than the original. Mathematically, dilation takes the local maximum and assigns it to the anchor point, gradually growing the highlighted regions of the image, while erosion does the opposite.
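A minimal sketch; the 5×5 rectangular structuring element and the input path are example choices:

```cpp
#include <opencv2/opencv.hpp>

int main() {
    cv::Mat img = cv::imread("binary.png", cv::IMREAD_GRAYSCALE);  // placeholder
    cv::Mat kernel = cv::getStructuringElement(cv::MORPH_RECT, cv::Size(5, 5));
    cv::Mat dilated, eroded;
    cv::dilate(img, dilated, kernel);  // local maximum: bright regions grow
    cv::erode(img, eroded, kernel);    // local minimum: bright regions shrink
    return 0;
}
```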

22. Briefly describe the procedure and application scenarios of the opening operation

Opening is erosion followed by dilation. It can be used to eliminate small objects, separate objects at thin connections, and smooth the boundaries of larger objects without significantly changing their area.

23. Briefly describe the procedure and application scenarios of the closing operation

Closing is dilation followed by erosion; it eliminates small black holes (black regions).

24. Briefly describe the definition and application scenarios of morphological gradients

The morphological gradient is the difference between the dilated image and the eroded image. Applying it to a binary image can highlight the edges of blobs, so the morphological gradient can be used to preserve the edges of objects.

25. Briefly describe the definition and usage scenarios of the top hat operation

The top hat is the difference between the original image and the result of the opening operation. Since opening enlarges cracks and locally low-brightness regions, subtracting the opened image from the original highlights regions that are brighter than their surroundings along the original contour, and the effect depends on the size of the structuring element. The top hat operation is often used to separate patches that are brighter than their neighborhood. When an image has a large background with small, fairly regular objects, the top hat operation can also be used to extract the background.

26. Describe the definition and application scenarios of the black hat operation

The black hat operation is the difference between the result of the closing operation and the original image. It highlights regions darker than their surroundings along the original contour, and this again depends on the size of the structuring element. The black hat operation is therefore used to separate patches darker than their neighborhood, yielding results with clean silhouettes.
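The compound operations from questions 22-26 can all be expressed through cv::morphologyEx; a hedged sketch with an example 5×5 structuring element:

```cpp
#include <opencv2/opencv.hpp>

int main() {
    cv::Mat img = cv::imread("input.png", cv::IMREAD_GRAYSCALE);  // placeholder
    cv::Mat k = cv::getStructuringElement(cv::MORPH_RECT, cv::Size(5, 5));
    cv::Mat opened, closed, grad, tophat, blackhat;
    cv::morphologyEx(img, opened,   cv::MORPH_OPEN,     k);  // erode then dilate
    cv::morphologyEx(img, closed,   cv::MORPH_CLOSE,    k);  // dilate then erode
    cv::morphologyEx(img, grad,     cv::MORPH_GRADIENT, k);  // dilated - eroded
    cv::morphologyEx(img, tophat,   cv::MORPH_TOPHAT,   k);  // img - opened
    cv::morphologyEx(img, blackhat, cv::MORPH_BLACKHAT, k);  // closed - img
    return 0;
}
```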

27. Briefly describe the flood fill method

Flood fill is a method that fills a connected region with a specified color; different fill effects are achieved by setting upper and lower limits on how much connected pixels may differ, and by choosing the connectivity mode. Flood fill is often used to mark or isolate parts of an image for further processing or analysis. Simply put, it automatically selects the region connected to a seed point and replaces that region with the specified color.
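A minimal cv::floodFill sketch; the seed point, fill color, and tolerance values are arbitrary examples:

```cpp
#include <opencv2/opencv.hpp>

int main() {
    cv::Mat img = cv::imread("input.jpg");  // placeholder path, 3-channel image
    cv::Point seed(50, 50);                 // example seed point
    cv::Rect bounds;                        // receives the filled bounding box
    cv::floodFill(img, seed,
                  cv::Scalar(0, 0, 255),    // fill color (red in BGR)
                  &bounds,
                  cv::Scalar(20, 20, 20),   // lower difference limit per channel
                  cv::Scalar(20, 20, 20));  // upper difference limit per channel
    return 0;
}
```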

28. Give a brief definition of affine transformations

An affine transformation, also known as an affine map, is, in geometry, a linear transformation of a vector space followed by a translation into another vector space. It preserves the straightness and parallelism of two-dimensional figures. Any affine transformation can be represented as multiplication by a matrix (the linear transformation) followed by the addition of a vector (the translation).
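A minimal sketch: three point correspondences define the 2×3 matrix [A | b], which cv::warpAffine then applies as y = A·x + b; all coordinates below are made-up examples:

```cpp
#include <opencv2/opencv.hpp>

int main() {
    cv::Mat img = cv::imread("input.jpg");  // placeholder
    // Three source points and where they should land (example values).
    cv::Point2f src[3] = { {0, 0}, {100, 0}, {0, 100} };
    cv::Point2f dst[3] = { {10, 10}, {110, 5}, {5, 120} };
    cv::Mat M = cv::getAffineTransform(src, dst);  // 2x3 matrix [A | b]
    cv::Mat warped;
    cv::warpAffine(img, warped, M, img.size());    // applies y = A*x + b
    return 0;
}
```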

29. Briefly describe the types of image pyramids and their differences

Generally, there are two types of image pyramids. The Gaussian pyramid is the main image pyramid, used for downsampling. The Laplacian pyramid is used to reconstruct an image from the lower levels of the pyramid (upsampling); in digital image processing it is also known as the prediction residual, it allows maximal reconstruction of the image, and it is used together with the Gaussian pyramid. In brief, the difference is: the Gaussian pyramid downsamples the image, while the Laplacian pyramid upsamples from the lower levels of the pyramid to reconstruct the image, which can be understood as the inverse of the Gaussian pyramid.
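A hedged sketch of one pyramid level, showing the Laplacian layer as the prediction residual between an image and its expanded downsample:

```cpp
#include <opencv2/opencv.hpp>

int main() {
    cv::Mat img = cv::imread("input.jpg");  // placeholder
    cv::Mat down, up;
    cv::pyrDown(img, down);                 // Gaussian pyramid: blur + downsample
    cv::pyrUp(down, up, img.size());        // expand back to the original size

    cv::Mat lap, rebuilt;
    cv::subtract(img, up, lap, cv::noArray(), CV_16S);  // residual can be negative
    cv::add(up, lap, rebuilt, cv::noArray(), CV_16S);   // up + residual == img
    rebuilt.convertTo(rebuilt, CV_8U);
    return 0;
}
```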

30. Briefly define convex hull

Given a set of points in the two-dimensional plane, the convex hull is the convex polygon formed by connecting the outermost points so that it contains all points of the set. Computing the convex hull of an object is a useful way to understand its shape or outline.
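A minimal cv::convexHull sketch over a made-up point set:

```cpp
#include <opencv2/opencv.hpp>
#include <vector>

int main() {
    std::vector<cv::Point> pts = { {10, 10}, {200, 30}, {150, 180},
                                   {40, 160}, {100, 90} };  // example points
    std::vector<cv::Point> hull;
    cv::convexHull(pts, hull);  // the outermost points, in contour order
    return 0;
}
```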

31. Briefly describe the definition and usage scenarios of backprojection

Back projection is a way of recording how well the pixels of a given image fit the pixel distribution of a histogram model. To put it simply, first compute the histogram model of a feature, then use that model to find the feature in an image. Back projection is used to find the point or region in a (usually large) input image that best matches a particular (usually small) image, i.e., to locate where the template image appears in the input image.
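A hedged sketch of hue-histogram back projection, locating a (small) template's color distribution inside a (large) scene image; the paths, bin count, and ranges are example choices:

```cpp
#include <opencv2/opencv.hpp>

int main() {
    cv::Mat templ = cv::imread("patch.jpg");   // placeholder template image
    cv::Mat scene = cv::imread("scene.jpg");   // placeholder input image
    cv::Mat templHsv, sceneHsv;
    cv::cvtColor(templ, templHsv, cv::COLOR_BGR2HSV);
    cv::cvtColor(scene, sceneHsv, cv::COLOR_BGR2HSV);

    int channels[] = {0};                      // hue channel only
    int histSize[] = {32};
    float hueRange[] = {0, 180};
    const float* ranges[] = {hueRange};

    cv::Mat hist;                              // the feature's histogram model
    cv::calcHist(&templHsv, 1, channels, cv::noArray(), hist, 1, histSize, ranges);
    cv::normalize(hist, hist, 0, 255, cv::NORM_MINMAX);

    cv::Mat backproj;                          // bright where the scene fits the model
    cv::calcBackProject(&sceneHsv, 1, channels, hist, backproj, ranges);
    return 0;
}
```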

32. Describe the principle and application scenarios of the Harris corner detection algorithm

Harris corner detection is a corner extraction algorithm that works directly on grayscale images. It is highly stable and especially accurate for L-shaped corners. However, because it uses Gaussian filtering, it is relatively slow, corner information can be lost or displaced, and the extracted corners tend to cluster.
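A minimal cv::cornerHarris sketch; the block size, aperture, and k = 0.04 are commonly used example values:

```cpp
#include <opencv2/opencv.hpp>

int main() {
    cv::Mat gray = cv::imread("input.jpg", cv::IMREAD_GRAYSCALE);  // placeholder
    cv::Mat response;
    cv::cornerHarris(gray, response,
                     2,      // neighborhood (block) size
                     3,      // Sobel aperture size
                     0.04);  // Harris free parameter k
    // Corners are local maxima of the response map above a chosen threshold.
    return 0;
}
```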

33. Briefly describe the watershed algorithm

The watershed algorithm is a region-based image segmentation method. During segmentation it uses the similarity between adjacent pixels as an important criterion, joining pixels that are close in spatial position and similar in gray value into closed contours; this closedness is an important feature of the watershed algorithm. A common processing pipeline is: convert the color image to grayscale, compute the gradient image, and finally run the watershed algorithm on the basis of the gradient image to obtain the edge lines of the segmented image.
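A hedged sketch of that pipeline; note that cv::watershed itself takes the original 3-channel image plus a CV_32S marker image, and the toy seeds below stand in for markers normally derived from thresholding and connected components:

```cpp
#include <opencv2/opencv.hpp>

int main() {
    cv::Mat img = cv::imread("input.jpg");  // placeholder, BGR image
    cv::Mat gray, grad;
    cv::cvtColor(img, gray, cv::COLOR_BGR2GRAY);  // color image to grayscale
    cv::morphologyEx(gray, grad, cv::MORPH_GRADIENT,  // gradient image
                     cv::getStructuringElement(cv::MORPH_RECT, cv::Size(3, 3)));

    // Toy seed markers; real ones usually come from thresholding `grad`
    // and labeling regions with cv::connectedComponents.
    cv::Mat markers(img.size(), CV_32S, cv::Scalar(0));
    markers.at<int>(10, 10) = 1;                        // foreground seed
    markers.at<int>(img.rows - 10, img.cols - 10) = 2;  // background seed

    cv::watershed(img, markers);  // watershed (edge) pixels become -1
    return 0;
}
```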
