First, background

There is a silver dollar lying on a sheet of paper. Can we find its coordinate equation with the help of Canny and Hough?

To determine the coordinate equation of the circle, we first detect its edges, and then work out its relative position on the paper and its radius; that is, we look for an equation of the form (x - a)² + (y - b)² = r², where (a, b) is the center and r is the radius.

In this article we use the Canny algorithm to detect the edges of the silver coin on the paper.

Second, Canny algorithm

Canny extracts the edges of objects in an image in four steps:

  1. Gaussian smoothing
  2. Calculate the image gradient (recording its magnitude and direction)
  3. Non-maximum suppression
  4. Hysteresis edge tracking

After these four steps, we obtain the edge-extraction image of the coin on the paper shown below.

(1) Gaussian smoothing

import numpy as np
import torch
import torch.nn as nn
from scipy.signal.windows import gaussian  # 1-D Gaussian window

class GaussianSmoothingNet(nn.Module):
    def __init__(self) -> None:
        super(GaussianSmoothingNet, self).__init__()

        filter_size = 5
        # Gaussian filter kernel of shape (1, 5) with std 1.0
        generated_filters = gaussian(filter_size, std=1.0).reshape([1, filter_size])

        # GFH(V): Gaussian filter in the horizontal (vertical) direction
        self.GFH = nn.Conv2d(1, 1, kernel_size=(1, filter_size), padding=(0, filter_size // 2))
        self.GFV = nn.Conv2d(1, 1, kernel_size=(filter_size, 1), padding=(filter_size // 2, 0))

        # Set the weights to the Gaussian smoothing kernel and the bias to 0.0
        init_parameter(self.GFH, generated_filters, np.array([0.0]))
        init_parameter(self.GFV, generated_filters.T, np.array([0.0]))

    def forward(self, img):
        img_r = img[:, 0:1]  # take each of the three RGB channels
        img_g = img[:, 1:2]
        img_b = img[:, 2:3]

        # Apply horizontal then vertical filtering to each channel
        blurred_img_r = self.GFV(self.GFH(img_r))
        blurred_img_g = self.GFV(self.GFH(img_g))
        blurred_img_b = self.GFV(self.GFH(img_b))

        # Merge the channels back into one image
        blurred_img = torch.stack([blurred_img_r, blurred_img_g, blurred_img_b], dim=1)
        blurred_img = torch.stack([torch.squeeze(blurred_img)])

        return blurred_img
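As a side note on the design: applying the 1×5 horizontal filter GFH followed by the 5×1 vertical filter GFV is equivalent to a single pass with a full 5×5 Gaussian kernel, because the Gaussian is separable. A quick illustrative check (not part of the original code):

# The outer product of the 1-D kernel with itself gives the equivalent 5x5 kernel,
# so filtering horizontally and then vertically equals one 2-D Gaussian filtering pass.
k1d = gaussian(5, std=1.0)   # the same 1-D kernel used above
k2d = np.outer(k1d, k1d)     # the equivalent 5x5 Gaussian kernel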

After Gaussian smoothing (blurring), the image is blurrier than the original, as shown on the right of the coin image below.

See gaussian_Smoothing for the full code
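The helper init_parameter used above is not shown in this excerpt; it is part of the full code linked above. A minimal sketch of what it presumably does, assuming it simply loads a fixed kernel and bias into a convolution layer:

# Hypothetical helper (assumed; see the linked full code for the real one):
# copy a fixed kernel and bias into a Conv2d so it acts as a hand-crafted filter.
def init_parameter(conv: nn.Conv2d, weight, bias) -> None:
    with torch.no_grad():
        conv.weight.copy_(torch.as_tensor(weight, dtype=torch.float32).reshape(conv.weight.shape))
        conv.bias.copy_(torch.as_tensor(bias, dtype=torch.float32).reshape(conv.bias.shape))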

(2) Sobel operator to calculate the gradient

PAI = 3.1415926

class SobelFilterNet(nn.Module):
    def __init__(self) -> None:
        super(SobelFilterNet, self).__init__()
        sobel_filter = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])
        self.SFH = nn.Conv2d(1, 1, kernel_size=sobel_filter.shape, padding=sobel_filter.shape[0] // 2)
        self.SFV = nn.Conv2d(1, 1, kernel_size=sobel_filter.shape, padding=sobel_filter.shape[0] // 2)

        init_parameter(self.SFH, sobel_filter, np.array([0.0]))
        init_parameter(self.SFV, sobel_filter.T, np.array([0.0]))

    def forward(self, img):
        img_r = img[:, 0:1]
        img_g = img[:, 1:2]
        img_b = img[:, 2:3]

        # SFH(V): Sobel filter in the horizontal (vertical) direction
        grad_r_x = self.SFH(img_r)  # x-gradient of the R channel
        grad_r_y = self.SFV(img_r)
        grad_g_x = self.SFH(img_g)
        grad_g_y = self.SFV(img_g)
        grad_b_x = self.SFH(img_b)
        grad_b_y = self.SFV(img_b)

        # Compute the gradient magnitude and orientation
        magnitude_r = torch.sqrt(grad_r_x**2 + grad_r_y**2)  # Gr^2 = Grx^2 + Gry^2
        magnitude_g = torch.sqrt(grad_g_x**2 + grad_g_y**2)
        magnitude_b = torch.sqrt(grad_b_x**2 + grad_b_y**2)

        grad_magnitude = magnitude_r + magnitude_g + magnitude_b

        grad_y = grad_r_y + grad_g_y + grad_b_y
        grad_x = grad_r_x + grad_g_x + grad_b_x

        # tan(theta) = grad_y / grad_x, converted to an angle (the direction angle)
        grad_orientation = torch.atan2(grad_y, grad_x) * (180.0 / PAI)
        grad_orientation = torch.round(grad_orientation / 45.0) * 45.0  # round to a multiple of 45

        return grad_magnitude, grad_orientation
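To make the orientation quantization at the end of forward concrete, here is a tiny worked example (the gradient values are made up for illustration):

# A gradient of (grad_x, grad_y) = (1.0, 0.9) points at roughly 42 degrees,
# which rounds to the nearest multiple of 45, i.e. 45 degrees.
grad_x = torch.tensor(1.0)
grad_y = torch.tensor(0.9)
angle = torch.atan2(grad_y, grad_x) * (180.0 / PAI)  # about 41.99
quantized = torch.round(angle / 45.0) * 45.0         # 45.0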

Outputting the gradient magnitude as an image gives the picture on the right. The gradient is larger in the edge region of the coin (the larger the gradient, the brighter the pixel).

See: sobel_filter for the complete code

(3) Non-maximum suppression

The process of non-maximum suppression (NMS) is as follows:

  1. Treat each point of the gradient magnitude matrix grad_magnitude as the central pixel and compare it with the two of its 8 neighbors that lie along its gradient direction: one in the same direction and one in the opposite direction.
  2. If the gradient at the central point is not larger than both of these neighbors, set it to 0.

These two steps thin the gradient ridges down to a width of one pixel while preserving the magnitude along the ridge (the local maximum of the gradient).

class NonMaxSupression(nn.Module):
    def __init__(self) -> None:
        super(NonMaxSupression, self).__init__()

        all_filters = np.stack([filter_0, filter_45, filter_90, filter_135, filter_180, filter_225, filter_270, filter_315])

        # The role of directional_filter is explained in detail below
        self.directional_filter = nn.Conv2d(1, 8, kernel_size=filter_0.shape, padding=filter_0.shape[-1] // 2)

        init_parameter(self.directional_filter, all_filters[:, None, ...], np.zeros(shape=(all_filters.shape[0],)))

    def forward(self, grad_magnitude, grad_orientation):

        all_orient_magnitude = self.directional_filter(grad_magnitude)  # difference between the gradient at the current point and at each of its 8 neighbors (a second-order difference)

        '''
              \ 3 | 2 /
            4  \  |  /  1
            ----------------
            5  /  |  \  8
              / 6 | 7 \
            Note: each sector spans 45 degrees
        '''

        positive_orient = (grad_orientation / 45) % 8        # direction class along the gradient; there are 8 possible directions
        negative_orient = ((grad_orientation / 45) + 4) % 8  # +4, i.e. 4 * 45 = 180 degrees, the opposite direction

        height = positive_orient.size()[2]                   # height and width of the image
        width = positive_orient.size()[3]
        pixel_count = height * width                         # number of pixels in the image
        pixel_offset = torch.FloatTensor([range(pixel_count)])

        position = (positive_orient.view(-1).data * pixel_count + pixel_offset).squeeze()  # direction class * number of pixels + pixel index

        # Difference between each point's gradient and its neighbor in the positive direction
        # (its sign tells whether the current point is the larger of the two)
        channel_select_filtered_positive = all_orient_magnitude.view(-1)[position.long()].view(1, height, width)

        position = (negative_orient.view(-1).data * pixel_count + pixel_offset).squeeze()

        # Difference between each point's gradient and its neighbor in the opposite direction
        channel_select_filtered_negative = all_orient_magnitude.view(-1)[position.long()].view(1, height, width)

        # Combine into two channels
        channel_select_filtered = torch.stack([channel_select_filtered_positive, channel_select_filtered_negative])

        is_max = channel_select_filtered.min(dim=0)[0] > 0.0  # if min{current - positive neighbor, current - negative neighbor} > 0, the current gradient is the maximum
        is_max = torch.unsqueeze(is_max, dim=0)

        thin_edges = grad_magnitude.clone()
        thin_edges[is_max == 0] = 0.0

        return thin_edges

What is directional_filter used for?

# Input
tensor([[[[1., 1., 1.],
          [1., 1., 1.],
          [1., 1., 1.]]]])
# Output
tensor([[[[0., 0., 1.],
          [0., 0., 1.],
          [0., 0., 1.]],

         [[0., 0., 1.],
          [0., 0., 1.],
          [1., 1., 1.]],

         [[0., 0., 0.],
          [0., 0., 0.],
          [1., 1., 1.]],

         [[1., 0., 0.],
          [1., 0., 0.],
          [1., 1., 1.]],

         [[1., 0., 0.],
          [1., 0., 0.],
          [1., 0., 0.]],

         [[1., 1., 1.],
          [1., 0., 0.],
          [1., 0., 0.]],

         [[1., 1., 1.],
          [0., 0., 0.],
          [0., 0., 0.]],

         [[1., 1., 1.],
          [0., 0., 1.],
          [0., 0., 1.]]]], grad_fn=<ThnnConv2DBackward0>)

It produces one output channel per direction; with this project's kernels, each channel holds the difference between the gradient at the current point and the gradient of its neighbor in that direction.

Based on its gradient direction, each point is assigned to one of eight direction classes (i.e., eight possible directions per point), as shown in the eight-sector diagram in the code above.
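The kernels filter_0 through filter_315 are not shown in this excerpt. Judging from the output above, they are presumably "center minus one neighbor" difference kernels; a plausible reconstruction (an assumption, see the nonmax_supression code linked below for the real definitions):

# Hypothetical directional difference kernels: each computes the current pixel
# minus its neighbor in one of the 8 directions (array coordinates, y pointing down).
filter_0   = np.array([[ 0,  0,  0], [ 0,  1, -1], [ 0,  0,  0]])  # right
filter_45  = np.array([[ 0,  0,  0], [ 0,  1,  0], [ 0,  0, -1]])  # lower right
filter_90  = np.array([[ 0,  0,  0], [ 0,  1,  0], [ 0, -1,  0]])  # below
filter_135 = np.array([[ 0,  0,  0], [ 0,  1,  0], [-1,  0,  0]])  # lower left
filter_180 = np.array([[ 0,  0,  0], [-1,  1,  0], [ 0,  0,  0]])  # left
filter_225 = np.array([[-1,  0,  0], [ 0,  1,  0], [ 0,  0,  0]])  # upper left
filter_270 = np.array([[ 0, -1,  0], [ 0,  1,  0], [ 0,  0,  0]])  # above
filter_315 = np.array([[ 0,  0, -1], [ 0,  1,  0], [ 0,  0,  0]])  # upper right

With kernels like these, an output value greater than 0 means the gradient at the current point is larger than the gradient of the neighbor in that direction, which is exactly the test the NMS code relies on.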

The procedure for looking up, for each point, the gradient difference with its neighbor in the positive direction is as follows (the opposite direction works the same way):

Direction classes in grad_orientation: 0, 1, 2, 3, 4, 5, 6, 7

all_orient_magnitude holds one channel per direction: [[..differences for direction 0..], [..differences for direction 1..], ..., [..differences for direction 7..]]

Therefore, for a point whose direction class is i, its value is located at all_orient_magnitude[i][y, x]. After flattening all_orient_magnitude into a one-dimensional vector, this becomes position = current_orient × pixel_count + pixel_offset.
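A toy example of this index computation (the numbers are chosen only for illustration):

# Suppose the image is 2 pixels high and 3 wide, so pixel_count = 6 and the
# flattened all_orient_magnitude (shape (1, 8, 2, 3)) has 48 entries, with
# channel i occupying indices [i * 6, (i + 1) * 6).
pixel_count = 2 * 3
current_orient = 3     # direction class of this pixel
pixel_offset = 4       # the pixel's own index in row-major order
position = current_orient * pixel_count + pixel_offset  # = 22, inside channel 3's block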

The following is an auxiliary illustration:

The final effect is shown on the right below (on the left is the result before non-maximum suppression).

For the complete code, see: nonmax_supression

(4) Hysteresis edge tracking

Looking at the result so far, the following problems remain:

  • If there is noise in the image, isolated points unrelated to the edge (false edges) may appear
  • The edge points are sometimes dark and sometimes bright, so the edge is not continuous

So, finally, we perform hysteresis edge tracking. The steps are as follows:

  1. Set two thresholds (one high and one low). Set to 0 the gradient magnitude of every pixel below the low threshold, obtaining image A.
  2. Set to 0 the gradient magnitude of every pixel below the high threshold, obtaining image B.

Because A uses the lower threshold, its edges are relatively complete and continuous, but it may contain more false edges; B is just the opposite.

Based on this, we can fill in the missing edge pixels of B by tracking along its existing edges and supplementing them with pixels from A.

to_bw = lambda image: (image > 0.0).astype(float)

class HysteresisThresholding(nn.Module):
    def __init__(self, low_threshold=1.0, high_threshold=3.0) -> None:
        super(HysteresisThresholding, self).__init__()
        self.low_threshold = low_threshold
        self.high_threshold = high_threshold

    def thresholding(self, low_thresh: torch.Tensor, high_thresh: torch.Tensor):
        died = torch.zeros_like(low_thresh).squeeze()  # (unused in this excerpt)
        low_thresh = low_thresh.squeeze()
        final_image = high_thresh.squeeze().clone()

        height = final_image.shape[0]
        width = final_image.shape[1]

        def trace(direction):
            if not direction: return
            for x in range(1, width - 1):
                cx = x
                if direction == 'left-top' or direction == 'left-bottom':
                    cx = width - 1 - x
                left = cx - 1
                right = cx + 1
                for y in range(1, height - 1):
                    cy = y
                    if direction == 'left-top' or direction == 'right-top':
                        cy = height - 1 - y
                    top = cy - 1
                    bottom = cy + 1
                    if final_image[cy, cx]:  # the current point is already an edge point
                        if low_thresh[top, left] > 0.0:
                            final_image[top, left] = low_thresh[top, left]          # upper left
                        if low_thresh[top, cx] > 0.0:
                            final_image[top, cx] = low_thresh[top, cx]              # top
                        if low_thresh[top, right] > 0.0:
                            final_image[top, right] = low_thresh[top, right]        # upper right

                        if low_thresh[cy, left] > 0.0:
                            final_image[cy, left] = low_thresh[cy, left]            # left
                        if low_thresh[cy, right] > 0.0:
                            final_image[cy, right] = low_thresh[cy, right]          # right

                        if low_thresh[bottom, left] > 0.0:
                            final_image[bottom, left] = low_thresh[bottom, left]    # lower left
                        if low_thresh[bottom, cx] > 0.0:
                            final_image[bottom, cx] = low_thresh[bottom, cx]        # bottom
                        if low_thresh[bottom, right] > 0.0:
                            final_image[bottom, right] = low_thresh[bottom, right]  # lower right

        trace('right-bottom')
        trace('left-top')
        trace('right-top')
        trace('left-bottom')

        final_image = final_image.unsqueeze(dim=0).unsqueeze(dim=0)

        return final_image

    def forward(self, thin_edges):
        low_thresholded: torch.Tensor = thin_edges.clone()
        low_thresholded[thin_edges < self.low_threshold] = 0.0

        high_thresholded: torch.Tensor = thin_edges.clone()
        high_thresholded[thin_edges < self.high_threshold] = 0.0

        final_thresholded = self.thresholding(low_thresholded, high_thresholded)

        return low_thresholded, high_thresholded, final_thresholded

The following figures show the effect of the low threshold and of the high threshold respectively.

The following is the result after hysteresis edge tracking.

Compared with the upper-left figure, some false edges have been eliminated; compared with the figure on the right, more detail is preserved.

See hysteresis_Thresholding for the complete code
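Putting the four stages together, the whole pipeline can be driven roughly as follows. This is a minimal sketch under the assumption that the modules above (and the helpers they rely on) are defined as shown; the real driver lives in the code linked in each section:

# Assumed driver code (not from the post): chain the four stages on a
# batch of shape (1, 3, H, W) with pixel values in the 0-255 range.
def canny(img: torch.Tensor, low=1.0, high=3.0) -> torch.Tensor:
    blurred = GaussianSmoothingNet()(img)
    grad_magnitude, grad_orientation = SobelFilterNet()(blurred)
    thin_edges = NonMaxSupression()(grad_magnitude, grad_orientation)
    _, _, final_edges = HysteresisThresholding(low, high)(thin_edges)
    return final_edges  # shape (1, 1, H, W); non-zero entries are edge pixels

# e.g. a shape check on a random "image":
edges = canny(torch.rand(1, 3, 64, 64) * 255.0)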

Third, summary

So far, we have completed edge extraction with the Canny operator. In the next post we will use the Hough transform to work out the coordinate equation of the silver coin above.