Canny Edge Detection from Scratch with PyTorch in Python
In this tutorial, we will learn about Canny edge detection, one of the most widely used edge detection algorithms. It is a multi-stage filter, and we will implement it using PyTorch. To start, let us first introduce what a kernel is. It is a small matrix of weights that slides over the image matrix and, at each position, combines the pixels under it into a single output value. The result can be taken as a filtered image.
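To make the idea of a sliding kernel concrete, here is a minimal NumPy sketch. The helper name `slide_kernel`, the 4x4 test matrix and the 3x3 averaging kernel are illustrative assumptions, not part of the tutorial's implementation, which uses `nn.Conv2d` instead.

```python
import numpy as np

def slide_kernel(image, kernel):
    """Slide `kernel` over `image` (no padding) and sum the element-wise products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # multiply the window under the kernel by the weights and add everything up
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.arange(16, dtype=float).reshape(4, 4)   # values 0..15
box_kernel = np.full((3, 3), 1 / 9)                # 3x3 averaging kernel
filtered = slide_kernel(image, box_kernel)
print(filtered)   # approximately [[5, 6], [9, 10]]
```

Each output value is the mean of the nine pixels under the kernel, so the 4x4 input shrinks to 2x2 when no padding is used.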
Canny edge detection chains several filters. We will go through them one by one.
- Gaussian filter: This filter removes noise from the image by blurring it. There are many filters that blur an image, but the Gaussian filter is the most commonly used one. The kernel can be built in different sizes. The larger the kernel, the blurrier the image will be.
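import numpy as np

def get_gaussian_kernel(k=3, mu=0, sigma=1, normalize=True):
    # sample k points in [-1, 1] along each axis
    gaussian_1D = np.linspace(-1, 1, k)
    # build the 2D grid and the distance of each cell from the center
    x, y = np.meshgrid(gaussian_1D, gaussian_1D)
    distance = (x ** 2 + y ** 2) ** 0.5
    # evaluate the Gaussian on the grid
    gaussian_2D = np.exp(-(distance - mu) ** 2 / (2 * sigma ** 2))
    gaussian_2D = gaussian_2D / (2 * np.pi * sigma ** 2)
    # scale the weights so that they sum to 1
    if normalize:
        gaussian_2D = gaussian_2D / np.sum(gaussian_2D)
    return gaussian_2D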
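The effect of the kernel size on the blur can be checked numerically. The sketch below is a NumPy-only illustration under a few assumptions: the helper names `gaussian_kernel` and `blur` are hypothetical, the kernel is sampled at integer pixel offsets (not on the [-1, 1] grid used above), and the test image is synthetic noise around a flat gray level.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def gaussian_kernel(k=5, sigma=1.0):
    """Normalized k x k Gaussian kernel sampled at integer pixel offsets."""
    offsets = np.arange(k) - k // 2
    x, y = np.meshgrid(offsets, offsets)
    kernel = np.exp(-(x ** 2 + y ** 2) / (2 * sigma ** 2))
    return kernel / kernel.sum()

def blur(image, kernel):
    """Filter `image` with `kernel`, keeping only the fully covered pixels."""
    windows = sliding_window_view(image, kernel.shape)
    return np.einsum("ijkl,kl->ij", windows, kernel)

rng = np.random.default_rng(0)
noisy = 0.5 + 0.1 * rng.standard_normal((64, 64))   # flat gray image plus noise

small = blur(noisy, gaussian_kernel(k=3, sigma=1.0))
large = blur(noisy, gaussian_kernel(k=7, sigma=2.0))

# the spread of the pixel values shrinks as the kernel grows
print(noisy.std(), small.std(), large.std())
```

Because every output pixel is a weighted average of its neighbourhood, the noise is averaged out more strongly when the kernel covers more pixels.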
- Sobel filter: The next step is to find the gradient of the image, which is done using this filter. It uses two kernels: a horizontal one and a vertical one. An edge appears where the intensity changes sharply between dark pixels and bright pixels. The stronger the filter response, the more likely it is that a horizontal or vertical edge is there. Here we build the kernel that is used to compute the gradient.
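def get_sobel_kernel(k=3):
    # coordinates centered on the kernel, e.g. [-1, 0, 1] for k=3
    # (named coords so that the built-in range is not shadowed)
    coords = np.linspace(-(k // 2), k // 2, k)
    x, y = np.meshgrid(coords, coords)
    # weight each cell by x / (x^2 + y^2)
    sobel_2D_numerator = x
    sobel_2D_denominator = (x ** 2 + y ** 2)
    # the center column has x = 0; set its denominator to 1 to avoid 0 / 0
    sobel_2D_denominator[:, k // 2] = 1
    sobel_2D = sobel_2D_numerator / sobel_2D_denominator
    return sobel_2D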
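As a small worked example of the gradient step, the sketch below applies the classic 3x3 Sobel kernels to a synthetic image with one vertical edge. For k=3, `get_sobel_kernel` returns this classic kernel divided by 2. The helper `correlate_valid` and the test image are assumptions made for illustration.

```python
import numpy as np

def correlate_valid(image, kernel):
    """Slide `kernel` over `image` without padding (cross-correlation)."""
    kh, kw = kernel.shape
    out = np.zeros((image.shape[0] - kh + 1, image.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# classic Sobel kernels: sobel_x responds to left-to-right intensity changes,
# sobel_y (its transpose) to top-to-bottom changes
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)
sobel_y = sobel_x.T

# 5x5 test image: dark on the left, bright on the right -> one vertical edge
image = np.zeros((5, 5))
image[:, 3:] = 1.0

grad_x = correlate_valid(image, sobel_x)
grad_y = correlate_valid(image, sobel_y)
magnitude = np.hypot(grad_x, grad_y)

print(grad_x)   # each row is [0, 4, 4]
print(grad_y)   # all zeros: nothing changes from top to bottom
```

Only the horizontal kernel responds, and it responds exactly where the window straddles the jump from dark to bright.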
- Non-Maximum Suppression: This step thins the edges by keeping only the pixels that are local maxima of the gradient magnitude along the gradient direction. To do this, we first create a directional kernel for 0° and rotate it in steps of 45° to obtain eight directional filters. Each pixel is compared with its neighbours along its gradient direction. The local maxima are kept, and all the other pixels are removed, or suppressed as we can say. Here is the implementation of the rotated filters, which the module further below uses to find the local maxima.
import cv2

def get_thin_kernels(start=0, end=360, step=45):
    k_thin = 3  # size of the final directional kernels
    # work on a larger kernel so that the rotation does not clip the diagonals
    k_increased = k_thin + 2

    # get 0° angle directional kernel
    thin_kernel_0 = np.zeros((k_increased, k_increased))
    thin_kernel_0[k_increased // 2, k_increased // 2] = 1
    thin_kernel_0[k_increased // 2, k_increased // 2 + 1:] = -1

    # rotate the 0° angle directional kernel to get the other ones
    thin_kernels = []
    for angle in range(start, end, step):
        (h, w) = thin_kernel_0.shape
        # rotate around the kernel center
        center = (w // 2, h // 2)

        # apply rotation; the interpolation mode must be passed as `flags`,
        # because the fourth positional argument of warpAffine is `dst`
        rotation_matrix = cv2.getRotationMatrix2D(center, angle, 1)
        kernel_angle_increased = cv2.warpAffine(thin_kernel_0, rotation_matrix, (w, h), flags=cv2.INTER_NEAREST)

        # crop back to the 3x3 kernel and keep only the exact +1 / -1 entries
        kernel_angle = kernel_angle_increased[1:-1, 1:-1]
        is_diag = (abs(kernel_angle) == 1)
        kernel_angle = kernel_angle * is_diag
        thin_kernels.append(kernel_angle)
    return thin_kernels
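To see what suppression does to a thick edge, here is a simplified NumPy sketch that handles only one direction: it assumes every gradient points along the x axis, so each pixel is compared with its left and right neighbours. The function name `suppress_horizontal` and the sample values are hypothetical; the tutorial's module covers all eight directions with the rotated kernels.

```python
import numpy as np

def suppress_horizontal(magnitude):
    """Keep a pixel only if it is strictly larger than its left and right neighbours."""
    # pad one zero column on each side so border pixels also have two neighbours
    padded = np.pad(magnitude, ((0, 0), (1, 1)), mode="constant")
    left = padded[:, :-2]
    right = padded[:, 2:]
    is_max = (magnitude > left) & (magnitude > right)
    return np.where(is_max, magnitude, 0.0)

# a thick vertical edge: the response rises to a ridge and falls again along each row
magnitude = np.array([[0.0, 1.0, 3.0, 2.0, 0.0],
                      [0.0, 2.0, 4.0, 1.0, 0.0]])
thin = suppress_horizontal(magnitude)
print(thin)
# [[0. 0. 3. 0. 0.]
#  [0. 0. 4. 0. 0.]]
```

Only the ridge of the response survives, which turns an edge several pixels wide into a one-pixel line.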
- Thresholds: We can apply the thresholds in three ways:
  a) Low threshold only: the values greater than the threshold are set to 1 and the others are set to 0.
  b) Low-weak-high thresholds: the values above the high threshold are set to 1, the values below the low threshold are set to 0, and the values in between (the weak edges) are set to 0.5.
  c) Low-weak-high thresholds with hysteresis: the same as b), but each weak value is then assigned to high or low by the hysteresis step, which looks at the neighbouring pixels.
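The three variants can be sketched in a few lines of NumPy. This is an illustrative single-pass version under stated assumptions: the function names `double_threshold` and `hysteresis` and the sample values are hypothetical, and a weak pixel is promoted when any of its 8 neighbours is strong, which is a simpler rule than the convolution-based check used in the module.

```python
import numpy as np

def double_threshold(edges, low, high):
    """Return 1.0 for strong pixels, 0.5 for weak ones and 0.0 for the rest."""
    strong = edges > high
    weak = (edges > low) & ~strong
    return strong * 1.0 + weak * 0.5

def hysteresis(levels):
    """Promote a weak pixel to strong when any of its 8 neighbours is strong."""
    strong = levels == 1.0
    weak = levels == 0.5
    h, w = levels.shape
    padded = np.pad(strong, 1, mode="constant")   # pads with False
    has_strong_neighbour = np.zeros_like(strong)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue
            # shifted view of the strong mask for this neighbour offset
            has_strong_neighbour |= padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
    return (strong | (weak & has_strong_neighbour)).astype(float)

edges = np.array([[0.9, 0.5, 0.1, 0.0, 0.5]])
levels = double_threshold(edges, low=0.3, high=0.7)
final = hysteresis(levels)
print(levels)   # [[1.  0.5 0.  0.  0.5]]
print(final)    # [[1. 1. 0. 0. 0.]]
```

The weak pixel next to the strong one is kept, while the isolated weak pixel at the end is dropped.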
Now we will implement all of the above concepts in one neural network module using PyTorch.
import torch
import torch.nn as nn

class CannyFilter(nn.Module):
    def __init__(self, k_gaussian=3, mu=0, sigma=1, k_sobel=3, use_cuda=False):
        super(CannyFilter, self).__init__()
        # device
        self.device = 'cuda' if use_cuda else 'cpu'

        # gaussian
        gaussian_2D = get_gaussian_kernel(k_gaussian, mu, sigma)
        self.gaussian_filter = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=k_gaussian, padding=k_gaussian // 2, bias=False)

        # sobel
        sobel_2D = get_sobel_kernel(k_sobel)
        self.sobel_filter_x = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=k_sobel, padding=k_sobel // 2, bias=False)
        self.sobel_filter_y = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=k_sobel, padding=k_sobel // 2, bias=False)

        # thin
        thin_kernels = get_thin_kernels()
        directional_kernels = np.stack(thin_kernels)
        self.directional_filter = nn.Conv2d(in_channels=1, out_channels=8, kernel_size=thin_kernels[0].shape, padding=thin_kernels[0].shape[-1] // 2, bias=False)

        # hysteresis
        hysteresis = np.ones((3, 3)) + 0.25
        self.hysteresis = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=3, padding=1, bias=False)

        # copy the fixed kernels into the conv weights; the in-place writes need
        # no_grad because the weights are leaf parameters that require grad
        with torch.no_grad():
            self.gaussian_filter.weight[:] = torch.from_numpy(gaussian_2D)
            self.sobel_filter_x.weight[:] = torch.from_numpy(sobel_2D)
            self.sobel_filter_y.weight[:] = torch.from_numpy(sobel_2D.T)
            self.directional_filter.weight[:, 0] = torch.from_numpy(directional_kernels)
            self.hysteresis.weight[:] = torch.from_numpy(hysteresis)

        # move the filters to the same device as the work tensors in forward
        self.to(self.device)

    def forward(self, img, low_threshold=None, high_threshold=None, hysteresis=False):
        # allocate the tensors for the intermediate steps
        B, C, H, W = img.shape
        blurred = torch.zeros((B, C, H, W)).to(self.device)
        grad_x = torch.zeros((B, 1, H, W)).to(self.device)
        grad_y = torch.zeros((B, 1, H, W)).to(self.device)

        # gaussian blur and sobel gradients, channel by channel
        for c in range(C):
            blurred[:, c:c+1] = self.gaussian_filter(img[:, c:c+1])
            grad_x = grad_x + self.sobel_filter_x(blurred[:, c:c+1])
            grad_y = grad_y + self.sobel_filter_y(blurred[:, c:c+1])

        # thick edges
        grad_x, grad_y = grad_x / C, grad_y / C
        grad_magnitude = (grad_x ** 2 + grad_y ** 2) ** 0.5
        # atan2 handles grad_x == 0 and covers the full circle; grad_y is negated
        # because image rows grow downward while the directional kernels are
        # rotated counter-clockwise
        grad_orientation = torch.atan2(-grad_y, grad_x)
        grad_orientation = grad_orientation * (180 / np.pi) + 180  # convert to degrees in [0, 360]
        grad_orientation = torch.round(grad_orientation / 45) * 45  # keep a split by 45

        # thin edges
        directional = self.directional_filter(grad_magnitude)
        # get the direction index (0 to 7) of each pixel
        positive_idx = (grad_orientation / 45) % 8
        thin_edges = grad_magnitude.clone()
        # non maximum suppression direction by direction
        for pos_i in range(4):
            neg_i = pos_i + 4
            # get the oriented grad for the angle
            is_oriented_i = (positive_idx == pos_i) * 1
            is_oriented_i = is_oriented_i + (positive_idx == neg_i) * 1
            pos_directional = directional[:, pos_i]
            neg_directional = directional[:, neg_i]
            selected_direction = torch.stack([pos_directional, neg_directional])

            # get the local maximum pixels for the angle
            is_max = selected_direction.min(dim=0)[0] > 0.0
            is_max = torch.unsqueeze(is_max, dim=1)

            # apply non maximum suppression
            to_remove = (is_max == 0) * 1 * (is_oriented_i) > 0
            thin_edges[to_remove] = 0.0

        # thresholds
        if low_threshold is not None:
            low = thin_edges > low_threshold
            if high_threshold is not None:
                high = thin_edges > high_threshold
                # get black/gray/white only
                thin_edges = low * 0.5 + high * 0.5
                if hysteresis:
                    # get weaks and check if they are high or not
                    weak = (thin_edges == 0.5) * 1
                    weak_is_high = (self.hysteresis(thin_edges) > 1) * weak
                    thin_edges = high * 1 + weak_is_high * 1
            else:
                thin_edges = low * 1

        return blurred, grad_x, grad_y, grad_magnitude, grad_orientation, thin_edges
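The orientation step of the forward pass snaps every gradient direction to a multiple of 45° and pairs each direction with its opposite. A simplified NumPy sketch of that bookkeeping is given below. The function name `direction_index` and the sample gradients are assumptions, and the sketch leaves out the extra 180° offset that the module adds before rounding.

```python
import numpy as np

def direction_index(grad_x, grad_y):
    """Map a gradient vector to one of 8 direction bins spaced 45 degrees apart."""
    angle = np.degrees(np.arctan2(grad_y, grad_x))   # in [-180, 180]
    angle = np.round(angle / 45.0) * 45.0            # snap to the nearest multiple of 45
    return (angle / 45.0).astype(int) % 8            # bins 0..7

grad_x = np.array([1.0, 1.0, 0.0, -1.0, 0.0])
grad_y = np.array([0.0, 1.0, 1.0, 0.0, -1.0])

idx = direction_index(grad_x, grad_y)
opposite = (idx + 4) % 8   # the bin that points the other way

print(idx)        # [0 1 2 4 6]
print(opposite)   # [4 5 6 0 2]
```

Bins `i` and `i + 4` describe the same line through the pixel, which is why the loop in the forward pass only needs to run over four direction pairs.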
new = CannyFilter()
print(new)
OUTPUT:
CannyFilter(
  (gaussian_filter): Conv2d(1, 1, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
  (sobel_filter_x): Conv2d(1, 1, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
  (sobel_filter_y): Conv2d(1, 1, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
  (directional_filter): Conv2d(1, 8, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
  (hysteresis): Conv2d(1, 1, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
)
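In conclusion, we have learned in the above post how to implement the Canny edge detector as a PyTorch module built from fixed convolution kernels: Gaussian blur, Sobel gradients, non-maximum suppression and thresholding with hysteresis.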