Canny Edge Detection from Scratch with PyTorch in Python
In this tutorial, we will learn about Canny edge detection, one of the most widely used edge detection filters. It is a multi-stage filter, and we will implement it using PyTorch. To start, let us first introduce what a kernel is: a small filter matrix that slides over the original image matrix and, at each position, combines the pixels under it into one pixel of the filtered image.
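
To make the idea of a kernel sliding over a matrix concrete, here is a minimal sketch in plain NumPy. The helper name `apply_kernel`, the toy 4x4 image and the 3x3 mean kernel are illustrative assumptions and are not part of the implementation in this tutorial. The loop computes a zero-padded cross-correlation, which is the same operation `nn.Conv2d` performs.

```python
import numpy as np

def apply_kernel(image, kernel):
    """Slide a square `kernel` over `image` (zero padded) and return the result."""
    k = kernel.shape[0]
    pad = k // 2
    padded = np.pad(image, pad, mode="constant")
    out = np.zeros(image.shape, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            # pixels currently covered by the kernel
            window = padded[i:i + k, j:j + k]
            out[i, j] = np.sum(window * kernel)
    return out

# toy image: dark left half, bright right half
image = np.array([[0, 0, 9, 9],
                  [0, 0, 9, 9],
                  [0, 0, 9, 9],
                  [0, 0, 9, 9]], dtype=float)

# 3x3 mean kernel: every output pixel is the average of its neighbourhood
mean_kernel = np.ones((3, 3)) / 9.0
filtered = apply_kernel(image, mean_kernel)
print(filtered)
```

The sharp jump from 0 to 9 in the input becomes a gradual ramp in `filtered`, which is the blurring effect the first stage of the Canny filter relies on.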

Now we will go through the stages used in Canny edge detection.
- Gaussian filter: This filter removes noise by blurring the image. There are many filters that blur an image, but the Gaussian filter is the most commonly used one. The kernel can be made in different sizes; the bigger the kernel, the blurrier the image will be.
import numpy as np

def get_gaussian_kernel(k=3, mu=0, sig=1, normalize=True):
    # sample k points in [-1, 1] along each axis
    gaussian_1D = np.linspace(-1, 1, k)
    x, y = np.meshgrid(gaussian_1D, gaussian_1D)
    # distance of every cell from the kernel centre
    distance = (x ** 2 + y ** 2) ** 0.5
    # 2D Gaussian of that distance
    gaussian_2D = np.exp(-(distance - mu) ** 2 / (2 * sig ** 2))
    gaussian_2D = gaussian_2D / (2 * np.pi * sig ** 2)
    # make the weights sum to 1 so that blurring keeps the overall brightness
    if normalize:
        gaussian_2D = gaussian_2D / np.sum(gaussian_2D)
    return gaussian_2D

- Sobel filter: The next step is to find the gradient of the image, which is done using this filter. Two kernels are used: one horizontal and one vertical. In the filtered images, an edge shows up as very dark or very bright pixels: the stronger the response, the more likely it is that a vertical or horizontal edge is there. Here we build the kernel used to compute the gradient.
def get_sobel_kernel(k=3):
    # coordinates centred on 0, e.g. [-1, 0, 1] for k=3
    # (named coords so that the built-in range is not shadowed)
    coords = np.linspace(-(k // 2), k // 2, k)
    x, y = np.meshgrid(coords, coords)
    sobel_2D_numerator = x
    sobel_2D_denominator = (x ** 2 + y ** 2)
    # avoid dividing by zero at the centre; x is 0 in this column anyway
    sobel_2D_denominator[:, k // 2] = 1
    sobel_2D = sobel_2D_numerator / sobel_2D_denominator
    return sobel_2D

- Non-Maximum Suppression: This step thins the edges by keeping only the pixels that are local maxima of the gradient magnitude along the gradient direction; all other pixels are removed, or suppressed. To do this, we first create a directional kernel for 0° and rotate it in 45° steps to get eight kernels, each of which compares a pixel with its neighbour in one direction. Here is the implementation that builds the rotated kernels; the local maxima themselves are found in the module further down.
import cv2

def get_thin_kernels(start=0, end=360, step=45):
    k_thin = 3
    # work on a larger kernel so that nothing is lost while rotating
    k_increased = k_thin + 2

    # get 0° angle directional kernel
    thin_kernel_0 = np.zeros((k_increased, k_increased))
    thin_kernel_0[k_increased // 2, k_increased // 2] = 1
    thin_kernel_0[k_increased // 2, k_increased // 2 + 1:] = -1

    # rotate the 0° angle directional kernel to get the other ones
    thin_kernels = []
    for angle in range(start, end, step):
        (h, w) = thin_kernel_0.shape
        center = (w // 2, h // 2)
        # apply rotation (flags must be passed by keyword; the fourth positional argument is dst)
        rotation_matrix = cv2.getRotationMatrix2D(center, angle, 1)
        kernel_angle_increased = cv2.warpAffine(thin_kernel_0, rotation_matrix, (w, h), flags=cv2.INTER_NEAREST)
        # crop back to 3x3 and keep only the exact +1 / -1 entries
        kernel_angle = kernel_angle_increased[1:-1, 1:-1]
        is_diag = (abs(kernel_angle) == 1)
        kernel_angle = kernel_angle * is_diag
        thin_kernels.append(kernel_angle)
    return thin_kernels

- Thresholds: The module supports three thresholding modes: a) Low threshold only: values greater than the threshold are set to 1 and the others are set to 0. b) Low and high thresholds: values above the high threshold are set to 1 (strong), values at or below the low threshold are set to 0, and values in between are set to 0.5 (weak). c) Low and high thresholds with hysteresis: as in b), but each weak value is then reassigned to 1 or 0 depending on its neighbours.
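
The thresholding modes can be illustrated on a tiny gradient-magnitude array. The following is only a sketch under stated assumptions: the function names `double_threshold` and `hysteresis`, the 3x3 input and the threshold values are made up for illustration, and the hysteresis rule shown here (promote a weak pixel if one of its 8 neighbours is strong, in a single pass) is a simplification; the module below uses a convolution to make that decision instead.

```python
import numpy as np

def double_threshold(magnitude, low, high):
    """Label pixels 1.0 (strong), 0.5 (weak) or 0.0 (suppressed)."""
    above_low = magnitude > low
    above_high = magnitude > high
    return above_low * 0.5 + above_high * 0.5

def hysteresis(labels):
    """Promote a weak pixel to 1.0 if any of its 8 neighbours is strong."""
    strong = labels == 1.0
    weak = labels == 0.5
    padded = np.pad(strong, 1, mode="constant")
    has_strong_neighbour = np.zeros_like(strong)
    h, w = labels.shape
    for di in range(3):
        for dj in range(3):
            if di == 1 and dj == 1:
                continue  # skip the pixel itself
            has_strong_neighbour |= padded[di:di + h, dj:dj + w]
    return (strong | (weak & has_strong_neighbour)).astype(float)

magnitude = np.array([[0.1, 0.4, 0.9],
                      [0.1, 0.1, 0.1],
                      [0.4, 0.1, 0.1]])

labels = double_threshold(magnitude, low=0.3, high=0.7)
edges = hysteresis(labels)
print(labels)
print(edges)
```

In this example the weak pixel next to the strong one is kept as an edge, while the isolated weak pixel in the bottom-left corner is discarded.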
Now we will implement all of the above concepts in one neural network module using PyTorch.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CannyFilter(nn.Module):
    def __init__(self,
                 k_gaussian=3,
                 mu=0,
                 sigma=1,
                 k_sobel=3,
                 use_cuda=False):
        super(CannyFilter, self).__init__()
        # device
        self.device = 'cuda' if use_cuda else 'cpu'

        # gaussian
        gaussian_2D = get_gaussian_kernel(k_gaussian, mu, sigma)
        self.gaussian_filter = nn.Conv2d(in_channels=1,
                                         out_channels=1,
                                         kernel_size=k_gaussian,
                                         padding=k_gaussian // 2,
                                         bias=False)
        # weights are parameters, so they are overwritten in place without autograd
        with torch.no_grad():
            self.gaussian_filter.weight[:] = torch.from_numpy(gaussian_2D)

        # sobel
        sobel_2D = get_sobel_kernel(k_sobel)
        self.sobel_filter_x = nn.Conv2d(in_channels=1,
                                        out_channels=1,
                                        kernel_size=k_sobel,
                                        padding=k_sobel // 2,
                                        bias=False)
        with torch.no_grad():
            self.sobel_filter_x.weight[:] = torch.from_numpy(sobel_2D)

        self.sobel_filter_y = nn.Conv2d(in_channels=1,
                                        out_channels=1,
                                        kernel_size=k_sobel,
                                        padding=k_sobel // 2,
                                        bias=False)
        with torch.no_grad():
            self.sobel_filter_y.weight[:] = torch.from_numpy(sobel_2D.T)

        # thin
        thin_kernels = get_thin_kernels()
        directional_kernels = np.stack(thin_kernels)
        self.directional_filter = nn.Conv2d(in_channels=1,
                                            out_channels=8,
                                            kernel_size=thin_kernels[0].shape,
                                            padding=thin_kernels[0].shape[-1] // 2,
                                            bias=False)
        with torch.no_grad():
            self.directional_filter.weight[:, 0] = torch.from_numpy(directional_kernels)

        # hysteresis
        hysteresis = np.ones((3, 3)) + 0.25
        self.hysteresis = nn.Conv2d(in_channels=1,
                                    out_channels=1,
                                    kernel_size=3,
                                    padding=1,
                                    bias=False)
        with torch.no_grad():
            self.hysteresis.weight[:] = torch.from_numpy(hysteresis)

        # keep the filters on the same device as the tensors created in forward
        self.to(self.device)

    def forward(self, img, low_threshold=None, high_threshold=None, hysteresis=False):
        # set the steps tensors
        B, C, H, W = img.shape
        blurred = torch.zeros((B, C, H, W)).to(self.device)
        grad_x = torch.zeros((B, 1, H, W)).to(self.device)
        grad_y = torch.zeros((B, 1, H, W)).to(self.device)
        grad_magnitude = torch.zeros((B, 1, H, W)).to(self.device)
        grad_orientation = torch.zeros((B, 1, H, W)).to(self.device)

        # gaussian
        for c in range(C):
            blurred[:, c:c+1] = self.gaussian_filter(img[:, c:c+1])
            grad_x = grad_x + self.sobel_filter_x(blurred[:, c:c+1])
            grad_y = grad_y + self.sobel_filter_y(blurred[:, c:c+1])

        # thick edges
        grad_x, grad_y = grad_x / C, grad_y / C
        grad_magnitude = (grad_x ** 2 + grad_y ** 2) ** 0.5
        # atan2 keeps the quadrant and avoids dividing by zero where grad_x is 0
        grad_orientation = torch.atan2(grad_y, grad_x)
        grad_orientation = grad_orientation * (180 / np.pi) + 180  # convert to degrees in [0, 360]
        grad_orientation = torch.round(grad_orientation / 45) * 45  # keep a split by 45

        # thin edges
        directional = self.directional_filter(grad_magnitude)
        # get indices of positive and negative directions
        positive_idx = (grad_orientation / 45) % 8
        negative_idx = ((grad_orientation / 45) + 4) % 8
        thin_edges = grad_magnitude.clone()
        # non maximum suppression direction by direction
        for pos_i in range(4):
            neg_i = pos_i + 4
            # get the oriented grad for the angle
            is_oriented_i = (positive_idx == pos_i) * 1
            is_oriented_i = is_oriented_i + (positive_idx == neg_i) * 1
            pos_directional = directional[:, pos_i]
            neg_directional = directional[:, neg_i]
            selected_direction = torch.stack([pos_directional, neg_directional])

            # get the local maximum pixels for the angle
            is_max = selected_direction.min(dim=0)[0] > 0.0
            is_max = torch.unsqueeze(is_max, dim=1)

            # apply non maximum suppression
            to_remove = (is_max == 0) * 1 * (is_oriented_i) > 0
            thin_edges[to_remove] = 0.0

        # thresholds
        if low_threshold is not None:
            low = thin_edges > low_threshold

            if high_threshold is not None:
                high = thin_edges > high_threshold
                # get black/gray/white only
                thin_edges = low * 0.5 + high * 0.5

                if hysteresis:
                    # get weaks and check if they are high or not
                    weak = (thin_edges == 0.5) * 1
                    weak_is_high = (self.hysteresis(thin_edges) > 1) * weak
                    thin_edges = high * 1 + weak_is_high * 1
            else:
                thin_edges = low * 1

        return blurred, grad_x, grad_y, grad_magnitude, grad_orientation, thin_edges

new = CannyFilter()
print(new)
OUTPUT:
CannyFilter(
  (gaussian_filter): Conv2d(1, 1, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
  (sobel_filter_x): Conv2d(1, 1, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
  (sobel_filter_y): Conv2d(1, 1, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
  (directional_filter): Conv2d(1, 8, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
  (hysteresis): Conv2d(1, 1, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
)
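
The printed summary only lists the layers. To see what the gradient stage computes, the following sketch applies a 3x3 Sobel pair to a synthetic image with `F.conv2d`. It is an illustration under stated assumptions, not the module's own code path: the kernel values are hard-coded (they are the values a 3x3 kernel built as x / (x² + y²) takes), the 6x6 step image is made up, and the variable names are hypothetical.

```python
import torch
import torch.nn.functional as F

# 3x3 horizontal-gradient kernel, hard-coded for this sketch
sobel_x = torch.tensor([[-0.5, 0.0, 0.5],
                        [-1.0, 0.0, 1.0],
                        [-0.5, 0.0, 0.5]])
# the vertical-gradient kernel is its transpose
sobel_y = sobel_x.t()

# synthetic 6x6 image: dark left half, bright right half (one vertical edge)
img = torch.zeros(1, 1, 6, 6)
img[..., 3:] = 1.0

# conv2d expects weights shaped (out_channels, in_channels, kH, kW)
grad_x = F.conv2d(img, sobel_x.reshape(1, 1, 3, 3), padding=1)
grad_y = F.conv2d(img, sobel_y.reshape(1, 1, 3, 3), padding=1)

# gradient magnitude and orientation in degrees
magnitude = torch.sqrt(grad_x ** 2 + grad_y ** 2)
angle_deg = torch.rad2deg(torch.atan2(grad_y, grad_x))

print(magnitude[0, 0, 2])       # one interior row of the magnitude
print(angle_deg[0, 0, 2, 2:4])  # orientation at the two edge columns
```

On an interior row the magnitude is 2 at the two columns on either side of the step and 0 in the flat regions; the non-zero value in the last column comes from the zero padding at the image border, not from a real edge. This two-pixel-wide response is what non-maximum suppression is meant to thin.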
In conclusion, we have learned in this post how to build a Canny edge detector as a neural network module using PyTorch.