Computing Convolution Kernels
The link above leads to a completed assignment centered on image
transformations and convolution kernels/filters. It was fascinating to
learn about convolution kernels and how to compute them. I've included a
few code snippets where I think it is easy to see what is going on with
the algorithms. The function
convolve(image, kernel) takes in an image (H x W) and a kernel
(h x w), then computes each output pixel as the sum of the corresponding
sub-image multiplied elementwise by the flipped kernel. What does this
all mean? For me it was easiest to understand by visualizing; luckily
there are great resources to do so, and
here is what helped me.
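Before the full function, here is a tiny worked example of that flip-and-sum step at a single pixel (the array values are just illustrative):

```python
import numpy as np

# A 3x3 sub-image (the neighborhood around one pixel) and a 3x3 kernel.
sub = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]])
kernel = np.array([[0, 0, 0],
                   [1, 0, -1],
                   [0, 0, 0]])

# Convolution flips the kernel both ways before the elementwise product.
flipped = np.flipud(np.fliplr(kernel))   # middle row becomes [-1, 0, 1]

# The output pixel is the sum of the elementwise product:
# (-1 * 4) + (1 * 6) = 2
pixel = np.sum(sub * flipped)
print(pixel)  # 2
```

Doing this at every pixel location is exactly what the loop below implements.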
import numpy as np

def convolve(image, kernel):
    # Return the convolution result: image * kernel.
    # Input-  image: H x W
    #         kernel: h x w
    # Output- convolve: H x W
    H, W = image.shape
    h, w = kernel.shape
    pad_h = h // 2  # integer division: quotient with no remainder
    pad_w = w // 2
    # Zero-pad the image border so edge and corner pixels contribute
    # to the output the same way interior pixels do.
    padded_image = np.pad(image, ((pad_h, pad_h), (pad_w, pad_w)),
                          mode='constant', constant_values=0)
    # Start with an all-zero output image, then fill it in.
    output = np.zeros_like(image, dtype=np.float64)
    # Convolution flips the kernel in both directions.
    kernel_flipped = np.flipud(np.fliplr(kernel))
    # Loop over every output pixel.
    for i in range(H):
        for j in range(W):
            # Slice a kernel-sized window from the padded image.
            sub_image = padded_image[i:i + h, j:j + w]
            # The output pixel is the sum of the elementwise product.
            output[i, j] = np.sum(sub_image * kernel_flipped)
    return output
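As a quick sanity check on the function, here is a sketch (with convolve repeated inline so the snippet runs standalone): an identity kernel should return the image unchanged, and a 3 x 3 box kernel should average each neighborhood.

```python
import numpy as np

def convolve(image, kernel):
    # Same routine as above, repeated so this snippet is self-contained.
    H, W = image.shape
    h, w = kernel.shape
    pad_h, pad_w = h // 2, w // 2
    padded = np.pad(image, ((pad_h, pad_h), (pad_w, pad_w)),
                    mode='constant', constant_values=0)
    out = np.zeros_like(image, dtype=np.float64)
    flipped = np.flipud(np.fliplr(kernel))
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(padded[i:i + h, j:j + w] * flipped)
    return out

image = np.arange(16, dtype=np.float64).reshape(4, 4)

# Identity kernel: a single 1 at the center leaves the image unchanged.
identity = np.zeros((3, 3))
identity[1, 1] = 1
assert np.allclose(convolve(image, identity), image)

# Box blur: every output pixel is the mean of its 3x3 neighborhood
# (border pixels see the zero padding, so they come out darker).
box = np.ones((3, 3)) / 9
blurred = convolve(image, box)
print(blurred[1, 1])  # mean of image[0:3, 0:3] = 5.0
```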
Edge detection kernels are a great intuitive introduction to computing kernels with matrix transformations. The kernels below are small vectors of scalar weights that, applied to an image via convolution, approximate the image's partial derivatives. Ix and Iy are the resulting gradients in the orthogonal directions x and y. Classic edge detection then computes the gradient magnitude to determine how "strong" the change is at each point: a higher gradient magnitude means a stronger change in intensity, which is what an edge looks like.
def edge_detection(image):
    # Return the gradient magnitude of the input image.
    # Input-  image: H x W
    # Output- grad_magnitude: H x W
    # 1 x 3 kernel for the horizontal derivative (central difference, scaled by 1/2)
    kx = np.array([
        [.5, 0, -.5]
    ])
    # 3 x 1 kernel for the vertical derivative
    ky = np.array([
        [.5],
        [0],
        [-.5]
    ])
    Ix = convolve(image, kx)
    Iy = convolve(image, ky)
    grad_magnitude = np.sqrt(Ix ** 2 + Iy ** 2)
    return grad_magnitude, Ix, Iy
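To see the central-difference kernel respond to an edge without the full 2-D machinery, here is a sketch using NumPy's 1-D np.convolve (which, being a true convolution, also flips the kernel) on a synthetic vertical step edge:

```python
import numpy as np

# A vertical step edge: dark on the left, bright on the right.
image = np.zeros((5, 6))
image[:, 3:] = 1.0

kx = np.array([0.5, 0.0, -0.5])  # central-difference kernel, 1-D here

# Apply the horizontal-derivative kernel row by row.
Ix = np.array([np.convolve(row, kx, mode='same') for row in image])

# The response is 0.5 on either side of the step (columns 2 and 3)
# and zero in the flat regions; the -0.5 in the last column is the
# implicit zero-padding artifact at the image border.
print(Ix[0])
```

The vertical derivative of this image is zero everywhere, so the gradient magnitude would simply be |Ix|, peaking along the edge.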
After this comes the Sobel filter, which is essentially the same edge detector as above but with a larger kernel. The larger kernel is more robust to noise, meaning spurious, rapid changes in intensity. When learning about these kernels, it's important to understand that the numbers inside the matrices are simply scalar weights. You can see below that the values are higher toward the center row or column of each kernel; this weights the nearest neighbors more heavily and smooths perpendicular to the derivative direction, giving a more reliable gradient estimate.
def sobel_operator(image):
    # Return Gx, Gy, and the gradient magnitude.
    # Input-  image: H x W
    # Output- Gx, Gy, grad_magnitude: H x W
    Gx = convolve(image, np.array([
        [-1, 0, 1],
        [-2, 0, 2],
        [-1, 0, 1]
    ]))
    Gy = convolve(image, np.array([
        [-1, -2, -1],
        [ 0,  0,  0],
        [ 1,  2,  1]
    ]))
    grad_magnitude = np.sqrt(Gx ** 2 + Gy ** 2)
    return Gx, Gy, grad_magnitude
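One way to see that "smoothing plus derivative" structure is that the Sobel x-kernel factors into an outer product of a smoothing vector and a difference vector (a quick sketch):

```python
import numpy as np

# The Sobel x-kernel separates into a vertical smoothing vector
# (which weights the center row more) and a horizontal difference vector.
smooth = np.array([[1], [2], [1]])   # 3 x 1 smoothing weights
diff = np.array([[-1, 0, 1]])        # 1 x 3 horizontal difference

sobel_x = smooth @ diff
print(sobel_x)
# [[-1  0  1]
#  [-2  0  2]
#  [-1  0  1]]
```

This is exactly the Gx kernel above: each column is differenced horizontally while being averaged vertically, which is where the noise robustness comes from.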
The link at the top of the page includes examples of my convolution computations working on image data.