Computing Convolution Kernels
The link above leads to a completed assignment centered on image
transformations and convolution kernels/filters. It was fascinating to
learn about convolution kernels and how to compute them. I've included a
few code snippets where I think it is easy to see what is going on with
the algorithms. The function
convolve(image, kernel) takes in an image (H x W) and a kernel
(h x w), then computes each output pixel as the sum of the corresponding
sub-image multiplied elementwise by the flipped kernel. What does this
all mean? For me it was easiest to understand by visualizing; luckily
there are great resources to do so, and
here is what helped me.
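Before the full function, here is a tiny worked example of that flip-and-sum step at a single pixel (the array values are just illustrative):

```python
import numpy as np

# A 3x3 sub-image (the neighborhood around one pixel) and a 3x3 kernel.
sub = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]])
kernel = np.array([[0, 0, 0],
                   [1, 0, -1],
                   [0, 0, 0]])

# Convolution flips the kernel both ways before the elementwise product.
flipped = np.flipud(np.fliplr(kernel))   # middle row becomes [-1, 0, 1]

# The output pixel is the sum of the elementwise product:
# (-1 * 4) + (1 * 6) = 2
pixel = np.sum(sub * flipped)
print(pixel)  # 2
```

Doing this at every pixel location is exactly what the loop below implements.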
import numpy as np

def convolve(image, kernel):
    # Return the convolution result: image * kernel.
    # Input-  image: H x W
    #         kernel: h x w
    # Output- convolve: H x W
    H, W = image.shape
    h, w = kernel.shape
    pad_h = h // 2  # integer division: quotient with no remainder
    pad_w = w // 2
    # Zero-pad the image border so edge and corner pixels contribute
    # to the output the same way interior pixels do.
    padded_image = np.pad(image, ((pad_h, pad_h), (pad_w, pad_w)),
                          mode='constant', constant_values=0)
    # Start with an all-zero output image, then fill it in.
    output = np.zeros_like(image, dtype=np.float64)
    # Convolution flips the kernel in both directions.
    kernel_flipped = np.flipud(np.fliplr(kernel))
    # Loop over every output pixel.
    for i in range(H):
        for j in range(W):
            # Slice a kernel-sized window from the padded image.
            sub_image = padded_image[i:i + h, j:j + w]
            # The output pixel is the sum of the elementwise product.
            output[i, j] = np.sum(sub_image * kernel_flipped)
    return output
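As a quick sanity check on the function, here is a sketch (with convolve repeated inline so the snippet runs standalone): an identity kernel should return the image unchanged, and a 3 x 3 box kernel should average each neighborhood.

```python
import numpy as np

def convolve(image, kernel):
    # Same routine as above, repeated so this snippet is self-contained.
    H, W = image.shape
    h, w = kernel.shape
    pad_h, pad_w = h // 2, w // 2
    padded = np.pad(image, ((pad_h, pad_h), (pad_w, pad_w)),
                    mode='constant', constant_values=0)
    out = np.zeros_like(image, dtype=np.float64)
    flipped = np.flipud(np.fliplr(kernel))
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(padded[i:i + h, j:j + w] * flipped)
    return out

image = np.arange(16, dtype=np.float64).reshape(4, 4)

# Identity kernel: a single 1 at the center leaves the image unchanged.
identity = np.zeros((3, 3))
identity[1, 1] = 1
assert np.allclose(convolve(image, identity), image)

# Box blur: every output pixel is the mean of its 3x3 neighborhood
# (border pixels see the zero padding, so they come out darker).
box = np.ones((3, 3)) / 9
blurred = convolve(image, box)
print(blurred[1, 1])  # mean of image[0:3, 0:3] = 5.0
```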
Edge detection kernels are a great intuitive introduction to computing kernels with matrix transformations. The kernels below are small vectors of scalar weights that, applied to an image via convolution, approximate the image's partial derivatives. Ix and Iy are the resulting gradients in the orthogonal directions x and y. Classic edge detection then computes the gradient magnitude to determine how "strong" the change is at each point: a higher gradient magnitude means a stronger change in intensity, which is what an edge looks like.
def edge_detection(image):
    # Return the gradient magnitude of the input image.
    # Input-  image: H x W
    # Output- grad_magnitude: H x W
    # 1 x 3 kernel for the horizontal derivative (central difference, scaled by 1/2)
    kx = np.array([
        [.5, 0, -.5]
    ])
    # 3 x 1 kernel for the vertical derivative
    ky = np.array([
        [.5],
        [0],
        [-.5]
    ])
    Ix = convolve(image, kx)
    Iy = convolve(image, ky)
    grad_magnitude = np.sqrt(Ix ** 2 + Iy ** 2)
    return grad_magnitude, Ix, Iy
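To see the central-difference kernel respond to an edge without the full 2-D machinery, here is a sketch using NumPy's 1-D np.convolve (which, being a true convolution, also flips the kernel) on a synthetic vertical step edge:

```python
import numpy as np

# A vertical step edge: dark on the left, bright on the right.
image = np.zeros((5, 6))
image[:, 3:] = 1.0

kx = np.array([0.5, 0.0, -0.5])  # central-difference kernel, 1-D here

# Apply the horizontal-derivative kernel row by row.
Ix = np.array([np.convolve(row, kx, mode='same') for row in image])

# The response is 0.5 on either side of the step (columns 2 and 3)
# and zero in the flat regions; the -0.5 in the last column is the
# implicit zero-padding artifact at the image border.
print(Ix[0])
```

The vertical derivative of this image is zero everywhere, so the gradient magnitude would simply be |Ix|, peaking along the edge.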
After this comes the Sobel filter, which is essentially the same edge detector as above but with a larger kernel. The larger kernel is more robust to noise, meaning spurious, rapid changes in intensity. When learning about these kernels, it's important to understand that the numbers inside the matrices are simply scalar weights. You can see below that the values are higher toward the center row or column of each kernel; this weights the nearest neighbors more heavily and smooths perpendicular to the derivative direction, giving a more reliable gradient estimate.
def sobel_operator(image):
    # Return Gx, Gy, and the gradient magnitude.
    # Input-  image: H x W
    # Output- Gx, Gy, grad_magnitude: H x W
    Gx = convolve(image, np.array([
        [-1, 0, 1],
        [-2, 0, 2],
        [-1, 0, 1]
    ]))
    Gy = convolve(image, np.array([
        [-1, -2, -1],
        [ 0,  0,  0],
        [ 1,  2,  1]
    ]))
    grad_magnitude = np.sqrt(Gx ** 2 + Gy ** 2)
    return Gx, Gy, grad_magnitude
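One way to see that "smoothing plus derivative" structure is that the Sobel x-kernel factors into an outer product of a smoothing vector and a difference vector (a quick sketch):

```python
import numpy as np

# The Sobel x-kernel separates into a vertical smoothing vector
# (which weights the center row more) and a horizontal difference vector.
smooth = np.array([[1], [2], [1]])   # 3 x 1 smoothing weights
diff = np.array([[-1, 0, 1]])        # 1 x 3 horizontal difference

sobel_x = smooth @ diff
print(sobel_x)
# [[-1  0  1]
#  [-2  0  2]
#  [-1  0  1]]
```

This is exactly the Gx kernel above: each column is differenced horizontally while being averaged vertically, which is where the noise robustness comes from.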
The link at the top of the page includes examples of my convolution computations working on image data.