Pipe Blend

Introduction

Here we explore some of the basics of image processing. We begin with the finite difference operator to perform edge detection. Afterwards we look at Gaussian kernels. Using convolution, we combine our finite difference operator with a Gaussian kernel to create a DoG (Derivative of Gaussian) kernel, which achieves more efficient edge detection. We then explore image frequencies through our discoveries with image filters. Using Gaussian kernels, we extract the high-frequency components of an image. Able to decompose images into their low frequencies, high frequencies, and even arbitrary frequency bands, we learn how to sharpen images, create hybrid images, and blend images using multi-resolution blending, as done in Burt and Adelson's A Multiresolution Spline With Application to Image Mosaics and Oliva's Hybrid Images.

Fun with Filters

Finite Difference Operator

Approach

In this section, we look at the finite difference operator to derive edge detection. The finite difference operators Dx and Dy will be used to compute the partial derivatives of the image with respect to the x and y axes, respectively. We accomplish this by convolving the image with these operators using the convolve2d function from the scipy.signal library. The result is two images that represent the rate of change of intensity in the horizontal and vertical directions. These derivative images highlight regions in the image where significant changes in intensity occur, such as edges and boundaries.
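A minimal sketch of this step, assuming a grayscale float image (the helper name is ours):

```python
import numpy as np
from scipy.signal import convolve2d

# Finite difference operators
Dx = np.array([[1, -1]])
Dy = np.array([[1], [-1]])

def partial_derivatives(im):
    """Approximate the partial derivatives of a grayscale image
    by convolving with the finite difference operators."""
    dx = convolve2d(im, Dx, mode="same", boundary="symm")
    dy = convolve2d(im, Dy, mode="same", boundary="symm")
    return dx, dy
```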

After obtaining the derivative images, we will compute the gradient magnitude image, which gives an overall measure of the strength of edges at each pixel by combining the horizontal and vertical gradients. This gradient magnitude image is then binarized to create a clear edge map. Binarization involves choosing an appropriate threshold value that balances the trade-off between noise suppression and edge preservation. This threshold is selected qualitatively by visual inspection, iteratively adjusting it until an optimal balance is found. The final output will be a binary image that clearly delineates the edges in the original image, highlighting significant transitions while minimizing the impact of noise.

Results

cameraman.png
Original Image

Here we capture the difference in pixel luminance along the vertical and horizontal axes. Computing the gradient magnitude of the image, we see the difference in intensity as we move horizontally and vertically across the image. This tells us where the edges of objects in the image are, hence the name edge detection.

The gradient magnitude can be computed using:

\[ ||\nabla I|| = \sqrt{\left( \frac{\partial I}{\partial x} \right)^2 + \left( \frac{\partial I}{\partial y} \right)^2} \]
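In code, the magnitude and thresholding steps might look like the following (the threshold is chosen by visual inspection, and the helper name is ours):

```python
import numpy as np

def edge_map(dx, dy, threshold):
    """Combine the x and y derivative images into a gradient
    magnitude image, then binarize with a hand-tuned threshold."""
    magnitude = np.sqrt(dx**2 + dy**2)
    return magnitude > threshold
```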

Derivative of Gaussian Filter

Approach

Building on the finite difference method, we introduce the Derivative of Gaussian (DoG) filter, which combines the noise-reducing properties of Gaussian smoothing with the edge-detection capabilities of the finite difference operator. The process begins by applying a Gaussian filter to the image, which effectively blurs it, reducing the high-frequency noise that can interfere with edge detection. The smoothed image is then processed in the same way as before—by computing its x and y derivatives using the finite difference operators Dx and Dy. This approach produces a cleaner edge map with less noise compared to the simple finite difference method.

However, we can do this more efficiently thanks to the power of convolution. Rather than doing two passes, low-passing and then differentiating, we can generate DoG filters by directly convolving the Gaussian filter with the finite difference operators Dx and Dy. This gives a single convolution operation that achieves the same outcome as applying the Gaussian filter followed by the finite difference operator. Both methods produce equivalent results thanks to the associativity of convolution.
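The equivalence follows from the associativity of convolution; here is a minimal sketch (kernel size and sigma are our own choices, assuming a grayscale float image):

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_kernel(size, sigma):
    """2D Gaussian built as the outer product of a 1D Gaussian."""
    ax = np.arange(size) - (size - 1) / 2
    g = np.exp(-ax**2 / (2 * sigma**2))
    g /= g.sum()
    return np.outer(g, g)

Dx = np.array([[1, -1]])
G = gaussian_kernel(9, 1.5)

# Single DoG filter: convolve the Gaussian with the finite difference operator
DoG_x = convolve2d(G, Dx)

# Associativity: (im * G) * Dx == im * (G * Dx)
im = np.random.rand(32, 32)
two_pass = convolve2d(convolve2d(im, G), Dx)
one_pass = convolve2d(im, DoG_x)
assert np.allclose(two_pass, one_pass)
```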

Results

Here is a visualization of what our kernels look like:

Gaussian Kernel.png
Gaussian Kernel

We begin by convolving our lovely Gaussian kernel with the existing finite difference operators to create DoG kernels, combining the Gaussian filter and the finite difference operator into a single convolution. The resulting DoG kernels are then applied to the original image to obtain the derivative images.

After convolving our finite difference kernel with our Gaussian kernel, we can convolve our cameraman image in a single pass and see that the blur from the Gaussian kernel has removed the high-frequency components of the image, eliminating that pesky noise. The result is cleaner edge detection. Notice that both methods, smoothing then differentiating versus convolving directly with the DoG, produce equivalent results.

Fun with Frequencies

Image Sharpening

Approach

Image sharpening is an image processing technique that brings out the clarity and detail of an image by emphasizing its high-frequency components. The process begins by filtering out the low frequencies of an image using a Gaussian filter. The high-frequency details of the image are then isolated by subtracting the low-frequency data from the original image. By adding these high frequencies back into the original image, we emphasize the high-frequency components, making edges and fine details more prominent.

By increasing the size and sigma of the Gaussian Kernel, we can filter out a greater portion of the high-frequency data, resulting in the isolation of more high-frequency components after the subtraction. This allows us to control the amount of sharpening applied to the image.
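The procedure described above (often called unsharp masking) can be sketched as follows, assuming a grayscale float image (the helper names and the alpha parameter are ours):

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_kernel(size, sigma):
    """2D Gaussian built as the outer product of a 1D Gaussian."""
    ax = np.arange(size) - (size - 1) / 2
    g = np.exp(-ax**2 / (2 * sigma**2))
    g /= g.sum()
    return np.outer(g, g)

def sharpen(im, kernel, alpha=1.0):
    """Isolate high frequencies by subtracting the blurred image,
    then add them back scaled by alpha to control sharpening."""
    low = convolve2d(im, kernel, mode="same", boundary="symm")
    high = im - low
    return im + alpha * high
```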

Results

By computing the high-frequency components of the image and adding them back to the original image, we are able to enhance the sharpness of the image.

Here we find some interesting results:

We also find that recursively sharpening an image makes it deep fried.

Deep Fried Taj Mahal

Hybrid Images

Approach

Hybrid images are an interesting visual phenomenon where different interpretations of an image emerge depending on the viewing distance. The goal of this section is to create such hybrid images as described in Oliva 2006, by combining the high-frequency details of one image with the low-frequency content of another.

We begin by picking two images that we want to merge. Before working with any filters, it is important that the images are visually aligned, as misalignment can disrupt the intended perception of the hybrid image. Once aligned, however, we extract the low frequencies of one image and the high frequencies of the other. This is done using a Gaussian kernel and the methodology described above. These filtered images are then combined, typically by adding or averaging them, to produce the final hybrid image.
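The steps above can be sketched as follows, assuming the two images are already aligned grayscale float arrays of the same shape (the helper names and the kernel-size heuristic are ours):

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_kernel(size, sigma):
    """2D Gaussian built as the outer product of a 1D Gaussian."""
    ax = np.arange(size) - (size - 1) / 2
    g = np.exp(-ax**2 / (2 * sigma**2))
    g /= g.sum()
    return np.outer(g, g)

def hybrid(im_low, im_high, sigma_low, sigma_high):
    """Low frequencies of im_low plus high frequencies of im_high."""
    # Kernel size ~6 sigma, forced odd
    k_low = gaussian_kernel(int(6 * sigma_low) | 1, sigma_low)
    k_high = gaussian_kernel(int(6 * sigma_high) | 1, sigma_high)
    low = convolve2d(im_low, k_low, mode="same", boundary="symm")
    high = im_high - convolve2d(im_high, k_high, mode="same", boundary="symm")
    return low + high
```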

Results

Here we select Nutmeg's high frequencies and combine them with the low frequencies of Derek's image. This results in a neat hybrid image which up close looks like Nutmeg and far away looks like Derek.

Scary

Swapping the frequencies we choose for Derek and Nutmeg, we get an uncanny result.

Here we experiment with some different images:

Failed Hybrid

The choice of cutoff frequencies for the filters requires experimentation to achieve the desired effect. Here, the pixel art rat lacks a lot of high-frequency data, and when pairing it with the low-frequency data from the pipe strip, we get a strong visual clash between the different frequencies.

Swapping the frequencies, however, results in a better-looking image. It is interesting to observe how the pixel art of the rat is reflected in the frequency domain.

Here is me one-upping Derek and Nutmeg.

Me and Sirius

Implementing A Multi-resolution Spline With Application to Image Mosaics

Gaussian and Laplacian Stacks

Approach

In this section, we explore Gaussian and Laplacian stacks, the key to Burt and Adelson's 1983 paper. Unlike pyramids, where each level is downsampled, stacks keep a fixed resolution. To create a Gaussian stack, we simply take an input image and recursively convolve it with a Gaussian kernel, saving each convolution result to the corresponding level of the stack. The Laplacian stack is then derived by subtracting each level of the Gaussian stack from the level above it, effectively isolating the band-pass information at each level.
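A minimal sketch of both stacks, assuming a grayscale float image (the helper names, kernel size, and sigma are our own choices); keeping the final blurred level as the residual lets the Laplacian stack sum back to the original image:

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_kernel(size, sigma):
    """2D Gaussian built as the outer product of a 1D Gaussian."""
    ax = np.arange(size) - (size - 1) / 2
    g = np.exp(-ax**2 / (2 * sigma**2))
    g /= g.sum()
    return np.outer(g, g)

def gaussian_stack(im, levels, sigma=2.0):
    """Recursively blur at full resolution (no downsampling)."""
    k = gaussian_kernel(int(6 * sigma) | 1, sigma)
    stack = [im]
    for _ in range(levels - 1):
        stack.append(convolve2d(stack[-1], k, mode="same", boundary="symm"))
    return stack

def laplacian_stack(g_stack):
    """Band-pass levels G_i - G_{i+1}; last level is the blurred residual."""
    bands = [g_stack[i] - g_stack[i + 1] for i in range(len(g_stack) - 1)]
    return bands + [g_stack[-1]]
```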

Results

We begin with the following images:

Gaussian Stack

Now we compute the Gaussian stacks of the apple and orange. With each level in the stack, we apply an additional low-pass filter.

Laplacian Stack

Here we visualize the Laplacian stacks of the apple and orange. For each layer, we subtract the (i+1)th layer of the Gaussian stack from the (i)th layer.

Multi-resolution Blending

Approach

In this section, we implement A Multiresolution Spline With Application to Image Mosaics by Burt and Adelson. The final part of the project focuses on multi-resolution blending, a technique that allows for the seamless integration of two images. Unlike traditional image blending methods, multi-resolution blending smooths the transition between images at each frequency level, making the seam between images almost imperceptible.

We begin by taking two input images and a mask image. We compute the Laplacian stacks of the two images in addition to the Gaussian stack of the mask image. We then multiply the corresponding layers of the Laplacian stacks with the mask, resulting in two masked stacks. Finally, we collapse and sum together the stacks of masked images for our multiresolution blended image.
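The pipeline above can be sketched as follows, assuming the two Laplacian stacks and the mask's Gaussian stack have already been computed as lists of equally sized float arrays, with the mask in [0, 1] (the helper name blend_stacks is ours):

```python
import numpy as np

def blend_stacks(la, lb, g_mask):
    """Blend two Laplacian stacks level by level using the mask's
    Gaussian stack as per-pixel weights, then collapse by summing."""
    blended = [g_mask[i] * la[i] + (1 - g_mask[i]) * lb[i]
               for i in range(len(la))]
    return sum(blended)
```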

Results

Masked Stacks

Taking the Laplacian stacks of the apple and orange, creating a Gaussian stack of a step function as a mask, and multiplying the corresponding layers of the Laplacian stacks with the mask, we get the masked stacks.

Collapsing the masked stacks, we get the masked apple and the masked orange. Adding these two together, we get the coveted "Oraple".

Blended Apple
"Oraple"
Fun Blends