Face Morph Sequence

Introduction

This project explores the fascinating world of face morphing, a technique that allows for smooth transitions between two facial images. We'll delve into the process of defining facial keypoints, creating triangulation meshes, implementing affine warps, and generating morph sequences. Additionally, we'll compute mean faces from a population and create caricatures through extrapolation.

Defining Correspondences

Implementation

To implement face morphing, we need to warp the geometry of a face. This is achieved by tessellating the images, i.e., breaking them into triangles, which can then be deformed using affine transformations. To generate these triangles, we first define the key points of each face. This is done via a Python GUI tool I made that lets me manually select and move feature points on an image, which are then saved to disk.

Once the feature points are obtained, we use them to connect the points into triangles covering the entire image. This process is known as triangulation. While there are several methods for triangulation, for our purposes we want to avoid long, skinny triangles, which distort badly under warping; equivalently, we want to maximize the minimum angle of each triangle. This is exactly what Delaunay Triangulation does. Fortunately, the algorithm is abstracted away into a single line of Python code for us:


from scipy.spatial import Delaunay

# avg_shape: (N, 2) array of averaged keypoint coordinates
triangulation = Delaunay(avg_shape).simplices

Here, avg_shape represents the average coordinates of the key points from both images, which are then passed into the Delaunay algorithm to generate the triangular mesh.
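The step above can be sketched end to end with a handful of made-up keypoints (in practice these come from the GUI tool; the coordinates below are purely illustrative):

```python
import numpy as np
from scipy.spatial import Delaunay

# Hypothetical keypoints for two faces; each row is an (x, y) coordinate,
# and row i of one face corresponds to row i of the other.
points_a = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0], [0.5, 0.5]])
points_b = np.array([[0.1, 0.0], [0.9, 0.1], [1.0, 0.9], [0.0, 1.0], [0.6, 0.4]])

# Average the corresponding keypoints to get the shape we triangulate over.
avg_shape = (points_a + points_b) / 2

# Each row of `triangulation` holds three indices into avg_shape forming one triangle.
triangulation = Delaunay(avg_shape).simplices
print(triangulation.shape)  # (num_triangles, 3)
```

Triangulating the average shape (rather than either face's own shape) means the same mesh connectivity can be reused for both images.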

Results

Computing the "Mid-way Face"

Implementation

Given two sets of corresponding feature points and the Delaunay triangulation based on their average, we can compute the affine transformations needed to map the triangles from one face to another. For each triangle in the average shape, we calculate the affine transformation matrices that map the triangles from both images to the common shape.

Once these affine transformations are computed, we perform inverse warping to map the pixel values from the original images to the average shape. For each pixel in the target (mid-way) triangle, we trace it back to its corresponding locations in the source images and retrieve the pixel values. To generate the mid-way face, we cross-dissolve the images by taking the weighted average of the pixel intensities from both source images:

$$ I_{\text{mid}}(x, y) = \frac{I_A(x, y) + I_B(x, y)}{2} $$

where $I_A(x, y)$ and $I_B(x, y)$ represent the pixel values of the two source images at position $(x, y)$.
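The affine transformation for a single triangle pair can be solved in closed form with homogeneous coordinates. Below is a minimal sketch; `affine_from_tri` is a hypothetical helper (not a library function), and the triangles are illustrative:

```python
import numpy as np

def affine_from_tri(src_tri, dst_tri):
    """Solve for the 3x3 affine matrix A such that A @ [x, y, 1]^T maps
    each vertex of src_tri onto the corresponding vertex of dst_tri."""
    src = np.vstack([np.asarray(src_tri, float).T, np.ones(3)])  # columns are homogeneous vertices
    dst = np.vstack([np.asarray(dst_tri, float).T, np.ones(3)])
    return dst @ np.linalg.inv(src)

# Example: map the unit right triangle onto a scaled, shifted copy.
src = [(0, 0), (1, 0), (0, 1)]
dst = [(2, 1), (4, 1), (2, 3)]
A = affine_from_tri(src, dst)

# Inverse warping uses the inverse matrix: for a pixel in the destination
# (mid-way) triangle, find where to sample in the source image.
A_inv = np.linalg.inv(A)
mid_pixel = np.array([3.0, 2.0, 1.0])  # homogeneous coordinates
src_pixel = A_inv @ mid_pixel
print(src_pixel[:2])  # → [0.5 0.5]
```

Sampling at the (generally non-integer) source location is then done with interpolation, e.g. bilinear.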

Results

The Morph Sequence

Implementation

In the morph sequence, instead of just averaging the points and pixels as in the mid-way face, we apply a more fine-grained interpolation. By using linear interpolation for both the key points and pixel values, we can smoothly transition from one face to another.

Let \( \alpha \in [0, 1] \) be the interpolation parameter that controls the morphing sequence. For each value of \( \alpha \), we compute the intermediate shape by linearly interpolating between the points of the two faces:

$$ P_{\text{morph}}(\alpha) = (1 - \alpha) P_A + \alpha P_B $$

Next, we warp both faces into this intermediate shape using the same affine transformation process described earlier. After warping, we blend the pixel intensities using:

$$ I_{\text{morph}}(\alpha) = (1 - \alpha) I_A + \alpha I_B $$
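One frame of the sequence then combines the two interpolations above. The sketch below assumes the images have already been warped into the intermediate shape (via the inverse warp described earlier); the tiny 2x2 "images" and keypoints are purely illustrative:

```python
import numpy as np

def morph_frame(points_a, points_b, img_a_warped, img_b_warped, alpha):
    """One frame of the morph: interpolate shape, then cross-dissolve pixels."""
    shape = (1 - alpha) * points_a + alpha * points_b          # P_morph(alpha)
    frame = (1 - alpha) * img_a_warped + alpha * img_b_warped  # I_morph(alpha)
    return shape, frame

pa = np.array([[0.0, 0.0], [1.0, 1.0]])
pb = np.array([[1.0, 0.0], [2.0, 1.0]])
ia = np.zeros((2, 2))   # all-black stand-in image
ib = np.ones((2, 2))    # all-white stand-in image

shape, frame = morph_frame(pa, pb, ia, ib, alpha=0.25)
print(frame[0, 0])  # → 0.25
```

Sweeping alpha from 0 to 1 over, say, 45 frames and rendering each one yields the full animation.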

Results

The full morph sequence can be viewed in the following animated GIF:

The "Mean face" of a population

Implementation

We can extend the technique beyond two images by working with a larger dataset of faces. By collecting a dataset of facial images and defining feature points in a standardized way, we can perform a statistical analysis to compute the "mean face" of a population.

Here we used the FEI Face Database, a Brazilian face database containing a set of 200 face images taken at the Artificial Intelligence Laboratory of FEI in São Bernardo do Campo, São Paulo, Brazil. For each subject there exist two photos: one smiling and one not smiling.

First, we gather the feature points of all faces and compute their average to obtain an “average face shape.” Then, we warp the geometry of each individual face to this average face shape using affine transformations. Finally, by averaging the pixel values of all these warped images, we produce the mean face of the population. This approach can be generalized to any group or set of images.
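The three steps above can be sketched as follows. Here `warp_fn(img, src_pts, dst_pts)` stands in for the triangle-by-triangle affine warp described earlier (a hypothetical helper, not a library function), and the data is illustrative:

```python
import numpy as np

def mean_face(point_sets, images, warp_fn):
    """Average geometry, warp every face into it, then average appearance."""
    avg_shape = np.mean(point_sets, axis=0)        # step 1: average face shape
    warped = [warp_fn(img, pts, avg_shape)         # step 2: normalize each face's geometry
              for img, pts in zip(images, point_sets)]
    return avg_shape, np.mean(warped, axis=0)      # step 3: average pixel values

# Stand-in warp that ignores geometry, just to run the pipeline end to end.
identity_warp = lambda img, src, dst: img
pts = [np.array([[0.0, 0.0]]), np.array([[2.0, 2.0]])]
imgs = [np.zeros((2, 2)), np.full((2, 2), 4.0)]

shape, face = mean_face(pts, imgs, identity_warp)
print(shape, face[0, 0])  # → [[1. 1.]] 2.0
```

Restricting `point_sets` and `images` to a subgroup (e.g. only smiling faces) yields that subgroup's mean face with no change to the code.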

Results

From the dataset, we are able to compute the average face, the average smiling face, the average non-smiling face, the average smiling man and woman, the average non-smiling man and woman, and more. Because we also have the average geometry of each of these groups, we can extrapolate features from the dataset to faces outside of it.

Here, we hypothesize that if I morph my face into the shape of the average Brazilian man, perhaps I'll look more Brazilian. We can also go in the other direction and see what the average face would look like with my face's geometry: what the average Brazilian would look like if they looked more like me.

My Face Uncropped

Me, ready to become Brazilian

Caricatures: Extrapolating from the mean

Implementation

To create caricatures, we implement an extrapolation technique that exaggerates the differences between an individual's facial features and the average face of a population. The keypoints of the individual's face are compared to those of the average face, and we apply an extrapolation function to exaggerate the differences:

$$ P_{\text{caricature}} = P_{\text{average}} + t \cdot (P_{\text{individual}} - P_{\text{average}}) $$

Here, \( t \) is the extrapolation factor, which controls the degree of exaggeration. By varying \( t \), we can control how exaggerated the caricature becomes. A higher value of \( t \) results in a more pronounced caricature. After applying this transformation, the original face is warped to match the exaggerated points, creating the caricature.
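The extrapolation formula is a one-liner in code. The sketch below uses illustrative keypoints; note that \( t = 1 \) reproduces the individual exactly, while \( t > 1 \) exaggerates the deviation from the mean:

```python
import numpy as np

def caricature_points(p_individual, p_average, t):
    """Extrapolate keypoints away from the population mean.
    t = 1 reproduces the individual; t > 1 exaggerates their deviation."""
    return p_average + t * (p_individual - p_average)

avg = np.array([[0.0, 0.0], [1.0, 1.0]])  # hypothetical mean-face keypoints
me = np.array([[0.2, 0.0], [1.0, 1.4]])   # hypothetical individual keypoints

print(caricature_points(me, avg, t=1.0))  # identical to `me`
print(caricature_points(me, avg, t=2.0))  # each deviation from the mean is doubled
```

Values of \( t \) between 0 and 1 instead pull the face toward the mean, which is the same interpolation used in the morph sequence.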

Results

By varying the extrapolation factor, we can create caricatures with different levels of exaggeration:

Extras

Please enjoy the following extras:

Please enjoy the original composition's music video (watch at your own discretion):

Conclusion

This project was a fun exercise in linear algebra, statistics, and image processing. Using affine transforms to warp our images in interesting ways gave us an exciting tool, enabling face morphing. Additionally, using a dataset of faces from a subpopulation allowed us to perform statistical analysis on it and ask questions such as "what does the average face look like?". Pairing this with our face morphing, we applied our linear algebra to infer what a face with exaggerated features would look like by extrapolating away from the mean. Overall, this project not only provided an entertaining way to apply mathematical concepts but also highlighted the broader reach of these techniques in fields such as computer graphics and anthropology, demonstrating a fascinating intersection between mathematics, computer vision, and human perception.