Exploration of Deep Learning Techniques for Natural Image Matting

Natural image matting is the process of estimating the opacity (alpha) mask that separates a foreground object from the background in an arbitrary image. The technique has many applications in image and video processing, editing, and compositing, and has been an active research topic for many years. Because the problem is ill-posed, it is difficult to solve, and even current state-of-the-art methods have not yet reached a level of performance that satisfies professional production. In this thesis we therefore aim to advance the performance of natural image matting methods and to enhance their usability for professional and casual artists.

First, we introduce the first generative adversarial network for natural image matting. Our novel generator network is trained to predict visually appealing alphas with the help of an adversarial loss from a discriminator that is trained to classify well-composited images. Further, we improve existing encoder-decoder architectures to better deal with the spatial localization issues inherent in convolutional neural networks by using dilated convolutions, which capture global context information without downscaling feature maps and losing spatial information. We present state-of-the-art results on the alphamatting.com online benchmark for the gradient error and comparable results on the other error metrics. Our method is particularly well suited for fine structures like hair, which is of great importance in practical matting applications, e.g. in film/TV production.

Second, we investigate the specific problem of extracting the foreground object from an image using the predicted alpha, and enhance the usability of our method. Most natural image matting algorithms only predict the alpha matte from the image, which is not sufficient to create high-quality compositions. Further, it is not possible to interact with these algorithms manually in any way except by directly changing their input or output. We propose a novel recurrent neural network that can be used as a post-processing method to recover the foreground and background colors of an image, given an initial alpha estimation. Our method outperforms the state of the art in color estimation for natural image matting, and we show that its recurrent nature allows users to easily change candidate solutions, leading to superior color estimations.

Finally, we evaluate video matting methods and propose a neural network for the video matting task. Modern natural image matting algorithms currently outperform classical video matting algorithms due to the high fidelity of their predicted alphas in the individual frames of a video. However, these methods do not consider temporal consistency and therefore often introduce temporal artifacts such as flickering. We evaluate different approaches for introducing temporal consistency into these methods to make them suitable for video matting. Our proposed network is trained in a way that leverages the single-image matting performance of modern algorithms while also introducing temporal consistency to reduce flickering.
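
The point that an alpha matte alone is not sufficient for high-quality compositions can be illustrated with a minimal sketch. It assumes the standard compositing model I = alpha * F + (1 - alpha) * B for a single RGB pixel. The function name `composite` and the pixel values are hypothetical and chosen only for illustration; this is not the method proposed in the thesis.

```python
def composite(alpha, fg, bg):
    """Blend one RGB pixel with the assumed model I = alpha*F + (1 - alpha)*B."""
    return tuple(alpha * f + (1.0 - alpha) * b for f, b in zip(fg, bg))


# Hypothetical half-transparent hair pixel: brown foreground over green.
alpha = 0.5
fg = (0.4, 0.2, 0.0)
old_bg = (0.0, 1.0, 0.0)
new_bg = (0.0, 0.0, 1.0)

# The pixel as it appears in the input image.
observed = composite(alpha, fg, old_bg)

# Alpha only: the observed pixel is reused as the foreground color,
# so part of the old green background leaks into the new composite.
naive = composite(alpha, observed, new_bg)

# Alpha plus the true foreground color: no green spill remains.
clean = composite(alpha, fg, new_bg)

print("naive:", naive)
print("clean:", clean)
```

In the naive composite the green channel stays well above that of the clean composite, which is the kind of color spill that an explicit foreground color estimate is meant to avoid.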
