Wednesday, February 7, 2024 - 11:00 am

DISSERTATION DEFENSE

Department of Computer Science and Engineering

University of South Carolina

Author: Xiaoguang Li

Advisor: Dr. Song Wang

Date: Feb 7, 2024

Time: 11:00 am – 12:30 pm

Place: Teams

Link: https://teams.microsoft.com/l/meetup-join/19%3ameeting_MWJkNGI5OTYtNzk5…

Abstract

Image inpainting is an important challenge in computer vision. Its primary goal is to fill in the missing parts of an image. The technique has many real-life uses, including fixing old photographs and restoring ancient artworks, e.g., the degraded Dunhuang frescoes. Image inpainting is also helpful in image editing: it can remove unwanted objects from images while maintaining a natural and realistic appearance, e.g., removing watermarks and subtitles. Although image inpainting expects the restored result to be identical to the original clean image, existing deep generative inpainting methods often treat the problem as a pure generative task and emphasize only the naturalness or realism of the generated result. Despite significant progress, these methods remain far from real-world applications because of their limited generalization across scenes: the generated images often contain artifacts, or the filled pixels differ greatly from the ground truth. To address this challenge, in this research we propose two approaches that use predictive filtering to improve inpainting performance; we further harness predictive filtering and inpainting pretraining to tackle the challenge of shadow removal. Specifically, in the first approach we formulate image inpainting as a mix of two problems, i.e., predictive filtering and deep generation. Predictive filtering is good at preserving local structures and removing artifacts but falls short of completing large missing regions. A deep generative network can fill in numerous missing pixels based on its understanding of the whole scene but can hardly restore details identical to the original ones. To exploit their respective advantages, we propose the joint predictive filtering and generative network (JPGNet).
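The image-level predictive filtering mentioned above boils down to kernel prediction: a network predicts one small kernel per pixel, and each output pixel is a weighted sum of its neighborhood. The following is a minimal NumPy sketch of the filtering step only (the function name is illustrative, not from the dissertation), assuming the per-pixel kernels have already been predicted and normalized:

```python
import numpy as np

def apply_predictive_filtering(image, kernels):
    """Per-pixel (dynamic) filtering.

    image   : (H, W) grayscale image.
    kernels : (H, W, K, K) per-pixel kernels, e.g. predicted by a CNN
              and softmax-normalized so each kernel sums to 1.
    Returns the filtered (H, W) image: each output pixel is a weighted
    sum of its K x K neighborhood under its own kernel.
    """
    H, W = image.shape
    K = kernels.shape[-1]
    pad = K // 2
    padded = np.pad(image, pad, mode="reflect")
    out = np.empty((H, W), dtype=float)
    for y in range(H):
        for x in range(W):
            patch = padded[y:y + K, x:x + K]       # K x K neighborhood
            out[y, x] = np.sum(patch * kernels[y, x])
    return out

# Sanity check: identity kernels (1 at the center) reproduce the input.
H = W = 4
K = 3
img = np.arange(H * W, dtype=float).reshape(H, W)
ident = np.zeros((H, W, K, K))
ident[:, :, K // 2, K // 2] = 1.0
assert np.allclose(apply_predictive_filtering(img, ident), img)
```

Because each kernel only mixes nearby pixels, this operation preserves local structure well, which is exactly why it struggles when a missing region is larger than the kernel's reach.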
We validate the first approach on three public datasets, i.e., Dunhuang, Places2, and CelebA, and demonstrate that it significantly enhances three state-of-the-art generative methods (i.e., StructureFlow, EdgeConnect, and RFRNet) at only a slight extra time cost. For the second approach, inspired by the inherent advantages of image-level predictive filtering, we explore addressing image inpainting purely as a filtering task. We first study the advantages and challenges of image-level predictive filtering for inpainting: it preserves local structures and avoids artifacts but fails to fill large missing areas. We then propose semantic filtering, which performs filtering at the deep feature level; it fills in missing semantic information but fails to recover details. To address these issues while retaining the respective advantages, we propose a novel filtering technique, i.e., Multi-level Interactive Siamese Filtering (MISF). Extensive experiments demonstrate that our method surpasses state-of-the-art baselines on four metrics, i.e., L1, PSNR, SSIM, and LPIPS. Finally, we employ predictive filtering and inpainting pretraining to address the shadow removal problem. Specifically, we find that pretraining shadow removal networks on an image inpainting dataset significantly reduces shadow remnants: a naive encoder-decoder network achieves restoration quality competitive with state-of-the-art methods using only 10% of the shadow / shadow-free image pairs. Analyzing networks with and without inpainting pretraining via the information stored in the weights (IIW), we find that inpainting pretraining improves restoration quality in non-shadow regions and significantly enhances the generalization ability of the networks, while shadow removal fine-tuning enables the networks to fill in the details of shadow regions.
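Semantic filtering extends the same kernel-prediction idea from pixels to deep features: the kernel predicted at each spatial position is shared across all feature channels. A minimal NumPy sketch of that feature-level step (names are illustrative; the actual MISF architecture predicts the kernels with an interactive Siamese network, which is not modeled here):

```python
import numpy as np

def filter_feature_map(features, kernels):
    """Dynamic filtering on a deep feature map.

    features : (C, H, W) feature map from some encoder layer.
    kernels  : (H, W, K, K) per-position kernels, shared across all
               C channels (one kernel per spatial location).
    Returns the filtered (C, H, W) feature map.
    """
    C, H, W = features.shape
    K = kernels.shape[-1]
    pad = K // 2
    padded = np.pad(features, ((0, 0), (pad, pad), (pad, pad)), mode="reflect")
    out = np.empty((C, H, W), dtype=float)
    for y in range(H):
        for x in range(W):
            patch = padded[:, y:y + K, x:x + K]    # (C, K, K) neighborhood
            # Weighted sum over the spatial axes, channel by channel.
            out[:, y, x] = np.tensordot(patch, kernels[y, x],
                                        axes=([1, 2], [0, 1]))
    return out
```

Filtering features instead of pixels lets one filtered position draw on semantic context with a much larger effective receptive field, which is why it can complete large holes but blurs fine detail.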
Inspired by these observations, we formulate shadow removal as an adaptive fusion task and propose Inpaint4Shadow: Leveraging Inpainting for Single-Image Shadow Removal. Extensive experiments show that our method, empowered with predictive filtering and inpainting, outperforms all state-of-the-art shadow removal methods.
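One plausible reading of the adaptive fusion formulation is a per-pixel weighted blend of an inpainting-pretrained branch and a shadow-removal branch; the sketch below illustrates only that fusion step, with predicted weights taken as given (the actual Inpaint4Shadow design may differ):

```python
import numpy as np

def adaptive_fusion(inpaint_out, removal_out, weights):
    """Blend two restoration branches with predicted per-pixel weights.

    inpaint_out : (H, W) output of the inpainting-pretrained branch.
    removal_out : (H, W) output of the shadow-removal branch.
    weights     : (H, W) fusion weights in [0, 1], e.g. predicted by a
                  small network (hypothetical here).
    """
    return weights * inpaint_out + (1.0 - weights) * removal_out

# With weight 0 everywhere, the fusion reduces to the removal branch.
a = np.ones((2, 2))
b = np.zeros((2, 2))
assert np.allclose(adaptive_fusion(a, b, np.zeros((2, 2))), b)
```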