107 research outputs found

    A Survey of Deep Face Restoration: Denoise, Super-Resolution, Deblur, Artifact Removal

    Full text link
    Face Restoration (FR) aims to restore High-Quality (HQ) faces from Low-Quality (LQ) input images, which is a domain-specific image restoration problem in the low-level computer vision area. The early face restoration methods mainly use statistic priors and degradation models, which are difficult to meet the requirements of real-world applications in practice. In recent years, face restoration has witnessed great progress after stepping into the deep learning era. However, there are few works to study deep learning-based face restoration methods systematically. Thus, this paper comprehensively surveys recent advances in deep learning techniques for face restoration. Specifically, we first summarize different problem formulations and analyze the characteristic of the face image. Second, we discuss the challenges of face restoration. Concerning these challenges, we present a comprehensive review of existing FR methods, including prior based methods and deep learning-based methods. Then, we explore developed techniques in the task of FR covering network architectures, loss functions, and benchmark datasets. We also conduct a systematic benchmark evaluation on representative methods. Finally, we discuss future directions, including network designs, metrics, benchmark datasets, applications,etc. We also provide an open-source repository for all the discussed methods, which is available at https://github.com/TaoWangzj/Awesome-Face-Restoration.Comment: 21 pages, 19 figure

    Continuous Facial Motion Deblurring

    Full text link
    We introduce a novel framework for continuous facial motion deblurring that restores the continuous sharp moment latent in a single motion-blurred face image via a moment control factor. Although a motion-blurred image is the accumulated signal of continuous sharp moments during the exposure time, most existing single image deblurring approaches aim to restore a fixed number of frames using multiple networks and training stages. To address this problem, we propose a continuous facial motion deblurring network based on GAN (CFMD-GAN), which is a novel framework for restoring the continuous moment latent in a single motion-blurred face image with a single network and a single training stage. To stabilize the network training, we train the generator to restore continuous moments in the order determined by our facial motion-based reordering process (FMR) utilizing domain-specific knowledge of the face. Moreover, we propose an auxiliary regressor that helps our generator produce more accurate images by estimating continuous sharp moments. Furthermore, we introduce a control-adaptive (ContAda) block that performs spatially deformable convolution and channel-wise attention as a function of the control factor. Extensive experiments on the 300VW datasets demonstrate that the proposed framework generates a various number of continuous output frames by varying the moment control factor. Compared with the recent single-to-single image deblurring networks trained with the same 300VW training set, the proposed method show the superior performance in restoring the central sharp frame in terms of perceptual metrics, including LPIPS, FID and Arcface identity distance. The proposed method outperforms the existing single-to-video deblurring method for both qualitative and quantitative comparisons

    Image Manipulation and Image Synthesis

    Get PDF
    Image manipulation is of historic importance. Ever since the advent of photography, pictures have been manipulated for various reasons. Historic rulers often used image manipulation techniques for the purpose of self-portrayal or propaganda. In many cases, the goal is to manipulate human behaviour by spreading credible misinformation. Photographs, by their nature, portray the real world and as such are more credible to humans. However, image manipulation may not only serve evil purposes. In this thesis, we propose and analyse methods for image manipulation that serve a positive purpose. Specifically, we treat image manipulation as a tool for solving other tasks. For this, we model image manipulation as an image-to-image translation (I2I) task, i.e., a system that receives an image as input and outputs a manipulated version of the input. We propose multiple I2I based methods. We demonstrate that I2I based image manipulation methods can be used to reduce motion blur in videos. Second, we show that I2I based image manipulation methods can be used for domain adaptation and domain extension. Specifically, we present a method that significantly improves the learning of semantic segmentation from synthetic source data. The same technique can be applied to learning nighttime semantic segmentation from daylight images. Next, we show that I2I can be used to enable weakly supervised object segmentation. We show that each individual task requires and allows for different levels of supervision during the training of deep models in order to achieve best performance. We discuss the importance of maintaining control over the output of such methods and show that, with reduced levels of supervision, methods for maintaining stability during training and for establishing control over the output of a system become increasingly important. We propose multiple methods that solve the issues that arise in such systems. Finally, we demonstrate that our proposed mechanisms for control can be adapted to synthesise images from scratch

    Face Restoration via Plug-and-Play 3D Facial Priors

    Full text link
    State-of-the-art face restoration methods employ deep convolutional neural networks (CNNs) to learn a mapping between degraded and sharp facial patterns by exploring local appearance knowledge. However, most of these methods do not well exploit facial structures and identity information, and only deal with task-specific face restoration (e.g.,face super-resolution or deblurring). In this paper, we propose cross-tasks and cross-models plug-and-play 3D facial priors to explicitly embed the network with the sharp facial structures for general face restoration tasks. Our 3D priors are the first to explore 3D morphable knowledge based on the fusion of parametric descriptions of face attributes (e.g., identity, facial expression, texture, illumination, and face pose). Furthermore, the priors can easily be incorporated into any network and are very efficient in improving the performance and accelerating the convergence speed. Firstly, a 3D face rendering branch is set up to obtain 3D priors of salient facial structures and identity knowledge. Secondly, for better exploiting this hierarchical information (i.e., intensity similarity, 3D facial structure, and identity content), a spatial attention module is designed for image restoration problems. Extensive face restoration experiments including face super-resolution and deblurring demonstrate that the proposed 3D priors achieve superior face restoration results over the state-of-the-art algorithm

    DEEP LEARNING-BASED APPROACHES FOR IMAGE RESTORATION

    Get PDF
    Image restoration is the operation of taking a corrupted or degraded low-quality image and estimating a high-quality clean image that is free of degradations. The most common degradations that affect the quality of the image are blur, atmospheric turbulence, adverse weather conditions (like rain, haze, and snow), and noise. Images captured under the influence of these corruptions or degradations can significantly affect the performance of subsequent computer vision algorithms such as segmentation, recognition, object detection, and tracking. With such algorithms becoming vital components in several applications such as autonomous navigation and video surveillance, it is increasingly important to develop sophisticated algorithms to remove these degradations and high-quality clean images. These reasons have motivated a plethora of research on single image restoration methods to remove such effects. Recently, following the success of deep learning-based convolutional neural networks, many approaches have been proposed to remove the degradations from the corrupted image. We study the following single image restoration problems: (i) atmospheric turbulence removal, (ii) deblurring, (iii) removing distortions introduced by adverse weather conditions such as rain, haze, and snow, and (iv) removing noise. However, existing single image restoration techniques suffer from the following major limitations: (i) They construct global priors without taking into account that these degradations can have a different effect on different local regions of the image. (ii) They use synthetic datasets for training which often results in sub-optimal performance on the real-world images, typically because of the distributional-shift between synthetic and real-world degraded images. (iii) Existing semi-supervised approaches don't account for the effect of unlabeled or real-world degraded image on semi-supervised performance. To address these limitations, we propose supervised image restoration techniques where we use uncertainty to improve the restoration performance. To overcome the second limitation, we propose a Gaussian process-based pseudo-labeling approach to leverage the real-world rain information and train the deraininng network in a semi-supervised fashion. Furthermore, to address the third limitation we theoretically study the effect of unlabeled images on semi-supervised performance and propose an adaptive rejection technique to boost semi-supervised performance. Finally, we recognize that existing supervised and semi-supervised methods need some kind of paired labeled data to train the network, and training on any kind of synthetic paired clean-degraded images may not completely solve the domain gap between synthetic and real-world degraded image distributions. Thus we propose a self-supervised transformer-based approach for image denoising. Here, given a noisy image, we generate multiple down-sampled images and learn the joint relation between these down-sampled using the Gaussian process to denoise the image

    Learning to Deblur and Rotate Motion-Blurred Faces

    Get PDF
    We propose a solution to the novel task of rendering sharp videos from new viewpoints from a single motion-blurred image of a face. Our method1 handles the complexity of face blur by implicitly learning the geometry and motion of faces through the joint training on three large datasets: FFHQ and 300VW, which are publicly available, and a new Bern Multi-View Face Dataset (BMFD) that we built. The first two datasets provide a large variety of faces and allow our model to generalize better. BMFD instead allows us to introduce multi-view constraints, which are crucial to synthesizing sharp videos from a new camera view. It consists of high frame rate synchronized videos from multiple views of several subjects displaying a wide range of facial expressions. We use the high frame rate videos to simulate realistic motion blur through averaging. Thanks to this dataset, we train a neural network to reconstruct a 3D video representation from a single image and the corresponding face gaze. We then provide a camera viewpoint relative to the estimated gaze and the blurry image as input to an encoder-decoder network to generate a video of sharp frames with a novel camera viewpoint. We demonstrate our approach on test subjects of our multi-view dataset and VIDTIMIT
    • …
    corecore