Generative Prior for Unsupervised Image Restoration
The challenge of restoring real-world low-quality images stems from the lack of appropriate training data and the difficulty of determining how an image was degraded. Recently, generative models have demonstrated great potential for creating high-quality images by exploiting the rich and diverse information contained in their trained weights and learned latent representations. One popular type of generative model is the generative adversarial network (GAN), and many methods have been developed to harness the information encoded in GANs for image manipulation. Our proposed approach utilizes generative models both to understand how an image was degraded and to restore it. We propose a combination of cycle-consistency losses and self-attention to enhance face images: we first learn the degradation and then use this information to train a style-based neural network. We also exploit the latent representation to achieve high-magnification face restoration (×64). By incorporating the weights of a pre-trained StyleGAN into a restoration network with a vision transformer layer, we aim to improve the current state of the art in face image restoration. Finally, we present a projection-based image-denoising algorithm, named Noise2Code, in the latent space of the VQGAN model with a fixed-point regularization strategy. The fixed-point condition follows from the observation that the pre-trained VQGAN affects clean and noisy images in drastically different ways. Unlike previous projection-based image restoration in the latent space, both the denoising network and the VQGAN model parameters are jointly trained, although the latter are not needed during testing. We report experimental results demonstrating that the proposed Noise2Code approach is conceptually simple, computationally efficient, and generalizable to real-world degradation scenarios.
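The fixed-point observation behind Noise2Code can be illustrated with a toy vector-quantization step (a minimal sketch with a made-up codebook and latents, not the paper's implementation): projecting a latent onto its nearest codebook entry barely moves a clean latent, but moves a noisy one substantially, which is exactly the behavior a fixed-point regularizer can penalize.

```python
import numpy as np

# Hypothetical stand-in for a VQGAN codebook: 16 code vectors of dimension 4.
rng = np.random.default_rng(0)
codebook = rng.normal(size=(16, 4))

def quantize(z):
    """Nearest-codebook projection, as used in VQ models."""
    d = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    return codebook[d.argmin(1)]

def fixed_point_penalty(z):
    """Fixed-point residual: how far the projection moves a latent.
    Latents already on the codebook move (near) zero; noisy ones move a lot."""
    return float(((quantize(z) - z) ** 2).mean())

clean = codebook[[0, 3, 7]]                          # latents exactly on the codebook
noisy = clean + rng.normal(scale=1.0, size=clean.shape)

assert fixed_point_penalty(clean) < 1e-12            # clean latents are fixed points
assert fixed_point_penalty(noisy) > fixed_point_penalty(clean)
```

In the actual method the encoder, quantizer, and decoder of a pre-trained VQGAN play the role of this projection; the toy only shows why the residual separates clean from degraded inputs.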
PGDiff: Guiding Diffusion Models for Versatile Face Restoration via Partial Guidance
Exploiting pre-trained diffusion models for restoration has recently become a
favored alternative to the traditional task-specific training approach.
Previous works have achieved noteworthy success by limiting the solution space
using explicit degradation models. However, these methods often fall short when
faced with complex degradations as they generally cannot be precisely modeled.
In this paper, we propose PGDiff by introducing partial guidance, a fresh
perspective that is more adaptable to real-world degradations compared to
existing works. Rather than specifically defining the degradation process, our
approach models the desired properties, such as image structure and color
statistics of high-quality images, and applies this guidance during the reverse
diffusion process. These properties are readily available and make no
assumptions about the degradation process. When combined with a diffusion
prior, this partial guidance can deliver appealing results across a range of
restoration tasks. Additionally, PGDiff can be extended to handle composite
tasks by consolidating multiple high-quality image properties, achieved by
integrating the guidance from respective tasks. Experimental results
demonstrate that our method not only outperforms existing diffusion-prior-based
approaches but also competes favorably with task-specific models.
Comment: GitHub: https://github.com/pq-yang/PGDif
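The core idea of guiding the reverse process with properties of high-quality images, rather than an explicit degradation model, can be sketched numerically (a toy illustration with assumed target statistics, not the PGDiff codebase): at each step, take a gradient step on a property loss, here matching global "color" statistics of the sample.

```python
import numpy as np

def stat_guidance_grad(x, target_mean, target_std):
    """Analytic gradient of (mean(x) - m)^2 + (std(x) - s)^2,
    a toy 'color statistics' property loss used as partial guidance."""
    n = x.size
    m, s = x.mean(), x.std()
    g_mean = 2.0 * (m - target_mean) * np.ones_like(x) / n
    g_std = 2.0 * (s - target_std) * (x - m) / (n * s)
    return g_mean + g_std

rng = np.random.default_rng(0)
x = rng.normal(size=64)          # stand-in for an intermediate diffusion sample
for _ in range(200):             # guidance applied repeatedly across reverse steps
    x = x - 10.0 * stat_guidance_grad(x, target_mean=0.5, target_std=0.2)

# The sample is steered toward the target statistics without ever
# specifying how the observation was degraded.
assert abs(x.mean() - 0.5) < 1e-2 and abs(x.std() - 0.2) < 1e-2
```

Composite tasks, as described above, would correspond to summing the gradients of several such property losses.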
Going the Extra Mile in Face Image Quality Assessment: A Novel Database and Model
An accurate computational model for image quality assessment (IQA) benefits
many vision applications, such as image filtering, image processing, and image
generation. Although the study of face images is an important subfield in
computer vision research, the lack of face IQA data and models limits the
precision of current IQA metrics on face image processing tasks such as face
super-resolution, face enhancement, and face editing. To narrow this gap, in
this paper, we first introduce the largest annotated IQA database developed to
date, which contains 20,000 human faces -- an order of magnitude larger than
all existing rated datasets of faces -- of diverse individuals in highly varied
circumstances. Based on the database, we further propose a novel deep learning
model to accurately predict face image quality, which, for the first time,
explores the use of generative priors for IQA. By taking advantage of rich
statistics encoded in well pretrained off-the-shelf generative models, we
obtain generative prior information and use it as latent references to
facilitate blind IQA. The experimental results demonstrate both the value of
the proposed dataset for face IQA and the superior performance of the proposed
model.
Comment: Appearing in IEEE TM
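Using a generative prior as a latent reference for blind IQA can be caricatured in a few lines (a toy sketch with a random stand-in for the generator's latent statistics, not the proposed model): score an image feature by its similarity to the closest "clean" latent reference, so features the prior explains well score higher than corrupted ones.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical latent references, standing in for statistics encoded
# in a well pre-trained generative model; unit-normalized.
prior_bank = rng.normal(size=(32, 8))
prior_bank /= np.linalg.norm(prior_bank, axis=1, keepdims=True)

def quality_score(feat):
    """No-reference quality proxy: cosine similarity to the nearest
    latent reference from the generative prior."""
    feat = feat / np.linalg.norm(feat)
    return float((prior_bank @ feat).max())

clean = prior_bank[5] * 3.0                       # feature well explained by the prior
noisy = clean + rng.normal(scale=2.0, size=8)     # corrupted feature

assert quality_score(clean) > quality_score(noisy)
```

The real model learns this comparison end to end on annotated face images; the toy only shows why prior-derived latents can serve as references when no pristine ground truth is available.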
DiffBIR: Towards Blind Image Restoration with Generative Diffusion Prior
We present DiffBIR, which leverages pretrained text-to-image diffusion models
for the blind image restoration problem. Our framework adopts a two-stage pipeline.
In the first stage, we pretrain a restoration module across diversified
degradations to improve generalization capability in real-world scenarios. The
second stage leverages the generative ability of latent diffusion models to
achieve realistic image restoration. Specifically, we introduce an injective
modulation sub-network, LAControlNet, for fine-tuning, while keeping the
pre-trained Stable Diffusion fixed to maintain its generative ability. Finally, we introduce a
controllable module that allows users to balance quality and fidelity by
introducing the latent image guidance in the denoising process during
inference. Extensive experiments have demonstrated its superiority over
state-of-the-art approaches for both blind image super-resolution and blind
face restoration tasks on synthetic and real-world datasets. The code is
available at https://github.com/XPixelGroup/DiffBIR
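The controllable quality-fidelity trade-off via latent image guidance can be sketched as a single guided update (a minimal toy with made-up latents, not the DiffBIR implementation): take a gradient step on the distance to a reference latent, where the guidance scale controls how strongly the sample is pulled toward the degraded input.

```python
import numpy as np

def latent_guidance_step(x, ref, scale):
    """One guided denoising update: gradient step on ||x - ref||^2.
    Larger `scale` favors fidelity to the input latent; smaller `scale`
    leaves more room for the generative prior."""
    grad = 2.0 * (x - ref)
    return x - scale * grad

rng = np.random.default_rng(2)
ref = rng.normal(size=16)     # toy latent of the low-quality input image
x = rng.normal(size=16)       # toy current diffusion sample

weak = latent_guidance_step(x, ref, scale=0.05)
strong = latent_guidance_step(x, ref, scale=0.4)

# Stronger guidance lands closer to the reference (higher fidelity).
assert np.linalg.norm(strong - ref) < np.linalg.norm(weak - ref)
```

In the actual pipeline this guidance is injected at each denoising step during inference, so the user can dial the scale without retraining anything.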