138 research outputs found
Dual-Camera Joint Deblurring-Denoising
Recent image enhancement methods have shown the advantages of using a pair of
long and short-exposure images for low-light photography. These image
modalities offer complementary strengths and weaknesses. The former yields an
image that is clean but blurry due to camera or object motion, whereas the
latter is sharp but noisy due to low photon count. Motivated by the fact that
modern smartphones come equipped with multiple rear-facing camera sensors, we
propose a novel dual-camera method for obtaining a high-quality image. Our
method uses a synchronized burst of short exposure images captured by one
camera and a long exposure image simultaneously captured by another. Having a
synchronized short exposure burst alongside the long exposure image enables us
to (i) obtain better denoising by using a burst instead of a single image, (ii)
recover motion from the burst and use it for motion-aware deblurring of the
long exposure image, and (iii) fuse the two results to further enhance quality.
Our method is able to achieve state-of-the-art results on synthetic dual-camera
images from the GoPro dataset with five times fewer training parameters
compared to the next best method. We also show that our method qualitatively
outperforms competing approaches on real synchronized dual-camera captures.
Comment: Project webpage:
http://shekshaa.github.io/Joint-Deblurring-Denoising
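Point (i) of the abstract above, that a burst denoises better than a single frame, can be illustrated with a minimal sketch. This is not the authors' network: it assumes a static scene, perfectly aligned frames, and independent Gaussian noise, and every name in it (`simulate_burst`, `average_burst`, the flat gray signal) is hypothetical. Under those assumptions, averaging N frames reduces the noise standard deviation by roughly a factor of sqrt(N).

```python
import random

def simulate_burst(clean, n_frames, noise_sigma, rng):
    """Return n_frames noisy copies of a 1-D 'image' (a list of floats)."""
    return [[p + rng.gauss(0.0, noise_sigma) for p in clean]
            for _ in range(n_frames)]

def average_burst(frames):
    """Pixel-wise mean of frames that are assumed to be already aligned."""
    n = len(frames)
    return [sum(pixels) / n for pixels in zip(*frames)]

def rmse(a, b):
    """Root-mean-square error between two equally long signals."""
    return (sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)) ** 0.5

rng = random.Random(0)                # fixed seed for repeatability
clean = [0.5] * 4096                  # hypothetical flat gray signal
burst = simulate_burst(clean, n_frames=8, noise_sigma=0.1, rng=rng)

single_rmse = rmse(burst[0], clean)               # error of one short exposure
merged_rmse = rmse(average_burst(burst), clean)   # error after merging the burst
```

With real captures the frames are not aligned, which is why the abstract pairs burst denoising with motion recovery; plain averaging of a moving scene would reintroduce blur.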
Face Restoration via Plug-and-Play 3D Facial Priors
State-of-the-art face restoration methods employ deep convolutional neural networks (CNNs) to learn a mapping between degraded and sharp facial patterns by exploring local appearance knowledge. However, most of these methods do not fully exploit facial structures and identity information, and handle only task-specific face restoration (e.g., face super-resolution or deblurring). In this paper, we propose cross-task and cross-model plug-and-play 3D facial priors that explicitly embed sharp facial structures in the network for general face restoration tasks. Our 3D priors are the first to explore 3D morphable knowledge based on the fusion of parametric descriptions of face attributes (e.g., identity, facial expression, texture, illumination, and face pose). Furthermore, the priors can easily be incorporated into any network, and they efficiently improve performance and accelerate convergence. First, a 3D face rendering branch is set up to obtain 3D priors of salient facial structures and identity knowledge. Second, to better exploit this hierarchical information (i.e., intensity similarity, 3D facial structure, and identity content), a spatial attention module is designed for image restoration problems. Extensive face restoration experiments, including face super-resolution and deblurring, demonstrate that the proposed 3D priors achieve superior face restoration results over the state-of-the-art algorithm
A new method for determining Wasserstein 1 optimal transport maps from Kantorovich potentials, with deep learning applications
Wasserstein 1 optimal transport maps provide a natural correspondence between
points from two probability distributions, μ and ν, which is useful in
many applications. Available algorithms for computing these maps do not appear
to scale well to high dimensions. In deep learning applications, efficient
algorithms have been developed for approximating solutions of the dual problem,
known as Kantorovich potentials, using neural networks (e.g. [Gulrajani et al.,
2017]). Importantly, such algorithms work well in high dimensions. In this
paper we present an approach towards computing Wasserstein 1 optimal transport
maps that relies only on Kantorovich potentials. In general, a Wasserstein 1
optimal transport map is not unique and is not computable from a potential
alone. Our main result is to prove that if μ has a density and ν is
supported on a submanifold of codimension at least 2, an optimal transport map
is unique and can be written explicitly in terms of a potential. These
assumptions are natural in many image processing contexts and other
applications. When the Kantorovich potential is only known approximately, our
result motivates an iterative procedure wherein data is moved in optimal
directions and with the correct average displacement. Since this provides an
approach for transforming one distribution to another, it can be used as a
multipurpose algorithm for various transport problems; we demonstrate through
several proof of concept experiments that this algorithm successfully performs
various imaging tasks, such as denoising, generation, translation and
deblurring, which normally require specialized techniques.
Comment: 25 pages, 12 figures. The TTC algorithm detailed here is a simplified
and improved version of that of arXiv:2111.1509
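The idea of moving data "in optimal directions and with the correct average displacement" can be sketched on a toy problem. This is an illustration only, not the TTC algorithm from the paper. It assumes the target distribution is a single point in the plane (a set of codimension 2), for which u(x) = |x - target| is a 1-Lipschitz Kantorovich potential known in closed form. The function names and the spiral of sample points are hypothetical.

```python
import math

def potential(x, target):
    """Closed-form potential for transport onto one point: u(x) = |x - target|."""
    return math.dist(x, target)

def potential_grad(x, target):
    """Gradient of u: a unit vector pointing away from the target (x != target)."""
    d = math.dist(x, target)
    return tuple((xi - ti) / d for xi, ti in zip(x, target))

def transport_step(points, target):
    """Move every point along -grad u by the current W1 estimate."""
    # Dual estimate of W1: mean of u over the source minus u at the target (0).
    w1 = sum(potential(p, target) for p in points) / len(points)
    moved = []
    for p in points:
        g = potential_grad(p, target)
        moved.append(tuple(pi - w1 * gi for pi, gi in zip(p, g)))
    return moved, w1

target = (0.0, 0.0)
# Hypothetical source samples: 16 points on a spiral, radii from 1 to 3.
points = [((1 + 2 * k / 15) * math.cos(k), (1 + 2 * k / 15) * math.sin(k))
          for k in range(16)]

moved, w1_before = transport_step(points, target)
w1_after = sum(potential(p, target) for p in moved) / len(moved)
```

In this toy case the mean distance to the target drops after one step. Because every point moves by the same average displacement, some points overshoot and others fall short, which is why such a step would be repeated with a re-estimated potential.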
Image Enhancement via Deep Spatial and Temporal Networks
Image enhancement is a classic problem in computer vision and has been studied for decades. It includes various subtasks such as super-resolution, image deblurring, rain removal, and denoising. Among these tasks, image deblurring and rain removal have become increasingly active, as they play an important role in many areas such as autonomous driving, video surveillance, and mobile applications. In addition, the two tasks are connected. For example, blur and rain often degrade images simultaneously, and the performance of their removal relies on spatial and temporal learning. To help generate sharp images and videos, in this thesis, we propose efficient algorithms based on deep neural networks for solving the problems of image deblurring and rain removal. In the first part of this thesis, we study the problem of image deblurring. Four deep-learning-based image deblurring methods are proposed. First, for single image deblurring, a new framework is presented which first learns to transform sharp images into realistic blurry images via a learning-to-blur Generative Adversarial Network (GAN) module, and then trains a learning-to-deblur GAN module to learn how to generate sharp images from blurry versions. In contrast to prior work, which focuses solely on learning to deblur, the proposed method learns to realistically synthesize blurring effects using unpaired sharp and blurry images. Second, for video deblurring, spatio-temporal learning and adversarial training methods are used to recover sharp and realistic video frames from blurry input frames. 3D convolutional kernels built on deep residual neural networks are employed to better capture spatio-temporal features, and the proposed network is trained with both a content loss and an adversarial loss to drive the model to generate realistic frames. Third, the problem of extracting sharp image sequences from a single motion-blurred image is tackled. 
A detail-aware network is presented, which is a cascaded generator that handles the problems of ambiguity, subtle motion, and loss of details. Finally, this thesis proposes a level-attention deblurring network and constructs a new large-scale dataset of images with blur caused by various factors. We use this dataset to evaluate current deep deblurring methods and our proposed method. In the second part of this thesis, we study the problem of image deraining. Three deep-learning-based image deraining methods are proposed. First, for single image deraining, the problem of joint removal of raindrops and rain streaks is tackled. In contrast to most prior works, which focus solely on removing either raindrops or rain streaks, a dual attention-in-attention model is presented, which removes raindrops and rain streaks simultaneously. Second, for video deraining, a novel end-to-end framework is proposed to obtain spatial representations and temporal correlations using ResNet-based and LSTM-based architectures, respectively. The proposed method can generate multiple derained frames at a time and outperforms the state-of-the-art methods in terms of quality and speed. Finally, for stereo image deraining, a deep stereo semantic-aware deraining network is proposed for the first time in computer vision. Unlike previous methods, which learn only from pixel-level loss functions or monocular information, the proposed network advances image deraining by leveraging semantic information and visual deviation between two views