Extreme Channel Prior Embedded Network for Dynamic Scene Deblurring
Recent years have witnessed significant progress in convolutional neural
networks (CNNs) for dynamic scene deblurring. While CNN models are generally
learned by the reconstruction loss defined on training data, incorporating
suitable image priors as well as regularization terms into the network
architecture could boost the deblurring performance. In this work, we propose
an Extreme Channel Prior embedded Network (ECPeNet) to plug the extreme channel
priors (i.e., priors on dark and bright channels) into a network architecture
for effective dynamic scene deblurring. A novel trainable extreme channel prior
embedded layer (ECPeL) is developed to aggregate both extreme channel and
blurry image representations, and sparse regularization is introduced to
regularize the ECPeNet model learning. Furthermore, we present an effective
multi-scale network architecture that works in both coarse-to-fine and
fine-to-coarse manners for better exploiting information flow across scales.
Experimental results on the GoPro and Köhler datasets show that our proposed
ECPeNet performs favorably against state-of-the-art deep image deblurring
methods in terms of both quantitative metrics and visual quality.
Comment: 10 pages
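The dark and bright channels plugged into ECPeNet have standard definitions: a per-pixel minimum (respectively maximum) taken over a local patch and over the colour channels. A minimal NumPy sketch of this extraction (the patch size and padding mode are illustrative choices, not taken from the paper):

```python
import numpy as np

def extreme_channels(img, patch=3):
    """Compute dark and bright channel maps of an H x W x 3 image.

    The dark channel is the minimum over a local patch and over the colour
    channels; the bright channel is the corresponding maximum.
    """
    h, w, _ = img.shape
    r = patch // 2
    # Reduce over colour channels first.
    lo = img.min(axis=2)
    hi = img.max(axis=2)
    # Pad so every pixel has a full neighbourhood.
    lo_p = np.pad(lo, r, mode="edge")
    hi_p = np.pad(hi, r, mode="edge")
    dark = np.empty((h, w))
    bright = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            dark[i, j] = lo_p[i:i + patch, j:j + patch].min()
            bright[i, j] = hi_p[i:i + patch, j:j + patch].max()
    return dark, bright
```

The prior exploited by such methods is that dark channels of sharp natural images are mostly zero (and bright channels mostly one), while blur pushes both toward intermediate values.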
Single Image Super-resolution via a Lightweight Residual Convolutional Neural Network
Recent years have witnessed the great success of convolutional neural networks
(CNNs) on various problems in both low- and high-level vision. Especially
noteworthy is the residual network which was originally proposed to handle
high-level vision problems and enjoys several merits. This paper aims to extend
the merits of residual network, such as skip connection induced fast training,
for a typical low-level vision problem, i.e., single image super-resolution. In
general, the two main challenges of existing deep CNNs for super-resolution lie
in the gradient exploding/vanishing problem and large numbers of parameters or
computational cost as CNN goes deeper. Correspondingly, the skip connections or
identity mapping shortcuts are utilized to avoid gradient exploding/vanishing
problem. In addition, the skip connections naturally center the
activations, which leads to better performance. To tackle the second problem,
a lightweight CNN architecture which has carefully designed width, depth and
skip connections was proposed. In particular, a strategy of gradually varying
the shape of network has been proposed for residual network. Different residual
architectures for image super-resolution have also been compared. Experimental
results have demonstrated that the proposed CNN model can not only achieve
state-of-the-art PSNR and SSIM results for single image super-resolution but
also produce visually pleasant results. This paper extends the MMM 2017
oral conference paper with considerable new analyses and more experiments,
especially from the perspective of centering activations and the ensemble
behavior of residual networks.
Comment: Extension of MMM 2017 paper
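The skip connections discussed above reduce to the identity-shortcut form y = x + F(x). A minimal sketch, with dense layers standing in for convolutions:

```python
import numpy as np

def residual_block(x, w1, w2):
    """Identity-shortcut residual block sketch: y = x + F(x).

    The skip connection lets gradients flow through the addition unchanged,
    mitigating vanishing/exploding gradients, and centres the block's
    activations around the identity mapping.
    """
    h = np.maximum(0.0, x @ w1)  # ReLU after the first layer
    return x + h @ w2            # skip connection adds the input back
```

Because the shortcut is the identity, a block whose residual branch outputs zero is a no-op, which is what makes very deep stacks trainable.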
Data recovery in computational fluid dynamics through deep image priors
One of the challenges encountered by computational simulations at exascale is
the reliability of simulations in the face of hardware and software faults.
These faults, expected to increase with the complexity of the computational
systems, will lead to the loss of simulation data and simulation failure and
are currently addressed through a checkpoint-restart paradigm. Focusing
specifically on computational fluid dynamics simulations, this work proposes a
method that uses a deep convolutional neural network to recover simulation
data. This data recovery method (i) is agnostic to the flow configuration and
geometry, (ii) does not require extensive training data, and (iii) is accurate
for very different physical flows. Results indicate that the use of deep image
priors for data recovery is more accurate than standard recovery techniques,
such as Gaussian process regression, also known as Kriging. Data recovery
is performed for two canonical fluid flows: laminar flow around a cylinder and
homogeneous isotropic turbulence. For data recovery of the laminar flow around
a cylinder, results indicate similar performance between the proposed method
and Gaussian process regression across a wide range of mask sizes. For
homogeneous isotropic turbulence, data recovery through the deep convolutional
neural network exhibits an error in relevant turbulent quantities approximately
three times smaller than that for the Gaussian process regression. Forward
simulations using recovered data illustrate that the enstrophy decay is
captured within 10% using the deep convolutional neural network approach.
Although demonstrated specifically for data recovery of fluid flows, this
technique can be used in a wide range of applications, including particle image
velocimetry, visualization, and computational simulations of physical processes
beyond the Navier-Stokes equations.
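The recovery principle behind deep image priors is to fit the network output to the corrupted field only at surviving sample locations, letting the network's structural bias fill in the lost region. A toy sketch, with a single-parameter "network" as an assumed stand-in for the CNN in the paper:

```python
import numpy as np

def masked_loss(pred, data, mask):
    """Reconstruction loss evaluated only where mask == 1 (surviving samples);
    the lost region (mask == 0) is filled in purely by the model's bias."""
    return float(np.sum(mask * (pred - data) ** 2) / max(mask.sum(), 1))

def recover_constant(data, mask, steps=200, lr=0.5):
    """Toy 'network' with one parameter c, fit by gradient descent on the
    masked loss; illustrative stand-in for fitting a CNN's weights."""
    c = 0.0
    for _ in range(steps):
        grad = 2.0 * np.sum(mask * (c - data)) / max(mask.sum(), 1)
        c -= lr * grad
    return c
```

A real deep-image-prior fit replaces the constant with a convolutional network mapping a fixed random input to the full field; the masked objective is the same.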
Structure-Preserving Image Super-resolution via Contextualized Multi-task Learning
Single image super-resolution (SR), which refers to reconstructing a
higher-resolution (HR) image from an observed low-resolution (LR) image, has
received substantial attention due to its tremendous application potential.
Despite the breakthroughs of recently proposed SR methods using convolutional
neural networks (CNNs), their generated results usually fail to preserve
structural (high-frequency) details. In this paper, regarding global boundary
context and residual context as complementary information for enhancing
structural details in image restoration, we develop a contextualized multi-task
learning framework to address the SR problem. Specifically, our method first
extracts convolutional features from the input LR image and applies one
deconvolutional module to interpolate the LR feature maps in a content-adaptive
way. Then, the resulting feature maps are fed into two branched sub-networks.
During the neural network training, one sub-network outputs salient image
boundaries and the HR image, and the other sub-network outputs the local
residual map, i.e., the residual difference between the generated HR image and
ground-truth image. On several standard benchmarks (i.e., Set5, Set14 and
BSD200), our extensive evaluations demonstrate the effectiveness of our SR
method on achieving both higher restoration quality and computational
efficiency compared with several state-of-the-art SR approaches. The source
code and some SR results can be found at:
http://hcp.sysu.edu.cn/structure-preserving-image-super-resolution/
Comment: To appear in Transactions on Multimedia 201
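The two-branch recombination described above can be sketched as adding the predicted local residual map back onto the HR prediction; plain addition with intensity clipping is an assumption about how the branch outputs are fused:

```python
import numpy as np

def fuse_branches(hr_pred, residual_pred):
    """Recombine the two sub-network outputs: the first branch's HR image
    prediction is refined by the second branch's predicted residual map
    (the difference between the generated HR image and the ground truth).
    Clipping keeps intensities in the valid [0, 1] range."""
    return np.clip(hr_pred + residual_pred, 0.0, 1.0)
```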
InverseNet: Solving Inverse Problems with Splitting Networks
We propose a new method that uses deep learning techniques to solve
inverse problems. The inverse problem is cast as that of learning an
end-to-end mapping from observed data to the ground-truth. Inspired by the
splitting strategy widely used in regularized iterative algorithms to tackle
inverse problems, the mapping is decomposed into two networks, with one
handling the inversion of the physical forward model associated with the data
term and one handling the denoising of the output from the former network,
i.e., the inverted version, associated with the prior/regularization term. The
two networks are trained jointly to learn the end-to-end mapping, getting rid
of a two-step training. The training is annealed, as the intermediate variable
between these two networks bridges the gap between the input (the degraded
version of the output) and the output, and progressively approaches the
ground-truth.
The proposed network, referred to as InverseNet, is flexible in the sense that
most existing end-to-end network structures can be leveraged in the first
network and most existing denoising network structures can be used in the
second one. Extensive experiments on both synthetic data and real datasets, on
the tasks of motion deblurring, super-resolution, and colorization, demonstrate
the efficiency and accuracy of the proposed method compared with other image
processing algorithms.
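The splitting strategy can be sketched with closed-form stand-ins for the two learned networks: a regularized pseudo-inverse for the data term, and a caller-supplied denoiser for the prior term. In the paper both stages are CNNs trained jointly; these closed forms are illustrative assumptions:

```python
import numpy as np

def inversenet_split(y, A, denoise, lam=1e-3):
    """Two-stage sketch of the splitting idea: stage 1 handles the data
    term (inversion of the physical forward model A), stage 2 handles the
    prior/regularization term (denoising of the inverted estimate)."""
    # Stage 1: regularized least-squares inversion of the forward model.
    x_inv = np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ y)
    # Stage 2: clean up the noise-amplified inverted estimate.
    return denoise(x_inv)
```

The intermediate variable `x_inv` plays the role of the bridge described in the abstract: already roughly inverted, but still in need of the prior network.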
Face Super-Resolution Guided by 3D Facial Priors
State-of-the-art face super-resolution methods employ deep convolutional
neural networks to learn a mapping between low- and high-resolution facial
patterns by exploring local appearance knowledge. However, most of these
methods do not well exploit facial structures and identity information, and
struggle to deal with facial images that exhibit large pose variations. In this
paper, we propose a novel face super-resolution method that explicitly
incorporates 3D facial priors which grasp the sharp facial structures. Our work
is the first to explore 3D morphable knowledge based on the fusion of
parametric descriptions of face attributes (e.g., identity, facial expression,
texture, illumination, and face pose). Furthermore, the priors can easily be
incorporated into any network and are extremely efficient in improving the
performance and accelerating the convergence speed. Firstly, a 3D face
rendering branch is set up to obtain 3D priors of salient facial structures and
identity knowledge. Secondly, the Spatial Attention Module is used to better
exploit this hierarchical information (i.e., intensity similarity, 3D facial
structure, and identity content) for the super-resolution problem. Extensive
experiments demonstrate that the proposed 3D priors achieve superior face
super-resolution results over the state of the art.
Comment: Accepted as a spotlight paper at the European Conference on Computer Vision 2020 (ECCV)
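The spatial attention idea can be sketched as a per-pixel sigmoid gate derived from the prior features and applied to the image features; computing the gate directly from the prior map, rather than through learned convolutions, is an illustrative simplification:

```python
import numpy as np

def spatial_attention(features, prior_map):
    """Spatial-attention sketch: a sigmoid gate computed from the 3D-prior
    map reweights the image features per pixel, emphasising locations with
    salient facial structure."""
    gate = 1.0 / (1.0 + np.exp(-prior_map))  # H x W attention map in (0, 1)
    return features * gate[..., None]        # broadcast over the channel axis
```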
Face Restoration via Plug-and-Play 3D Facial Priors
State-of-the-art face restoration methods employ deep convolutional neural networks (CNNs) to learn a mapping between degraded and sharp facial patterns by exploring local appearance knowledge. However, most of these methods do not well exploit facial structures and identity information, and only deal with task-specific face restoration (e.g., face super-resolution or deblurring). In this paper, we propose cross-task and cross-model plug-and-play 3D facial priors to explicitly embed sharp facial structures into the network for general face restoration tasks. Our 3D priors are the first to explore 3D morphable knowledge based on the fusion of parametric descriptions of face attributes (e.g., identity, facial expression, texture, illumination, and face pose). Furthermore, the priors can easily be incorporated into any network and are very efficient in improving the performance and accelerating the convergence speed. Firstly, a 3D face rendering branch is set up to obtain 3D priors of salient facial structures and identity knowledge. Secondly, for better exploiting this hierarchical information (i.e., intensity similarity, 3D facial structure, and identity content), a spatial attention module is designed for image restoration problems. Extensive face restoration experiments including face super-resolution and deblurring demonstrate that the proposed 3D priors achieve superior face restoration results over state-of-the-art algorithms.
Prior-aware Neural Network for Partially-Supervised Multi-Organ Segmentation
Accurate multi-organ abdominal CT segmentation is essential to many clinical
applications such as computer-aided intervention. As data annotation requires
massive human labor from experienced radiologists, it is common that training
data are partially labeled, e.g., pancreas datasets only have the pancreas
labeled while leaving the rest marked as background. However, these background
labels can be misleading in multi-organ segmentation since the "background"
usually contains some other organs of interest. To address the background
ambiguity in these partially-labeled datasets, we propose Prior-aware Neural
Network (PaNN) via explicitly incorporating anatomical priors on abdominal
organ sizes, guiding the training process with domain-specific knowledge. More
specifically, PaNN assumes that the average organ size distributions in the
abdomen should approximate their empirical distributions, prior statistics
obtained from the fully-labeled dataset. As our training objective is difficult
to optimize directly using stochastic gradient descent [20], we propose to
reformulate it in a min-max form and optimize it via the stochastic primal-dual
gradient algorithm. PaNN achieves state-of-the-art performance on the
MICCAI2015 challenge "Multi-Atlas Labeling Beyond the Cranial Vault", a
competition on organ segmentation in the abdomen. We report an average Dice
score of 84.97%, surpassing the prior art by a large margin of 3.27%.
Comment: ICCV 201
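The organ-size prior can be sketched as a discrepancy between the network's average predicted label distribution and the empirical organ-size distribution from the fully-labeled data. KL divergence is an illustrative choice of discrepancy here; the paper's actual min-max primal-dual optimization is not reproduced:

```python
import numpy as np

def organ_size_penalty(probs, prior):
    """PaNN-style prior term (sketch): the average of the per-voxel class
    probabilities `probs` (shape ... x K) should match the empirical
    organ-size distribution `prior` (length K), estimated from the
    fully-labeled dataset."""
    q = probs.reshape(-1, probs.shape[-1]).mean(axis=0)  # avg. organ sizes
    q = np.clip(q, 1e-12, None)
    p = np.clip(prior, 1e-12, None)
    return float(np.sum(p * np.log(p / q)))  # KL(prior || predicted)
```

Adding such a term to the segmentation loss penalizes predictions that label implausibly large or small fractions of the volume as a given organ, which is exactly the guidance needed when background labels are ambiguous.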
A Deep Journey into Super-resolution: A survey
Deep convolutional networks based super-resolution is a fast-growing field
with numerous practical applications. In this exposition, we extensively
compare 30+ state-of-the-art super-resolution Convolutional Neural Networks
(CNNs) over three classical and three recently introduced challenging datasets
to benchmark single image super-resolution. We introduce a taxonomy for
deep-learning based super-resolution networks that groups existing methods into
nine categories including linear, residual, multi-branch, recursive,
progressive, attention-based and adversarial designs. We also provide
comparisons between the models in terms of network complexity, memory
footprint, model input and output, learning details, the type of network losses
and important architectural differences (e.g., depth, skip-connections,
filters). The extensive evaluation performed shows consistent and rapid
growth in accuracy over the past few years, along with a corresponding boost
in model complexity and the availability of large-scale datasets. It is also
observed that the pioneering methods identified as the benchmark have been
significantly outperformed by the current contenders. Despite the progress in
recent years, we identify several shortcomings of existing techniques and
provide future research directions towards the solution of these open problems.
Comment: Accepted in ACM Computing Surveys
Deep multi-frame face super-resolution
Face verification and recognition problems have seen rapid progress in recent
years; however, recognition from small-size images remains a challenging task
that is inherently intertwined with the task of face super-resolution. Tackling
this problem using multiple frames is an attractive idea, yet requires solving
the alignment problem that is also challenging for low-resolution faces. Here
we present a holistic system for multi-frame recognition, alignment, and
super-resolution of faces. Our neural network architecture restores the central
frame of each input sequence additionally taking into account a number of
adjacent frames and making use of sub-pixel movements. We present our results
using the popular dataset for video face recognition (YouTube Faces). We show a
notable improvement of identification score compared to several baselines
including the one based on single-image super-resolution.
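The multi-frame idea (align adjacent frames onto the central frame's grid, then fuse) can be sketched with integer `np.roll` shifts standing in for the learned sub-pixel alignment in the paper:

```python
import numpy as np

def fuse_frames(frames, shifts):
    """Multi-frame fusion sketch: shift each frame back onto the central
    frame's grid using the given (dy, dx) offsets, then average the
    aligned stack. Integer shifts are an illustrative simplification of
    sub-pixel motion compensation."""
    aligned = [np.roll(f, (dy, dx), axis=(0, 1))
               for f, (dy, dx) in zip(frames, shifts)]
    return np.mean(aligned, axis=0)
```

With accurate alignment, averaging suppresses independent noise across frames while the sub-pixel offsets contribute complementary high-frequency samples, which is what makes multi-frame super-resolution outperform the single-image baseline.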