Deep Back-Projection Networks for Single Image Super-resolution
Previous feed-forward architectures of recently proposed deep
super-resolution networks learn the features of low-resolution inputs and the
non-linear mapping from those to a high-resolution output. However, this
approach does not fully address the mutual dependencies of low- and
high-resolution images. We propose Deep Back-Projection Networks (DBPN), the
winner of two image super-resolution challenges (NTIRE2018 and PIRM2018), that
exploit iterative up- and down-sampling layers. These layers are formed as a
unit providing an error feedback mechanism for projection errors. We construct
mutually-connected up- and down-sampling units, each of which represents
different types of low- and high-resolution components. We also show that
extending this idea yields new insights toward more efficient network design,
such as parameter sharing in the projection module and transition layers in
the projection step. Our experiments yield superior results, in particular
establishing new state-of-the-art results across multiple datasets, especially
for large scaling factors such as 8x.
Comment: To appear in TPAMI 2020. The code is available at
https://github.com/alterzero/DBPN-Pytorch arXiv admin note: substantial text
overlap with arXiv:1803.0273
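The up- and down-sampling units with error feedback can be sketched as follows. This is an illustrative toy version, not the DBPN implementation: nearest-neighbor upscaling and average pooling stand in for the paper's learned deconvolution and strided-convolution layers.

```python
import numpy as np

def upsample(x, s=2):
    # stand-in for a learned deconvolution: nearest-neighbor upscaling
    return x.repeat(s, axis=0).repeat(s, axis=1)

def downsample(x, s=2):
    # stand-in for a learned strided convolution: average pooling
    h, w = x.shape
    return x.reshape(h // s, s, w // s, s).mean(axis=(1, 3))

def up_projection_unit(lr, s=2):
    """One up-projection with error feedback: project up, re-project down,
    and correct the estimate with the back-projected residual."""
    h0 = upsample(lr, s)       # initial HR estimate
    l0 = downsample(h0, s)     # re-projection back to LR space
    err = lr - l0              # projection error in LR space
    h1 = upsample(err, s)      # back-project the error to HR space
    return h0 + h1             # corrected HR feature map

lr = np.random.rand(8, 8)
hr = up_projection_unit(lr)    # hr.shape == (16, 16)
```

A down-projection unit mirrors this with the roles of up- and down-sampling swapped; DBPN alternates and densely connects many such units.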
Hierarchical Back Projection Network for Image Super-Resolution
Deep learning based single image super-resolution methods use a large number
of training datasets and have recently achieved great quality progress both
quantitatively and qualitatively. Most deep networks focus on nonlinear mapping
from low-resolution inputs to high-resolution outputs via residual learning
without exploring the feature abstraction and analysis. We propose a
Hierarchical Back Projection Network (HBPN) that cascades multiple HourGlass
(HG) modules to process features bottom-up and top-down across all scales,
capturing various spatial correlations, and then consolidates the best
representation for reconstruction. We adopt back-projection blocks in our
proposed network to provide an error-correlated up- and down-sampling process,
replacing simple deconvolution and pooling for better estimation. A
new Softmax based Weighted Reconstruction (WR) process is used to combine the
outputs of HG modules to further improve super-resolution. Experimental results
on various datasets (including the validation dataset of the NTIRE2019 Real
Image Super-resolution Challenge) show that our proposed approach matches and
improves upon the performance of state-of-the-art methods for different
scaling factors.
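The Softmax-based Weighted Reconstruction step can be sketched as below. This is a minimal illustration, not the HBPN code: each HG module's output is assumed to come with a scalar score, and the softmax turns those scores into convex combination weights.

```python
import numpy as np

def softmax(w):
    # numerically stable softmax over a 1-D score vector
    e = np.exp(w - w.max())
    return e / e.sum()

def weighted_reconstruction(outputs, scores):
    """Combine the outputs of several HG modules with softmax weights, so
    modules with higher scores contribute more to the final SR image."""
    weights = softmax(np.asarray(scores, dtype=float))
    stacked = np.stack(outputs)               # (num_modules, H, W)
    return np.tensordot(weights, stacked, axes=1)

outs = [np.full((4, 4), v) for v in (1.0, 2.0, 3.0)]
sr = weighted_reconstruction(outs, [0.1, 0.2, 0.7])  # weighted toward outs[2]
```

In the paper the weights are produced by a learned branch; the hand-set scores here only illustrate the combination rule.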
Multigrid Backprojection Super-Resolution and Deep Filter Visualization
We introduce a novel deep-learning architecture for image upscaling by large
factors (e.g. 4x, 8x) based on examples of pristine high-resolution images. Our
target is to reconstruct high-resolution images from their downscaled versions.
The proposed system performs a multi-level progressive upscaling, starting from
small factors (2x) and updating for higher factors (4x and 8x). The system is
recursive as it repeats the same procedure at each level. It is also residual
since we use the network to update the outputs of a classic upscaler. The
network residuals are improved by Iterative Back-Projections (IBP) computed in
the features of a convolutional network. To work at multiple levels, we extend
the standard back-projection algorithm using a recursion analogous to
Multi-Grid algorithms commonly used as solvers of large systems of linear
equations. We finally show how the network can be interpreted as a standard
upsampling-and-filter upscaler with a space-variant filter that adapts to the
geometry. This approach allows us to visualize how the network learns to
upscale. Finally, our system reaches state-of-the-art quality for models with
relatively few parameters.
Comment: Spotlight paper in the Thirty-Third AAAI Conference on Artificial
Intelligence (AAAI-19)
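The classic Iterative Back-Projection scheme that the multigrid extension builds on can be sketched as follows. This is a toy pixel-space version under simplifying assumptions: the paper applies IBP to convolutional features, and nearest-neighbor upscaling with average pooling stands in for the actual up/downscaling operators.

```python
import numpy as np

def downscale(x, s=2):
    # average pooling as a simple downscaling operator
    h, w = x.shape
    return x.reshape(h // s, s, w // s, s).mean(axis=(1, 3))

def upscale(x, s=2):
    # nearest-neighbor interpolation as a simple upscaler
    return x.repeat(s, axis=0).repeat(s, axis=1)

def iterative_back_projection(lr, s=2, iters=10, step=1.0):
    """Classic IBP: repeatedly push the LR-space reconstruction error back
    into the HR estimate until it is consistent with the observation."""
    hr = upscale(lr, s)                    # initial guess from a simple upscaler
    for _ in range(iters):
        err = lr - downscale(hr, s)        # observation residual in LR space
        hr = hr + step * upscale(err, s)   # back-project the residual
    return hr

lr = np.random.rand(4, 4)
hr = iterative_back_projection(lr)  # downscale(hr) matches lr
```

The multigrid recursion nests this update across levels (2x, 4x, 8x), analogous to V-cycles in multigrid linear solvers.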
Recurrent Back-Projection Network for Video Super-Resolution
We propose a novel architecture for the problem of video super-resolution.
We integrate spatial and temporal contexts from continuous video frames using a
recurrent encoder-decoder module that fuses multi-frame information with the
more traditional, single frame super-resolution path for the target frame. In
contrast to most prior work where frames are pooled together by stacking or
warping, our model, the Recurrent Back-Projection Network (RBPN), treats each
context frame as a separate source of information. These sources are combined
in an iterative refinement framework inspired by the idea of back-projection in
multiple-image super-resolution. This is aided by explicitly representing
estimated inter-frame motion with respect to the target, rather than by
aligning frames. We propose a new video super-resolution benchmark, allowing
evaluation at a larger scale and considering videos in different motion
regimes. Experimental results demonstrate that our RBPN is superior to existing
methods on several datasets.
Comment: To appear in CVPR201
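The per-frame recurrent refinement can be sketched as below. This is a schematic stand-in, not RBPN itself: a single tanh layer replaces the encoder-decoder, and the estimated motion is assumed to be folded into each context frame's feature vector.

```python
import numpy as np

def fuse_step(target_feat, context_feat, hidden, W):
    """One recurrent refinement step: fuse the target frame's features with
    one context frame's features and the running hidden state."""
    return np.tanh(W @ np.concatenate([target_feat, context_feat, hidden]))

rng = np.random.default_rng(4)
d = 6
W = rng.normal(size=(d, 3 * d)) * 0.1
target = rng.normal(size=d)
contexts = [rng.normal(size=d) for _ in range(5)]  # neighboring frames

hidden = np.zeros(d)
for ctx in contexts:            # each context frame is a separate source
    hidden = fuse_step(target, ctx, hidden, W)
# hidden now carries the iteratively refined multi-frame representation
```

The key design point mirrored here is that context frames are consumed one at a time rather than stacked or warped into a single input.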
Feedback Network for Image Super-Resolution
Recent advances in image super-resolution (SR) explored the power of deep
learning to achieve a better reconstruction performance. However, the feedback
mechanism, which commonly exists in the human visual system, has not been fully
exploited in existing deep learning based image SR methods. In this paper, we
propose an image super-resolution feedback network (SRFBN) to refine low-level
representations with high-level information. Specifically, we use hidden states
in an RNN with constraints to achieve such feedback manner. A feedback block is
designed to handle the feedback connections and to generate powerful high-level
representations. The proposed SRFBN comes with a strong early reconstruction
ability and can create the final high-resolution image step by step. In
addition, we introduce a curriculum learning strategy to make the network well
suited to more complicated tasks, where the low-resolution images are
corrupted by multiple types of degradation. Extensive experimental results
demonstrate the superiority of the proposed SRFBN in comparison with the
state-of-the-art methods. Code is available at
https://github.com/Paper99/SRFBN_CVPR19.
Comment: Accepted to CVPR 201
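The feedback mechanism via a constrained RNN hidden state can be sketched as follows. This is a toy unrolling under stated assumptions, not the SRFBN block: a single tanh layer with hypothetical weight matrices `W_in` and `W_fb` stands in for the learned feedback block.

```python
import numpy as np

def feedback_block(lr_feat, hidden, W_in, W_fb):
    """One feedback step: the high-level hidden state from the previous
    iteration is fed back to refine the low-level input features."""
    return np.tanh(lr_feat @ W_in + hidden @ W_fb)

rng = np.random.default_rng(0)
d = 8
W_in = rng.normal(size=(d, d)) * 0.1
W_fb = rng.normal(size=(d, d)) * 0.1
lr_feat = rng.normal(size=(1, d))

hidden = np.zeros((1, d))
outputs = []
for t in range(4):                 # unrolled recurrent iterations
    hidden = feedback_block(lr_feat, hidden, W_in, W_fb)
    outputs.append(hidden)         # each step yields an SR estimate
```

Early entries of `outputs` correspond to the "early reconstruction" ability: every unrolled step produces a usable high-resolution estimate, refined step by step.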
Recurrent Generative Adversarial Networks for Proximal Learning and Automated Compressive Image Recovery
Recovering images from undersampled linear measurements typically leads to an
ill-posed linear inverse problem that calls for proper statistical priors.
Building effective priors is however challenged by the low train and test
overhead dictated by real-time tasks; and the need for retrieving visually
"plausible" and physically "feasible" images with minimal hallucination. To
cope with these challenges, we design a cascaded network architecture that
unrolls the proximal gradient iterations, bringing the benefits of generative
residual networks (ResNets) to modeling the proximal operator. A mixture of
pixel-wise and perceptual costs is then deployed to train proximals. The
overall architecture resembles back-and-forth projection onto the intersection
of feasible and plausible images. Extensive computational experiments are
conducted on a global task of reconstructing MR images of pediatric patients
and a more local task of super-resolving CelebA faces, offering insight into
the design of efficient architectures. Our observations indicate that for MRI
reconstruction, a recurrent ResNet with a single residual block effectively
learns the proximal. This simple architecture appears to significantly
outperform the alternative deep ResNet architecture by 2dB SNR, and the
conventional compressed-sensing MRI by 4dB SNR with 100x faster inference. For
image superresolution, our preliminary results indicate that modeling the
denoising proximal demands deep ResNets.
Comment: 11 pages, 11 figure
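The unrolled proximal gradient scheme can be sketched as below. This is a hand-crafted stand-in, not the paper's network: soft-thresholding (the proximal operator of an L1 prior) replaces the learned ResNet proximal, and the measurement operator `A` is a hypothetical random Gaussian matrix.

```python
import numpy as np

def prox_denoise(x, lam=0.1):
    """Stand-in for the learned proximal: soft-thresholding, i.e. the
    proximal operator of an L1 sparsity prior."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def unrolled_pgd(y, A, iters=50, step=0.1):
    """Unrolled proximal gradient for y = A x: alternate a gradient step on
    the data-fidelity term with the proximal step enforcing the prior."""
    x = A.T @ y                              # simple initialization
    for _ in range(iters):
        grad = A.T @ (A @ x - y)             # gradient of 0.5*||Ax - y||^2
        x = prox_denoise(x - step * grad)    # proximal step
    return x

rng = np.random.default_rng(1)
A = rng.normal(size=(20, 40)) / np.sqrt(20)    # undersampled measurements
x_true = np.zeros(40)
x_true[[3, 17, 29]] = [2.0, -1.5, 1.0]         # sparse ground-truth signal
y = A @ x_true
x_hat = unrolled_pgd(y, A)
```

In the paper each proximal step is a trained generative ResNet and the pixel-wise plus perceptual loss trains those proximals end to end; the structure of the iteration is the same.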
Adversarial Inverse Graphics Networks: Learning 2D-to-3D Lifting and Image-to-Image Translation from Unpaired Supervision
Researchers have developed excellent feed-forward models that learn to map
images to desired outputs, such as to the images' latent factors, or to other
images, using supervised learning. Learning such mappings from unlabelled data,
or improving upon supervised models by exploiting unlabelled data, remains
elusive. We argue that there are two important parts to learning without
annotations: (i) matching the predictions to the input observations, and (ii)
matching the predictions to known priors. We propose Adversarial Inverse
Graphics networks (AIGNs): weakly supervised neural network models that combine
feedback from rendering their predictions, with distribution matching between
their predictions and a collection of ground-truth factors. We apply AIGNs to
3D human pose estimation and 3D structure and egomotion estimation, and
outperform models supervised by only paired annotations. We further apply AIGNs
to facial image transformation using super-resolution and inpainting renderers,
while deliberately adding biases in the ground-truth datasets. Our model
seamlessly incorporates such biases, rendering input faces towards young, old,
feminine, masculine or Tom Cruise-like equivalents (depending on the chosen
bias), or adding lip and nose augmentations while inpainting concealed lips and
noses.
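The two-part objective described above can be sketched as a single loss. This is a schematic under stated assumptions, not the AIGN training code: `render` and `discriminator_score` are hypothetical callables standing in for the renderer and the adversarially trained discriminator.

```python
import numpy as np

def aign_loss(pred_factors, render, observation, discriminator_score, w_adv=0.5):
    """AIGN objective sketch: (i) match the rendered prediction back to the
    input observation; (ii) match predictions to known priors via an
    adversarial term from a (given) discriminator score in (0, 1]."""
    recon = np.mean((render(pred_factors) - observation) ** 2)
    adv = -np.log(discriminator_score(pred_factors) + 1e-8)
    return recon + w_adv * adv

render = lambda f: 2.0 * f          # toy renderer: a known forward map
disc = lambda f: 0.9                # toy discriminator: constant score
obs = np.array([2.0, 4.0])
good = aign_loss(np.array([1.0, 2.0]), render, obs, disc)  # perfect render
bad = aign_loss(np.array([0.0, 0.0]), render, obs, disc)   # poor render
```

With a constant discriminator score, the loss gap here comes entirely from the rendering term, isolating part (i) of the objective.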
Meta-SR: A Magnification-Arbitrary Network for Super-Resolution
Recent research on super-resolution has achieved great success due to the
development of deep convolutional neural networks (DCNNs). However,
super-resolution of arbitrary scale factor has been ignored for a long time.
Most previous researchers regard super-resolution at different scale factors as
independent tasks: they train a specific model for each scale factor, which is
computationally inefficient, and prior work considers only the super-resolution
of a few integer scale factors. In this work, we propose a novel method called
Meta-SR to solve, for the first time, super-resolution of arbitrary scale
factor (including non-integer scale factors) with a single model. In our
Meta-SR, the Meta-Upscale Module is proposed to replace the traditional upscale
module. For an arbitrary scale factor, the Meta-Upscale Module dynamically
predicts the weights of the upscale filters by taking the scale factor as input
and uses these weights to generate the HR image of arbitrary size. For any
low-resolution image, our Meta-SR can continuously zoom into it with arbitrary
scale factors using only a single model. We evaluated the proposed method
through extensive experiments on widely used benchmark datasets on single image
super-resolution. The experimental results show the superiority of our
Meta-Upscale.
Comment: 10 pages, 4 figure
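The Meta-Upscale idea can be sketched in 1-D as below. This is a toy illustration, not the Meta-SR module: a single hypothetical linear layer (`W`, `b`) replaces the learned weight-prediction network, and the filter uses just three taps.

```python
import numpy as np

def meta_upscale_weights(scale, rel_offset, W, b):
    """Predict filter weights from scale-aware features: the inverse scale
    and the sub-pixel offset of the output location (a tiny stand-in for the
    learned weight-prediction network)."""
    feat = np.array([1.0 / scale, rel_offset])
    return W @ feat + b

def meta_sr_1d(lr, scale, W, b):
    """Upscale a 1-D signal by an arbitrary (non-integer) factor, predicting
    a fresh filter for every output sample."""
    k = W.shape[0]                              # number of filter taps
    out_len = int(round(len(lr) * scale))
    out = np.zeros(out_len)
    for i in range(out_len):
        src = i / scale                          # project output position back
        j = int(src)
        w = meta_upscale_weights(scale, src - j, W, b)
        taps = [lr[min(max(j + t - 1, 0), len(lr) - 1)] for t in range(k)]
        out[i] = np.dot(w, taps)
    return out

rng = np.random.default_rng(2)
W, b = rng.normal(size=(3, 2)), rng.normal(size=3)
sr = meta_sr_1d(np.arange(8.0), 1.5, W, b)       # non-integer scale factor
```

Because the filter weights are a function of the scale factor and offset, one set of parameters serves every scale, integer or not.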
One Network to Solve Them All --- Solving Linear Inverse Problems using Deep Projection Models
While deep learning methods have achieved state-of-the-art performance in
many challenging inverse problems like image inpainting and super-resolution,
they invariably involve problem-specific training of the networks. Under this
approach, different problems require different networks. In scenarios where we
need to solve a wide variety of problems, e.g., on a mobile camera, it is
inefficient and costly to use these specially-trained networks. On the other
hand, traditional methods using signal priors can be used in all linear inverse
problems but often have worse performance on challenging tasks. In this work,
we provide a middle ground between the two kinds of methods --- we propose a
general framework to train a single deep neural network that solves arbitrary
linear inverse problems. The proposed network acts as a proximal operator for
an optimization algorithm and projects non-image signals onto the set of
natural images defined by the decision boundary of a classifier. In our
experiments, the proposed framework demonstrates superior performance over
traditional methods using a wavelet sparsity prior and achieves performance
comparable to that of specially-trained networks on tasks including compressive
sensing and pixel-wise inpainting.
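The projection view of solving a linear inverse problem can be sketched with alternating projections. This is a simplified stand-in, not the paper's method: clipping to a valid pixel range replaces the learned classifier-defined projection onto natural images.

```python
import numpy as np

def project_measurements(x, A, y):
    """Project x onto the affine set {x : A x = y} via a least-norm
    correction, enforcing consistency with the linear measurements."""
    correction, *_ = np.linalg.lstsq(A, y - A @ x, rcond=None)
    return x + correction

def project_prior(x, lo=0.0, hi=1.0):
    """Stand-in for the learned projector onto 'natural images': here,
    simply clipping to the valid pixel range."""
    return np.clip(x, lo, hi)

def solve_inverse(A, y, iters=30):
    """Alternate projections between the measurement-consistent set and the
    prior set, mirroring the projection-operator view of the network."""
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        x = project_prior(project_measurements(x, A, y))
    return x

rng = np.random.default_rng(3)
A = rng.normal(size=(10, 25))       # any linear operator (one network fits all)
x_true = rng.uniform(0, 1, 25)
y = A @ x_true
x_hat = solve_inverse(A, y)
```

The point mirrored here is that only the measurement projection depends on the specific inverse problem; the prior projection (the trained network in the paper) is shared across all of them.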
Multi-Scale Recursive and Perception-Distortion Controllable Image Super-Resolution
We describe our solution for the PIRM Super-Resolution Challenge 2018 where
we achieved the 2nd best perceptual quality for average RMSE<=16, 5th best for
RMSE<=12.5, and 7th best for RMSE<=11.5. We modify a recently proposed
Multi-Grid Back-Projection (MGBP) architecture to work as a generative system
with an input parameter that controls the amount of artificial detail in the
the output. We propose a discriminator for adversarial training with the
following novel properties: it is multi-scale, resembling a progressive GAN;
it is recursive, balancing the architecture of the generator; and it
includes a new layer to capture significant statistics of natural images.
Finally, we propose a training strategy that avoids conflicts between
reconstruction and perceptual losses. Our configuration uses only 281k
parameters and upscales each image of the competition in 0.2s on average.
Comment: In ECCV 2018 Workshops. Won 2nd place in Region 3 of PIRM-SR
Challenge 2018. Code and models are available at
https://github.com/pnavarre/pirm-sr-201
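The perception-distortion control parameter can be sketched in its simplest form. This is an illustrative reduction, not the paper's generative system: a scalar blends a distortion-optimized base reconstruction with an adversarially-generated detail residual.

```python
import numpy as np

def controllable_sr(base_sr, detail_residual, alpha):
    """Blend a distortion-optimized reconstruction (alpha = 0) with the full
    artificial detail residual (alpha = 1); intermediate alpha trades off
    RMSE against perceptual quality."""
    return base_sr + alpha * detail_residual

rng = np.random.default_rng(5)
base = rng.uniform(size=(4, 4))           # reconstruction-path output
detail = 0.1 * rng.normal(size=(4, 4))    # artificial high-frequency detail
faithful = controllable_sr(base, detail, 0.0)    # prioritize low RMSE
perceptual = controllable_sr(base, detail, 1.0)  # prioritize perceptual quality
```

Sweeping `alpha` traces a path along the perception-distortion tradeoff, which is how a single model can target the challenge's different RMSE regions.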