3,181 research outputs found
Single Image Super-Resolution Using Multi-Scale Convolutional Neural Network
Methods based on convolutional neural network (CNN) have demonstrated
tremendous improvements on single image super-resolution. However, the previous
methods mainly restore images from one single area in the low resolution (LR)
input, which limits the flexibility of models to infer various scales of
details for high resolution (HR) output. Moreover, most of them train a
specific model for each up-scale factor. In this paper, we propose a
multi-scale super resolution (MSSR) network. Our network consists of
multi-scale paths to make the HR inference, which can learn to synthesize
features from different scales. This property helps reconstruct various kinds
of regions in HR images. In addition, only one single model is needed for
multiple up-scale factors, which is more efficient without loss of restoration
quality. Experiments on four public datasets demonstrate that the proposed
method achieved state-of-the-art performance with fast speed
Adaptive Density Estimation for Generative Models
Unsupervised learning of generative models has seen tremendous progress over
recent years, in particular due to generative adversarial networks (GANs),
variational autoencoders, and flow-based models. GANs have dramatically
improved sample quality, but suffer from two drawbacks: (i) they mode-drop,
i.e., do not cover the full support of the train data, and (ii) they do not
allow for likelihood evaluations on held-out data. In contrast,
likelihood-based training encourages models to cover the full support of the
train data, but yields poorer samples. These mutual shortcomings can in
principle be addressed by training generative latent variable models in a
hybrid adversarial-likelihood manner. However, we show that commonly made
parametric assumptions create a conflict between them, making successful hybrid
models non trivial. As a solution, we propose to use deep invertible
transformations in the latent variable decoder. This approach allows for
likelihood computations in image space, is more efficient than fully invertible
models, and can take full advantage of adversarial training. We show that our
model significantly improves over existing hybrid models: offering GAN-like
samples, IS and FID scores that are competitive with fully adversarial models,
and improved likelihood scores
Recovering Faces from Portraits with Auxiliary Facial Attributes
Recovering a photorealistic face from an artistic portrait is a challenging
task since crucial facial details are often distorted or completely lost in
artistic compositions. To handle this loss, we propose an Attribute-guided Face
Recovery from Portraits (AFRP) that utilizes a Face Recovery Network (FRN) and
a Discriminative Network (DN). FRN consists of an autoencoder with residual
block-embedded skip-connections and incorporates facial attribute vectors into
the feature maps of input portraits at the bottleneck of the autoencoder. DN
has multiple convolutional and fully-connected layers, and its role is to
enforce FRN to generate authentic face images with corresponding facial
attributes dictated by the input attribute vectors. %Leveraging on the spatial
transformer networks, FRN automatically compensates for misalignments of
portraits. % and generates aligned face images. For the preservation of
identities, we impose the recovered and ground-truth faces to share similar
visual features. Specifically, DN determines whether the recovered image looks
like a real face and checks if the facial attributes extracted from the
recovered image are consistent with given attributes. %Our method can recover
high-quality photorealistic faces from unaligned portraits while preserving the
identity of the face images as well as it can reconstruct a photorealistic face
image with a desired set of attributes. Our method can recover photorealistic
identity-preserving faces with desired attributes from unseen stylized
portraits, artistic paintings, and hand-drawn sketches. On large-scale
synthesized and sketch datasets, we demonstrate that our face recovery method
achieves state-of-the-art results.Comment: 2019 IEEE Winter Conference on Applications of Computer Vision (WACV
Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network
Recently, several models based on deep neural networks have achieved great success in terms of both reconstruction accuracy and computational performance for single image super-resolution. In these methods, the low resolution (LR) input image is upscaled to the high resolution (HR) space using a single filter, commonly bicubic interpolation, before reconstruction. This means that the super-resolution (SR) operation is performed in HR space. We demonstrate that this is sub-optimal and adds computational complexity. In this paper, we present the first convolutional neural network (CNN) capable of real-time SR of 1080p videos on a single K2 GPU. To achieve this, we propose a novel CNN architecture where the feature maps are extracted in the LR space. In addition, we introduce an efficient sub-pixel convolution layer which learns an array of upscaling filters to upscale the final LR feature maps into the HR output. By doing so, we effectively replace the handcrafted bicubic filter in the SR pipeline with more complex upscaling filters specifically trained for each feature map, whilst also reducing the computational complexity of the overall SR operation. We evaluate the proposed approach using images and videos from publicly available datasets and show that it performs significantly better (+0.15dB on Images and +0.39dB on Videos) and is an order of magnitude faster than previous CNN-based methods
- …