Recombinator Networks: Learning Coarse-to-Fine Feature Aggregation
Deep neural networks with alternating convolutional, max-pooling and
decimation layers are widely used in state-of-the-art architectures for
computer vision. Max-pooling purposefully discards precise spatial information
in order to create features that are more robust, and typically organized as
lower resolution spatial feature maps. On some tasks, such as whole-image
classification, max-pooling derived features are well suited; however, for
tasks requiring precise localization, such as pixel level prediction and
segmentation, max-pooling destroys exactly the information required to perform
well. Precise localization may be preserved by shallow convnets without pooling
but at the expense of robustness. Can we have our max-pooled multi-layered cake
and eat it too? Several papers have proposed summation and concatenation based
methods for combining upsampled coarse, abstract features with finer features
to produce robust pixel level predictions. Here we introduce another model ---
dubbed Recombinator Networks --- where coarse features inform finer features
early in their formation such that finer features can make use of several
layers of computation in deciding how to use coarse features. The model is
trained once, end-to-end, and performs better than summation-based
architectures, reducing the error from the previous state of the art on two
facial keypoint datasets, AFW and AFLW, by 30% and beating the current
state of the art on 300W without using extra data. We improve performance even
further by adding a denoising prediction model based on a novel convnet
formulation. Comment: accepted in CVPR 201
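The concatenation-based merge mentioned in the abstract above can be illustrated with a minimal sketch. It assumes feature maps are plain nested lists and that upsampling is nearest-neighbour; the function names `upsample_nearest` and `concat_channels` are hypothetical and are not taken from the paper. The sketch shows only the merge step; in the model described above, the merged maps would then pass through further layers of computation.

```python
def upsample_nearest(fmap, factor=2):
    """Nearest-neighbour upsampling of one 2-D feature map (a list of rows)."""
    out = []
    for row in fmap:
        # Repeat every value `factor` times along the width.
        wide = [value for value in row for _ in range(factor)]
        # Repeat the widened row `factor` times along the height.
        out.extend(list(wide) for _ in range(factor))
    return out


def concat_channels(coarse_maps, fine_maps, factor=2):
    """Upsample each coarse channel and stack it with the fine channels.

    Both arguments are lists of 2-D maps (channels). Unlike summation,
    concatenation does not require equal channel counts in the two branches.
    """
    upsampled = [upsample_nearest(m, factor) for m in coarse_maps]
    height, width = len(fine_maps[0]), len(fine_maps[0][0])
    for m in upsampled:
        if len(m) != height or len(m[0]) != width:
            raise ValueError("upsampled coarse map does not match fine map size")
    return upsampled + fine_maps


coarse = [[[1, 2], [3, 4]]]              # one 2x2 coarse channel
fine = [[[0] * 4 for _ in range(4)]]     # one 4x4 fine channel
merged = concat_channels(coarse, fine)   # two 4x4 channels
```

Subsequent layers applied to `merged` see the coarse and fine channels side by side, which is the property the abstract contrasts with summation-based merging.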
FSRNet: End-to-End Learning Face Super-Resolution with Facial Priors
Face Super-Resolution (SR) is a domain-specific super-resolution problem.
Facial prior knowledge can be leveraged to better super-resolve
face images. We present a novel deep end-to-end trainable Face Super-Resolution
Network (FSRNet), which makes full use of the geometry prior, i.e., facial
landmark heatmaps and parsing maps, to super-resolve very low-resolution (LR)
face images without requiring them to be well aligned. Specifically, we first construct
a coarse SR network to recover a coarse high-resolution (HR) image. Then, the
coarse HR image is sent to two branches: a fine SR encoder and a prior
information estimation network, which extract image features and
estimate landmark heatmaps/parsing maps, respectively. Both image features and
prior information are sent to a fine SR decoder to recover the HR image. To
further generate realistic faces, we propose the Face Super-Resolution
Generative Adversarial Network (FSRGAN) to incorporate the adversarial loss
into FSRNet. Moreover, we introduce two related tasks, face alignment and
parsing, as the new evaluation metrics for face SR, which address the
inconsistency of classic metrics w.r.t. visual perception. Extensive benchmark
experiments show that FSRNet and FSRGAN significantly outperform the state of the
art for very LR face SR, both quantitatively and qualitatively. Code will be
made available upon publication. Comment: Chen and Tai contributed equally to this paper
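The data flow described in the abstract above (coarse SR network, then two parallel branches, then a fine SR decoder) can be sketched as follows. This is only an illustration of how the stages connect, not the authors' implementation: the function `fsrnet_forward` and its parameter names are hypothetical, and each stage is passed in as an arbitrary callable.

```python
def fsrnet_forward(lr_image, coarse_sr, fine_encoder, prior_estimator, fine_decoder):
    """Wire the four stages together in the order the abstract describes.

    All stage arguments are hypothetical callables supplied by the caller.
    """
    # Stage 1: recover a coarse high-resolution image from the LR input.
    coarse_hr = coarse_sr(lr_image)
    # Stage 2: both branches receive the same coarse HR image.
    features = fine_encoder(coarse_hr)     # image features
    priors = prior_estimator(coarse_hr)    # landmark heatmaps / parsing maps
    # Stage 3: the decoder combines features and priors into the final HR image.
    hr_image = fine_decoder(features, priors)
    return hr_image, coarse_hr, priors


# Toy stand-ins that only record which stage produced which value.
hr, coarse, priors = fsrnet_forward(
    "lr",
    coarse_sr=lambda x: ("coarse", x),
    fine_encoder=lambda x: ("features", x),
    prior_estimator=lambda x: ("priors", x),
    fine_decoder=lambda f, p: ("hr", f, p),
)
```

Returning the coarse image and the priors alongside the final output is a design choice of this sketch, made so that intermediate results can be inspected.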