111 research outputs found

Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network

Author: Aitken AP
Bishop R
Caballero J
Huszár F
Rueckert D
Shi W
Totz J
Wang Z
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 02/03/2016

Recently, several models based on deep neural networks have achieved great success in terms of both reconstruction accuracy and computational performance for single image super-resolution. In these methods, the low-resolution (LR) input image is upscaled to the high-resolution (HR) space using a single filter, commonly bicubic interpolation, before reconstruction. This means that the super-resolution (SR) operation is performed in HR space. We demonstrate that this is sub-optimal and adds computational complexity. In this paper, we present the first convolutional neural network (CNN) capable of real-time SR of 1080p videos on a single K2 GPU. To achieve this, we propose a novel CNN architecture where the feature maps are extracted in the LR space. In addition, we introduce an efficient sub-pixel convolution layer which learns an array of upscaling filters to upscale the final LR feature maps into the HR output. By doing so, we effectively replace the handcrafted bicubic filter in the SR pipeline with more complex upscaling filters specifically trained for each feature map, whilst also reducing the computational complexity of the overall SR operation. We evaluate the proposed approach using images and videos from publicly available datasets and show that it performs significantly better (+0.15 dB on images and +0.39 dB on videos) and is an order of magnitude faster than previous CNN-based methods.
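The rearrangement step behind a sub-pixel convolution layer can be illustrated with a short NumPy sketch. This is not the authors' implementation: it assumes the final LR feature maps are already computed and stored as an array of shape (C·r², H, W), where r is the upscaling factor, and it assumes the channel ordering used by common "pixel shuffle" operators. The function name `pixel_shuffle` is chosen here for illustration.

```python
import numpy as np

def pixel_shuffle(features, r):
    """Rearrange (C*r*r, H, W) LR feature maps into a (C, H*r, W*r) output.

    Assumed layout: channel c*r*r + dy*r + dx of the input supplies the
    output pixel at offset (dy, dx) inside each r x r block of channel c.
    """
    c_r2, h, w = features.shape
    if c_r2 % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c = c_r2 // (r * r)
    x = features.reshape(c, r, r, h, w)   # axes: (c, dy, dx, y, x)
    x = x.transpose(0, 3, 1, 4, 2)        # axes: (c, y, dy, x, dx)
    return x.reshape(c, h * r, w * r)     # interleave offsets into HR grid

# Four 2x3 LR feature maps become one 4x6 HR map for r = 2.
features = np.arange(4 * 2 * 3).reshape(4, 2, 3)
out = pixel_shuffle(features, 2)
```

Because all convolutions before this step run on H×W maps rather than rH×rW maps, each layer touches r² fewer spatial positions, which is the source of the complexity reduction the abstract describes.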

arXiv.org e-Print Archive

Spiral - Imperial College Digital Repository

End-to-End Learning of Video Super-Resolution with Motion Compensation

Author: A Kappeler
BC Song
C Dong
C Liu
H Takeda
J Yang
SD Babacan
WT Freeman
Publication date: 03/07/2017

Learning approaches have shown great success in the task of super-resolving an image given a low-resolution input. Video super-resolution additionally aims to exploit the information from multiple images. Typically, the images are related via optical flow and consecutive image warping. In this paper, we provide an end-to-end video super-resolution network that, in contrast to previous works, includes the estimation of optical flow in the overall network architecture. We analyze the usage of optical flow for video super-resolution and find that common off-the-shelf image warping does not allow video super-resolution to benefit much from optical flow. Instead, we propose an operation for motion compensation that performs warping from low to high resolution directly. We show that with this network configuration, video super-resolution can benefit from optical flow, and we obtain state-of-the-art results on the popular test sets. We also show that the processing of whole images rather than independent patches is responsible for a large increase in accuracy. Comment: Accepted to GCPR201
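To make the idea of warping from low to high resolution directly more concrete, here is a minimal NumPy sketch. It is an assumption-laden illustration, not the operation defined in the paper: each LR pixel is displaced by its flow vector, scaled by the factor r, and splatted onto the nearest HR grid cell, with collisions averaged. The function name `warp_lr_to_hr` and the flow layout (`flow[0]` horizontal, `flow[1]` vertical, in LR pixel units) are hypothetical.

```python
import numpy as np

def warp_lr_to_hr(lr, flow, r):
    """Forward-warp an LR image (H, W) onto an HR grid (H*r, W*r).

    flow has shape (2, H, W): flow[0] is the x displacement and flow[1]
    the y displacement, both in LR pixels (an assumed convention).
    Returns the sparse HR image and a mask of the cells that were hit.
    """
    h, w = lr.shape
    hr_sum = np.zeros((h * r, w * r))
    hr_cnt = np.zeros((h * r, w * r))
    ys, xs = np.mgrid[0:h, 0:w]
    # Target HR coordinates: displaced LR position scaled by r, then rounded.
    ty = np.rint((ys + flow[1]) * r).astype(int)
    tx = np.rint((xs + flow[0]) * r).astype(int)
    valid = (ty >= 0) & (ty < h * r) & (tx >= 0) & (tx < w * r)
    # np.add.at accumulates correctly when several pixels hit one cell.
    np.add.at(hr_sum, (ty[valid], tx[valid]), lr[valid])
    np.add.at(hr_cnt, (ty[valid], tx[valid]), 1.0)
    filled = hr_cnt > 0
    hr = np.zeros_like(hr_sum)
    hr[filled] = hr_sum[filled] / hr_cnt[filled]
    return hr, filled

lr = np.array([[1.0, 2.0], [3.0, 4.0]])
flow = np.zeros((2, 2, 2))
flow[0] += 0.5  # shift every pixel half an LR pixel to the right
hr, filled = warp_lr_to_hr(lr, flow, 2)
```

With r = 2, a half-pixel LR shift lands exactly on an HR grid cell, so sub-pixel motion that would be blurred away by warping in LR space is preserved as distinct HR samples.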

arXiv.org e-Print Archive

Crossref

DeepSUM: Deep Neural Network for Super-Resolution of Unregistered Multitemporal Images

Author: Bordone Molini A.
Fracastoro G.
Magli E.
Valsesia D.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 15/01/2020

Recently, convolutional neural networks (CNNs) have been successfully applied to many remote sensing problems. However, deep learning techniques for multi-image super-resolution (SR) from multitemporal unregistered imagery have received little attention so far. This article proposes a novel CNN-based technique that exploits both spatial and temporal correlations to combine multiple images. This novel framework integrates the spatial registration task directly inside the CNN, and allows one to exploit the representation learning capabilities of the network to enhance registration accuracy. The entire SR process relies on a single CNN with three main stages: shared 2-D convolutions to extract high-dimensional features from the input images; a subnetwork proposing registration filters derived from the high-dimensional feature representations; and 3-D convolutions for slow fusion of the features from multiple images. The whole network can be trained end-to-end to recover a single high-resolution image from multiple unregistered low-resolution images. The method presented in this article is the winner of the PROBA-V SR challenge issued by the European Space Agency (ESA).

arXiv.org e-Print Archive

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

Learned Multi-View Texture Super-Resolution

Author: Cherabier Ian
Oswald Martin R.
Pollefeys Marc
Richard Audrey
Schindler Konrad
Tsiminaki Vagia
Publication date: 14/01/2020

We present a super-resolution method capable of creating a high-resolution texture map for a virtual 3D object from a set of lower-resolution images of that object. Our architecture unifies the concepts of (i) multi-view super-resolution based on the redundancy of overlapping views and (ii) single-view super-resolution based on a learned prior of high-resolution (HR) image structure. The principle of multi-view super-resolution is to invert the image formation process and recover the latent HR texture from multiple lower-resolution projections. We map that inverse problem into a block of suitably designed neural network layers, and combine it with a standard encoder-decoder network for learned single-image super-resolution. Wiring the image formation model into the network avoids having to learn perspective mapping from textures to images, and elegantly handles a varying number of input views. Experiments demonstrate that the combination of multi-view observations and learned prior yields improved texture maps. Comment: 11 pages, 5 figures, 2019 International Conference on 3D Vision (3DV

arXiv.org e-Print Archive

Crossref