Video and Image Super-Resolution via Deep Learning with Attention Mechanism
Image demosaicing, image super-resolution and video super-resolution are three important tasks in the color imaging pipeline. Demosaicing deals with the recovery of missing color information and the generation of full-resolution color images from color filter array (CFA) data such as the Bayer pattern. Image super-resolution aims at increasing the spatial resolution and enhancing important structures (e.g., edges and textures) in super-resolved images. Both spatial and temporal dependencies are important to the task of video super-resolution, which has received increasing attention in recent years. Traditional solutions to these three low-level vision tasks lack generalization capability, especially for real-world data. Recently, deep learning methods have achieved great success in vision problems including image demosaicing and image/video super-resolution. Conceptually similar to adaptation in model-based approaches, attention has seen increasing use in deep learning. As a tool to reallocate limited computational resources based on the importance of informative components, the attention mechanism, which includes channel attention, spatial attention, non-local attention, etc., has found successful applications in both high-level and low-level vision tasks. However, to the best of our knowledge, 1) most approaches have studied super-resolution and demosaicing independently; little is known about the potential benefit of formulating a joint demosaicing and super-resolution (JDSR) problem; 2) the attention mechanism has not been studied for the spectral channels of color images in the open literature; 3) current approaches for video super-resolution rely on deformable-convolution-based frame alignment and naive spatial attention mechanisms. How to exploit the attention mechanism in the spectral and temporal domains sets the stage for the research in this dissertation.
In this dissertation, we conduct a systematic study of these two issues and make the following contributions: 1) we propose a spatial color attention network (SCAN) designed to jointly exploit the spatial and spectral dependencies within color images for the single image super-resolution (SISR) problem. We present a spatial color attention module that calibrates important color information for individual color components from the output feature maps of residual groups. Experimental results show that SCAN achieves superior performance in terms of both subjective and objective quality on the NTIRE2019 dataset; 2) we propose two competing end-to-end joint optimization solutions to the JDSR problem: the Densely-Connected Squeeze-and-Excitation Residual Network (DSERN) and the Residual-Dense Squeeze-and-Excitation Network (RDSEN). Experimental results show that the enhanced design, RDSEN, significantly improves both subjective and objective performance over DSERN; 3) we propose a novel deep-learning-based framework, the Deformable Kernel Spatial Attention Network (DKSAN), to super-resolve videos with a scale factor as large as 16 (the extreme SR situation). Thanks to the newly designed Deformable Kernel Convolution Alignment (DKC Align) and Deformable Kernel Spatial Attention (DKSA) modules, DKSAN achieves better subjective and objective results than the existing state-of-the-art approach, the enhanced deformable convolutional network (EDVR).
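As a rough illustration of the channel attention that the Squeeze-and-Excitation networks named above build on, the following minimal sketch gates each channel of a feature map by a learned importance score. It shows only the generic squeeze-and-excitation pattern (global average pooling, a small bottleneck, a sigmoid gate), not the SCAN, DSERN, RDSEN or DKSA modules of the dissertation. The function names, the weight matrices `w1` and `w2`, and the toy input are assumptions made for this example.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(feature_maps, w1, w2):
    """Generic squeeze-and-excitation gating (illustrative sketch only).

    feature_maps: list of C channels, each an H x W list of lists of floats.
    w1: hidden x C weight matrix, w2: C x hidden weight matrix.
    Both weight matrices are hypothetical; biases are omitted for brevity.
    """
    # Squeeze: global average pooling reduces each channel to one scalar.
    squeezed = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        for ch in feature_maps
    ]
    # Excitation: bottleneck layer with ReLU, then a layer with sigmoid.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w2]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[g * v for v in row] for row in ch]
        for g, ch in zip(gates, feature_maps)
    ]
    return gates, scaled


# Toy input: two 2x2 channels, one bright and one empty.
features = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[0.0, 0.0], [0.0, 0.0]],
]
w1 = [[1.0, 1.0]]        # one hidden unit (assumed weights)
w2 = [[2.0], [-2.0]]     # one gate per channel (assumed weights)
gates, scaled = channel_attention(features, w1, w2)
```

In a trained network the two weight matrices are learned, so the gates come to reflect which channels carry useful information for the task; here they are fixed by hand only to make the data flow visible.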
Brain MRI Super Resolution Using 3D Deep Densely Connected Neural Networks
Magnetic resonance imaging (MRI) at high spatial resolution provides detailed
anatomical information and is often necessary for accurate quantitative
analysis. However, high spatial resolution typically comes at the expense of
longer scan time, less spatial coverage, and lower signal-to-noise ratio (SNR).
Single Image Super-Resolution (SISR), a technique that aims to restore
high-resolution (HR) details from one single low-resolution (LR) input image,
has been improved dramatically by recent breakthroughs in deep learning. In
this paper, we introduce a new neural network architecture, 3D Densely
Connected Super-Resolution Networks (DCSRN) to restore HR features of
structural brain MR images. Through experiments on a dataset with 1,113
subjects, we demonstrate that our network outperforms bicubic interpolation as
well as other deep learning methods in restoring 4x resolution-reduced images. Comment: Accepted by ISBI'1
Super-Resolution for Overhead Imagery Using DenseNets and Adversarial Learning
Recent advances in Generative Adversarial Learning allow for new modalities
of image super-resolution by learning low-to-high-resolution mappings. In this
paper we present our work using Generative Adversarial Networks (GANs) with
applications to overhead and satellite imagery. We have experimented with
several state-of-the-art architectures. We propose a GAN-based architecture
using densely connected convolutional neural networks (DenseNets) to
super-resolve overhead imagery by a factor of up to 8x. We have also
investigated the resolution limits of these networks. We report results on several
publicly available datasets, including SpaceNet data and the IARPA Multi-View
Stereo Challenge, and compare performance with other state-of-the-art
architectures. Comment: 9 pages, 9 figures, WACV 2018 submission
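The dense connectivity that DenseNet-style architectures rely on can be sketched as follows: each layer receives the concatenation of the block input and all earlier layer outputs. This is a toy illustration of the connectivity pattern only, using plain lists of numbers in place of convolutional feature maps. The helpers `dense_block` and `make_toy_layer`, and the layer's arithmetic, are hypothetical and are not taken from the paper.

```python
def dense_block(x, layers):
    """Apply layers with dense connectivity (illustrative sketch only).

    x: list of floats standing in for the input feature channels.
    layers: callables that map a list of floats to a list of new features.
    """
    features = list(x)
    for layer in layers:
        new_features = layer(features)        # layer sees every earlier feature
        features = features + new_features    # concatenate rather than replace
    return features


def make_toy_layer(growth_rate):
    """Hypothetical stand-in for a convolutional layer.

    Emits `growth_rate` new features, each the ReLU of the input mean plus
    an index offset. A real layer would use learned convolution kernels.
    """
    def layer(inputs):
        mean = sum(inputs) / len(inputs)
        return [max(0.0, mean + k) for k in range(growth_rate)]
    return layer


# Three layers, each adding two features to a two-feature input.
block = [make_toy_layer(2) for _ in range(3)]
out = dense_block([1.0, 3.0], block)
```

The output width grows linearly with depth (input width plus the number of layers times the growth rate), which is why such blocks are usually followed by a layer that compresses the channel count.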
VSSA-NET: Vertical Spatial Sequence Attention Network for Traffic Sign Detection
Although traffic sign detection has been studied for years and great progress
has been made with the rise of deep learning techniques, many
problems remain to be addressed. For complicated real-world traffic scenes,
there are two main challenges. First, traffic signs are usually small
objects, which makes them more difficult to detect than large ones. Second, it
is hard to distinguish false targets that resemble real traffic signs in
complex street scenes without context information. To handle these problems, we
propose a novel end-to-end deep learning method for traffic sign detection in
complex environments. Our contributions are as follows: 1) We propose a
multi-resolution feature fusion network architecture which exploits densely
connected deconvolution layers with skip connections, and can learn more
effective features for small objects; 2) We frame traffic sign
detection as a spatial sequence classification and regression task, and propose
a vertical spatial sequence attention (VSSA) module to gain more context
information for better detection performance. To comprehensively evaluate the
proposed method, we conduct experiments on several traffic sign datasets as well as
a general object detection dataset, and the results demonstrate the
effectiveness of our proposed method.