2,893 research outputs found

Efficient Deep Neural Network for Photo-realistic Image Super-Resolution

Author: Ahn Namhyuk
Kang Byungkon
Sohn Kyung-Ah
Publication date: 15/07/2020

Recent progress in deep learning-based models has significantly improved photo-realistic (or perceptual) single-image super-resolution. However, despite their powerful performance, many methods are difficult to apply to real-world applications because of their heavy computational requirements. To facilitate the use of a deep model under such demands, we focus on keeping the network efficient while maintaining its performance. In detail, we design an architecture that implements a cascading mechanism on a residual network to boost the performance with limited resources via multi-level feature fusion. In addition, our proposed model adopts group convolution and a recursive scheme in order to achieve extreme efficiency. We further improve the perceptual quality of the output by employing the adversarial learning paradigm and a multi-scale discriminator approach. The performance of our method is investigated through extensive internal experiments and benchmarks using various datasets. Our results show that our models outperform recent methods of similar complexity on both traditional pixel-based and perception-based tasks.
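
The efficiency gain from group convolution mentioned in this abstract can be illustrated by counting parameters. The sketch below is not taken from the paper: the function name `conv2d_params` and the channel, kernel and group sizes are hypothetical values chosen only for illustration.

```python
def conv2d_params(in_channels, out_channels, kernel_size, groups=1, bias=True):
    """Count the learnable parameters of a 2-D convolution layer."""
    if in_channels % groups or out_channels % groups:
        raise ValueError("channel counts must be divisible by groups")
    # With grouping, each output channel only sees in_channels / groups inputs.
    weights = out_channels * (in_channels // groups) * kernel_size * kernel_size
    return weights + (out_channels if bias else 0)


# Hypothetical layer sizes, not taken from the paper.
standard = conv2d_params(64, 64, 3)           # ordinary convolution
grouped = conv2d_params(64, 64, 3, groups=4)  # same layer split into 4 groups
print(standard, grouped)
```

With these sizes the weight count drops by the group factor (the bias terms are unchanged), which is the kind of saving a lightweight model can trade against accuracy.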

arXiv.org e-Print Archive

A Matrix-in-matrix Neural Network for Image Super Resolution

Author: Chu Xiangxiang
Ma Hailong
Wan Shaohua
Zhang Bo
Zhang Bo
Publication date: 19/03/2019

In recent years, deep learning methods have achieved impressive results with higher peak signal-to-noise ratio in single image super-resolution (SISR) tasks by utilizing deeper layers. However, their application is quite limited since they require high computing power. In addition, most existing methods rarely take full advantage of the intermediate features, which are helpful for restoration. To address these issues, we propose a moderate-size SISR network named matrixed channel attention network (MCAN) by constructing a matrix ensemble of multi-connected channel attention blocks (MCAB). Several models of different sizes are released to meet various practical requirements. Our extensive benchmark experiments show that the proposed models achieve better performance with far fewer multiply-adds and parameters. Our models will be made publicly available.
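
For readers unfamiliar with the peak signal-to-noise ratio cited in this abstract, a minimal sketch of the standard definition, PSNR = 10 log10(MAX^2 / MSE), follows. The function name `psnr`, the flat-list image representation and the 8-bit peak value of 255 are assumptions made for illustration, not details from the paper.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    if not reference or len(reference) != len(estimate):
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical pixel values, for illustration only.
print(psnr([52, 55, 61, 59], [50, 56, 60, 62]))
```

Higher values mean the reconstruction is closer to the reference in a pixel-wise sense.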

arXiv.org e-Print Archive

Dense xUnit Networks

Author: Kligvasser Idan
Michaeli Tomer
Publication date: 27/11/2018

Deep net architectures have constantly evolved over the past few years, leading to significant advancements in a wide array of computer vision tasks. However, besides high accuracy, many applications also require a low computational load and limited memory footprint. To date, efficiency has typically been achieved either by architectural choices at the macro level (e.g. using skip connections or pruning techniques) or modifications at the level of the individual layers (e.g. using depth-wise convolutions or channel shuffle operations). Interestingly, much less attention has been devoted to the role of the activation functions in constructing efficient nets. Recently, Kligvasser et al. showed that incorporating spatial connections within the activation functions enables a significant boost in performance in image restoration tasks, at any given budget of parameters. However, the effectiveness of their xUnit module has only been tested on simple small models, which are not characteristic of those used in high-level vision tasks. In this paper, we adopt and improve the xUnit activation, show how it can be incorporated into the DenseNet architecture, and illustrate its high effectiveness for classification and image restoration tasks alike. While the DenseNet architecture is extremely efficient to begin with, our dense xUnit net (DxNet) can typically achieve the same performance with far fewer parameters. For example, on ImageNet, our DxNet outperforms a ReLU-based DenseNet having 30% more parameters and achieves state-of-the-art results for this budget of parameters. Furthermore, in denoising and super-resolution, DxNet significantly improves upon all existing lightweight solutions, including the xUnit-based nets of Kligvasser et al.

arXiv.org e-Print Archive

Structure-Preserving Image Super-resolution via Contextualized Multi-task Learning

Author: Chen Chongyu
Lin Liang
Shi Yukai
Wang Keze
Xu Li
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 26/07/2017

Single image super resolution (SR), which refers to reconstructing a higher-resolution (HR) image from the observed low-resolution (LR) image, has received substantial attention due to its tremendous application potential. Despite the breakthroughs of recently proposed SR methods using convolutional neural networks (CNNs), their generated results usually fail to preserve structural (high-frequency) details. In this paper, regarding global boundary context and residual context as complementary information for enhancing structural details in image restoration, we develop a contextualized multi-task learning framework to address the SR problem. Specifically, our method first extracts convolutional features from the input LR image and applies one deconvolutional module to interpolate the LR feature maps in a content-adaptive way. Then, the resulting feature maps are fed into two branched sub-networks. During the neural network training, one sub-network outputs salient image boundaries and the HR image, and the other sub-network outputs the local residual map, i.e., the residual difference between the generated HR image and ground-truth image. On several standard benchmarks (i.e., Set5, Set14 and BSD200), our extensive evaluations demonstrate the effectiveness of our SR method in achieving both higher restoration quality and computational efficiency compared with several state-of-the-art SR approaches. The source code and some SR results can be found at: http://hcp.sysu.edu.cn/structure-preserving-image-super-resolution/ Comment: To appear in Transactions on Multimedia 201

arXiv.org e-Print Archive

Image Super-Resolution Using Deep Convolutional Networks

Author: Dong Chao
He Kaiming
Loy Chen Change
Tang Xiaoou
Publication date: 31/07/2015

We propose a deep learning method for single image super-resolution (SR). Our method directly learns an end-to-end mapping between the low/high-resolution images. The mapping is represented as a deep convolutional neural network (CNN) that takes the low-resolution image as the input and outputs the high-resolution one. We further show that traditional sparse-coding-based SR methods can also be viewed as a deep convolutional network. But unlike traditional methods that handle each component separately, our method jointly optimizes all layers. Our deep CNN has a lightweight structure, yet demonstrates state-of-the-art restoration quality, and achieves fast speed for practical on-line usage. We explore different network structures and parameter settings to achieve trade-offs between performance and speed. Moreover, we extend our network to cope with three color channels simultaneously, and show better overall reconstruction quality. Comment: 14 pages, 14 figures, journa

arXiv.org e-Print Archive

Feedback Network for Image Super-Resolution

Author: Jeon Gwanggil
Li Zhen
Liu Zheng
Wu Wei
Yang Jinglei
Yang Xiaomin
Publication date: 28/06/2019

Recent advances in image super-resolution (SR) have explored the power of deep learning to achieve better reconstruction performance. However, the feedback mechanism, which commonly exists in the human visual system, has not been fully exploited in existing deep learning based image SR methods. In this paper, we propose an image super-resolution feedback network (SRFBN) to refine low-level representations with high-level information. Specifically, we use hidden states in an RNN with constraints to achieve this feedback. A feedback block is designed to handle the feedback connections and to generate powerful high-level representations. The proposed SRFBN comes with a strong early reconstruction ability and can create the final high-resolution image step by step. In addition, we introduce a curriculum learning strategy to make the network well suited to more complicated tasks, where the low-resolution images are corrupted by multiple types of degradation. Extensive experimental results demonstrate the superiority of the proposed SRFBN in comparison with the state-of-the-art methods. Code is available at https://github.com/Paper99/SRFBN_CVPR19. Comment: Accepted to CVPR 201

arXiv.org e-Print Archive

EDVR: Video Restoration with Enhanced Deformable Convolutional Networks

Author: Chan Kelvin C. K.
Dong Chao
Loy Chen Change
Wang Xintao
Yu Ke
Publication date: 07/05/2019

Video restoration tasks, including super-resolution, deblurring, etc., are drawing increasing attention in the computer vision community. A challenging benchmark named REDS was released in the NTIRE19 Challenge. This new benchmark challenges existing methods in two respects: (1) how to align multiple frames given large motions, and (2) how to effectively fuse different frames with diverse motion and blur. In this work, we propose a novel Video Restoration framework with Enhanced Deformable networks, termed EDVR, to address these challenges. First, to handle large motions, we devise a Pyramid, Cascading and Deformable (PCD) alignment module, in which frame alignment is done at the feature level using deformable convolutions in a coarse-to-fine manner. Second, we propose a Temporal and Spatial Attention (TSA) fusion module, in which attention is applied both temporally and spatially, so as to emphasize important features for subsequent restoration. Thanks to these modules, our EDVR takes first place and outperforms the runner-up by a large margin in all four tracks of the NTIRE19 video restoration and enhancement challenges. EDVR also demonstrates superior performance to state-of-the-art published methods on video super-resolution and deblurring. The code is available at https://github.com/xinntao/EDVR. Comment: To appear in CVPR 2019 Workshop. The winners in all four tracks in the NTIRE 2019 video restoration and enhancement challenges. Project page: https://xinntao.github.io/projects/EDVR , Code: https://github.com/xinntao/EDV

arXiv.org e-Print Archive

Single Image Super-Resolution via Residual Neuron Attention Networks

Author: Ai Wenjie
Cheng Shilei
Tu Xiaoguang
Xie Mei
Publication date: 21/05/2020

Deep Convolutional Neural Networks (DCNNs) have achieved impressive performance in Single Image Super-Resolution (SISR). To further improve the performance, existing CNN-based methods generally focus on designing deeper network architectures. However, we argue that blindly increasing a network's depth is not the most sensible way. In this paper, we propose novel end-to-end Residual Neuron Attention Networks (RNAN) for more efficient and effective SISR. Structurally, our RNAN is a sequential integration of the well-designed Global Context-enhanced Residual Groups (GCRGs), which extract super-resolved features from coarse to fine. Our GCRG is designed with two novelties. Firstly, the Residual Neuron Attention (RNA) mechanism is proposed in each block of GCRG to reveal the relevance of neurons for better feature representation. Furthermore, the Global Context (GC) block is embedded into RNAN at the end of each GCRG for effectively modeling the global contextual information. Experimental results demonstrate that our RNAN achieves results comparable with state-of-the-art methods in terms of both quantitative metrics and visual quality, but with a simplified network architecture. Comment: 6 pages, 4 figures, Accepted by IEEE ICIP 202

arXiv.org e-Print Archive

A Group Variational Transformation Neural Network for Fractional Interpolation of Video Coding

Author: Hu Yueyu
Liu Jiaying
Ma Siwei
Xia Sifeng
Yang Wenhan
Publication date: 18/06/2018

Motion compensation is an important technology in video coding to remove the temporal redundancy between coded video frames. In motion compensation, fractional interpolation is used to obtain more reference blocks at sub-pixel level. Existing video coding standards commonly use fixed interpolation filters for fractional interpolation, which are not efficient enough to handle diverse video signals well. In this paper, we design a group variational transformation convolutional neural network (GVTCNN) to improve the fractional interpolation performance of the luma component in motion compensation. GVTCNN infers samples at different sub-pixel positions from the input integer-position sample. It first extracts a shared feature map from the integer-position sample to infer various sub-pixel position samples. Then a group variational transformation technique is used to transform a group of copied shared feature maps to samples at different sub-pixel positions. Experimental results verify the interpolation efficiency of our GVTCNN. Compared with the interpolation method of High Efficiency Video Coding, our method achieves 1.9% bit saving on average and up to 5.6% bit saving under low-delay P configuration. Comment: DCC 201
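
To make the fixed-filter baseline in this abstract concrete, here is a minimal one-dimensional sketch of half-sample interpolation with a fixed 8-tap filter. It is not the paper's GVTCNN. The taps are the half-sample luma coefficients commonly quoted for High Efficiency Video Coding and should be treated as an assumption here; the function name `half_pel_row`, the edge clamping and the single-stage 8-bit rounding are simplifications made for illustration.

```python
# Assumed 8-tap half-sample luma filter (taps sum to 64).
HALF_PEL_TAPS = (-1, 4, -11, 40, 40, -11, 4, -1)


def half_pel_row(samples):
    """Interpolate the half-sample positions between neighbours in a row."""
    n = len(samples)
    out = []
    for i in range(n - 1):
        acc = 0
        for k, tap in enumerate(HALF_PEL_TAPS):
            # The filter spans samples i-3 .. i+4; clamp indices at the edges.
            j = min(max(i - 3 + k, 0), n - 1)
            acc += tap * samples[j]
        # Round, divide by 64, and clip to the 8-bit range.
        out.append(min(max((acc + 32) >> 6, 0), 255))
    return out


# Hypothetical row of integer-position luma samples.
row = list(range(0, 160, 10))
print(half_pel_row(row))
```

Because the taps are the same for every block, the filter cannot adapt to the content of the signal, which is the limitation the abstract points to.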

arXiv.org e-Print Archive

Channel Splitting Network for Single MR Image Super-Resolution

Author: Zhang Tao
Zhang Yulun
Zhao Xiaole
Zou Xueming
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 15/09/2019

High resolution magnetic resonance (MR) imaging is desirable in many clinical applications due to its contribution to more accurate subsequent analyses and early clinical diagnoses. Single image super resolution (SISR) is an effective and cost-efficient alternative technique to improve the spatial resolution of MR images. In the past few years, SISR methods based on deep learning techniques, especially convolutional neural networks (CNNs), have achieved state-of-the-art performance on natural images. However, the information is gradually weakened and training becomes increasingly difficult as the network deepens. The problem is more serious for medical images because the lack of high-quality, effective training samples makes deep models prone to underfitting or overfitting. Moreover, many current models treat the hierarchical features on different channels equivalently, which does not help the models handle the hierarchical features in a discriminative, targeted way. To this end, we present a novel channel splitting network (CSN) to ease the representational burden of deep models. The proposed CSN model divides the hierarchical features into two branches, i.e., a residual branch and a dense branch, with different information transmissions. The residual branch is able to promote feature reuse, while the dense branch is beneficial to the exploration of new features. In addition, we adopt the merge-and-run mapping to facilitate information integration between different branches. Extensive experiments on various MR images, including proton density (PD), T1 and T2 images, show that the proposed CSN model achieves superior performance over other state-of-the-art SISR methods. Comment: 13 pages, 11 figures and 4 table

arXiv.org e-Print Archive