25 research outputs found

Iterative Network for Image Super-Resolution

Authors: Gao Wen, Liu Yuqing, Ma Siwei, Wang Shanshe, Wang Shiqi, Zhang Jian
Publication date: 20/05/2020

Single image super-resolution (SISR), a traditional ill-conditioned inverse problem, has been greatly revitalized by the recent development of convolutional neural networks (CNNs). These CNN-based methods generally map a low-resolution image to its corresponding high-resolution version with sophisticated network structures and loss functions, showing impressive performance. This paper proposes a substantially different approach that relies on iterative optimization in HR space with an iterative super-resolution network (ISRN). We first analyze the observation model of the image SR problem, which inspires a feasible solution that mimics and fuses each iteration in a more general and efficient manner. Considering the drawbacks of batch normalization, we propose a feature normalization (F-Norm) method to regulate the features in the network. Furthermore, a novel block with F-Norm, termed FNB, is developed to improve the network representation. A residual-in-residual structure is proposed to form a very deep network, which groups FNBs with a long skip connection for better information delivery and a more stable training phase. Extensive experimental results on testing benchmarks with bicubic (BI) degradation show that our ISRN can not only recover more structural information, but also achieve competitive or better PSNR/SSIM results with far fewer parameters than other works. Besides BI, we simulate real-world degradation with blur-downscale (BD) and downscale-noise (DN) models. ISRN and its extension ISRN+ both achieve better performance than others under the BD and DN degradation models. Comment: 12 pages, 14 figures
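The abstract above reports results in PSNR. As a reminder of what that metric measures, here is a minimal sketch of the standard PSNR definition in plain Python. The function name `psnr`, the nested-list image representation, and the default peak value of 255 are illustrative assumptions, not details taken from the paper, which may evaluate on a specific color channel or crop borders first.

```python
import math


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Images are given as nested lists (rows of pixel intensities).
    `peak` is the maximum possible pixel value (assumed 255 for 8-bit data).
    """
    ref_flat = [p for row in reference for p in row]
    est_flat = [p for row in estimate for p in row]
    if len(ref_flat) != len(est_flat) or not ref_flat:
        raise ValueError("images must be non-empty and the same size")

    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(ref_flat, est_flat)) / len(ref_flat)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


# Toy example: every pixel of the estimate is off by exactly 1, so MSE = 1.
hr = [[0, 255], [128, 64]]
sr = [[1, 254], [129, 63]]
print(round(psnr(hr, sr), 2))
```

A higher PSNR means the reconstructed image is closer to the reference in the mean-squared-error sense; SSIM, the other metric named in the abstract, instead compares local structure and is not covered by this sketch.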

arXiv.org e-Print Archive

Progressive Multi-Scale Residual Network for Single Image Super-Resolution

Authors: Gao Wen, Liu Yuqing, Ma Siwei, Wang Shanshe, Zhang Xinfeng
Publication date: 17/11/2020

Multi-scale convolutional neural networks (CNNs), which consider comprehensive information from different receptive fields, have achieved significant success in single image super-resolution (SISR). However, recent multi-scale networks usually build the hierarchical exploration with filters of different sizes, which leads to high computational complexity, and seldom focus on the inherent correlations among different scales. This paper converts the multi-scale exploration into a sequential manner and proposes a progressive multi-scale residual network (PMRN) for the SISR problem. Specifically, we devise a progressive multi-scale residual block (PMRB) that substitutes combinations of small filters for the larger filters and gradually explores the hierarchical information. Furthermore, a channel- and pixel-wise attention mechanism (CPA) is designed to find the inherent correlations among image features with weighting and bias factors, which concentrates more on high-frequency information. Experimental results show that the proposed PMRN recovers structural textures more effectively, with superior PSNR/SSIM results, than other small networks. The extension model PMRN+ with self-ensemble achieves competitive or better results than large networks with far fewer parameters and lower computational complexity. Comment: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.
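The idea of replacing larger filters with combinations of small ones can be illustrated with simple arithmetic. The sketch below compares the weight count and receptive field of one large convolution against a stack of 3x3 convolutions. It shows only the general principle; the helper names and the choice of 64 channels are assumptions for illustration, and the abstract does not specify how the PMRB is actually wired.

```python
def conv_params(kernel_size, channels, layers=1):
    """Weight count (biases ignored) of `layers` stacked k x k convolutions,
    each with `channels` input and `channels` output feature maps."""
    return layers * kernel_size * kernel_size * channels * channels


def stacked_receptive_field(kernel_size, layers):
    """Receptive field of `layers` stacked stride-1 k x k convolutions."""
    return 1 + layers * (kernel_size - 1)


channels = 64  # hypothetical feature width, chosen only for the example

# Two stacked 3x3 convolutions see the same 5x5 window as one 5x5 convolution.
print(stacked_receptive_field(3, 2), conv_params(5, channels), conv_params(3, channels, layers=2))

# Three stacked 3x3 convolutions see the same 7x7 window as one 7x7 convolution.
print(stacked_receptive_field(3, 3), conv_params(7, channels), conv_params(3, channels, layers=3))
```

With equal channel widths, two 3x3 layers use 18 weights per channel pair instead of 25, and three use 27 instead of 49, which is where the savings in parameters and computation come from.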

arXiv.org e-Print Archive

Perceptual Video Coding for Machines via Satisfied Machine Ratio Modeling

Authors: Gao Wen, Jia Chuanmin, Ma Siwei, Wang Shanshe, Wang Zhao, Zhang Qi, Zhang Xinfeng
Publication date: 10/09/2023

Video Coding for Machines (VCM) aims to compress visual signals for machine analysis. However, existing methods only consider a few machines, neglecting the majority. Moreover, the perceptual characteristics of machines are not effectively leveraged, leading to suboptimal compression efficiency. In this paper, we introduce Satisfied Machine Ratio (SMR) to address these issues. SMR statistically measures the quality of compressed images and videos for machines by aggregating satisfaction scores from them. Each score is calculated based on the difference in machine perceptions between original and compressed images. Targeting image classification and object detection tasks, we build two representative machine libraries for SMR annotation and construct a large-scale SMR dataset to facilitate SMR studies. We then propose an SMR prediction model based on the correlation between deep feature differences and SMR. Furthermore, we introduce an auxiliary task that increases the prediction accuracy by predicting the SMR difference between two images at different quality levels. Extensive experiments demonstrate that using the SMR models significantly improves compression performance for VCM, and that the SMR models generalize well to unseen machines, traditional and neural codecs, and datasets. In summary, SMR enables perceptual coding for machines and advances VCM from specificity to generality. Code is available at https://github.com/ywwynm/SMR
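To make the aggregation idea in the abstract concrete, here is a toy sketch of computing a satisfied-machine ratio for one compressed image. It assumes, purely for illustration, that a classifier counts as "satisfied" when its predicted label on the compressed image matches its label on the original. The abstract says only that each score is based on the difference in machine perceptions, so the actual scoring rule in the paper may differ, and the function names here are hypothetical and not taken from the linked repository.

```python
def satisfaction_score(original_pred, compressed_pred):
    """Hypothetical per-machine score: 1.0 if the machine's output on the
    compressed image matches its output on the original, else 0.0."""
    return 1.0 if original_pred == compressed_pred else 0.0


def satisfied_machine_ratio(original_preds, compressed_preds):
    """Fraction of machines in the library that are satisfied.

    Each list holds one prediction per machine, in the same machine order.
    """
    if len(original_preds) != len(compressed_preds) or not original_preds:
        raise ValueError("need one prediction pair per machine")
    scores = [
        satisfaction_score(o, c)
        for o, c in zip(original_preds, compressed_preds)
    ]
    return sum(scores) / len(scores)


# Toy library of four classifiers; one changes its label after compression.
on_original = ["cat", "cat", "dog", "cat"]
on_compressed = ["cat", "dog", "dog", "cat"]
print(satisfied_machine_ratio(on_original, on_compressed))
```

Under this assumed rule the ratio falls as compression changes more machines' outputs, which is the kind of signal an encoder could use to choose a quality level.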

arXiv.org e-Print Archive