Iterative Network for Image Super-Resolution
Single image super-resolution (SISR), as a traditional ill-conditioned
inverse problem, has been greatly revitalized by the recent development of
convolutional neural networks (CNN). These CNN-based methods generally map a
low-resolution image to its corresponding high-resolution version with
sophisticated network structures and loss functions, showing impressive
performance. This paper proposes a substantially different approach that relies on
iterative optimization in HR space with an iterative super-resolution
network (ISRN). We first analyze the observation model of the image SR problem,
which inspires a feasible solution that mimics and fuses each iteration in a more
general and efficient manner. Considering the drawbacks of batch normalization,
we propose a feature normalization (F-Norm) method to regulate the features in
the network. Furthermore, a novel block with F-Norm, termed FNB, is developed to
improve the network representation. A residual-in-residual structure is
proposed to form a very deep network, which groups FNBs with a long skip
connection for better information delivery and a more stable training phase.
Extensive experimental results on testing benchmarks with bicubic (BI)
degradation show our ISRN can not only recover more structural information, but
also achieve competitive or better PSNR/SSIM results with far fewer parameters
than other works. Besides BI, we simulate real-world degradation
with blur-downscale (BD) and downscale-noise (DN) models. ISRN and its extension ISRN+
both achieve better performance than others under the BD and DN degradation models.
Comment: 12 pages, 14 figures
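The PSNR figures referred to above follow the standard definition, PSNR = 10 log10(MAX^2 / MSE). The sketch below is a minimal illustration of that formula only; the abstract does not state the evaluation protocol (color channel, border cropping), so the 8-bit peak value and the toy pixel values are assumptions.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Images are nested lists of pixel intensities. max_value is the peak
    intensity; 255 assumes 8-bit images (an assumption, not from the paper).
    """
    ref = [p for row in reference for p in row]
    est = [p for row in estimate for p in row]
    if not ref or len(ref) != len(est):
        raise ValueError("images must be non-empty and have the same size")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(ref, est)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy 3x3 patches (hypothetical values): ground-truth HR and a reconstruction.
hr = [[52, 55, 61], [59, 79, 61], [76, 62, 59]]
sr = [[54, 55, 60], [59, 77, 61], [75, 62, 60]]
print(f"PSNR: {psnr(hr, sr):.2f} dB")
```

A higher PSNR means a smaller mean squared error against the reference; SSIM is a separate structural measure and is not covered by this sketch.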
Progressive Multi-Scale Residual Network for Single Image Super-Resolution
Multi-scale convolutional neural networks (CNNs) achieve significant success
in single image super-resolution (SISR) by considering comprehensive
information from different receptive fields. However, recent multi-scale
networks usually build the hierarchical exploration with filters of different
sizes, which leads to high computational cost, and seldom focus
on the inherent correlations among different scales. This paper converts the
multi-scale exploration into a sequential manner, and proposes a progressive
multi-scale residual network (PMRN) for the SISR problem. Specifically, we devise a
progressive multi-scale residual block (PMRB) to substitute the larger filters
with small filter combinations, and gradually explore the hierarchical
information. Furthermore, a channel- and pixel-wise attention (CPA) mechanism is
designed to find the inherent correlations among image features with
weighting and bias factors, which concentrates more on high-frequency
information. Experimental results show that the proposed PMRN recovers
structural textures more effectively with superior PSNR/SSIM results than other
small networks. The extension model, PMRN with self-ensemble, achieves
competitive or better results than large networks with far fewer parameters
and lower computational complexity.
Comment: This work has been submitted to the IEEE for possible publication.
Copyright may be transferred without notice, after which this version may no
longer be accessible.
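The idea of substituting larger filters with small filter combinations can be illustrated with simple arithmetic. The sketch below assumes stride-1 3x3 convolutions without bias and equal input and output channel counts; the abstract does not give the kernel sizes or channel width used in PMRB, so these values and the helper names are hypothetical.

```python
def stacked_receptive_field(kernel_size, num_layers):
    """Receptive field of num_layers stacked stride-1 convolutions."""
    return 1 + num_layers * (kernel_size - 1)


def conv_weights(kernel_size, channels, num_layers=1):
    """Weight count of stacked convolutions with `channels` in and out, no bias."""
    return num_layers * kernel_size * kernel_size * channels * channels


channels = 64  # hypothetical width, not taken from the paper
for large_kernel in (5, 7):
    # Number of stacked 3x3 layers needed to cover the same receptive field.
    layers = (large_kernel - 1) // 2
    assert stacked_receptive_field(3, layers) == large_kernel
    small = conv_weights(3, channels, layers)
    large = conv_weights(large_kernel, channels)
    print(f"{large_kernel}x{large_kernel}: {large} weights, "
          f"{layers} stacked 3x3: {small} weights")
```

Under these assumptions, a stack of small filters covers the same receptive field with fewer weights, and each intermediate layer exposes a smaller scale along the way, which is consistent with the sequential exploration the abstract describes.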
Perceptual Video Coding for Machines via Satisfied Machine Ratio Modeling
Video Coding for Machines (VCM) aims to compress visual signals for machine
analysis. However, existing methods only consider a few machines, neglecting
the majority. Moreover, the machine perceptual characteristics are not
effectively leveraged, leading to suboptimal compression efficiency. In this
paper, we introduce Satisfied Machine Ratio (SMR) to address these issues. SMR
statistically measures the quality of compressed images and videos for machines
by aggregating satisfaction scores from them. Each score is calculated based on
the difference in machine perceptions between original and compressed images.
Targeting image classification and object detection tasks, we build two
representative machine libraries for SMR annotation and construct a large-scale
SMR dataset to facilitate SMR studies. We then propose an SMR prediction model
based on the correlation between deep feature differences and SMR.
Furthermore, we introduce an auxiliary task to increase the prediction accuracy
by predicting the SMR difference between two images at different quality
levels. Extensive experiments demonstrate that using the SMR models
significantly improves compression performance for VCM, and the SMR models
generalize well to unseen machines, traditional and neural codecs, and
datasets. In summary, SMR enables perceptual coding for machines and advances
VCM from specificity to generality. Code is available at
https://github.com/ywwynm/SMR
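As a rough illustration of aggregating per-machine satisfaction scores into a ratio, the sketch below treats a machine as satisfied when its predicted label is unchanged by compression and reports the fraction of satisfied machines. This satisfaction criterion, the toy threshold classifiers, and all function names are assumptions made for illustration; they are not taken from the paper or its repository.

```python
def satisfaction(machine, original, compressed):
    """Return 1 if the machine predicts the same label on both images, else 0.

    Assumed criterion for a classification machine (not the paper's definition).
    """
    return 1 if machine(original) == machine(compressed) else 0


def satisfied_machine_ratio(machines, original, compressed):
    """Fraction of machines whose perception is unchanged by compression."""
    if not machines:
        raise ValueError("at least one machine is required")
    scores = [satisfaction(m, original, compressed) for m in machines]
    return sum(scores) / len(scores)


def make_threshold_classifier(threshold):
    """Toy stand-in for a machine: labels an image by its mean intensity."""
    def classify(image):
        mean = sum(image) / len(image)
        return "bright" if mean >= threshold else "dark"
    return classify


# A small hypothetical machine library and a flattened toy image pair.
machines = [make_threshold_classifier(t) for t in (90, 100, 110, 120)]
original = [118, 122, 115, 121]
compressed = [104, 108, 101, 107]
print(satisfied_machine_ratio(machines, original, compressed))
```

A ratio near 1 means most machines in the library are unaffected by the compression, while a lower ratio signals that the chosen quality level changes what many machines perceive.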