Image Super-resolution with An Enhanced Group Convolutional Neural Network
CNNs with strong learning abilities are widely used to address the
super-resolution problem. However, CNNs depend on deeper network architectures
to improve the performance of image super-resolution, which generally increases
computational cost. In this paper, we present an enhanced
super-resolution group CNN (ESRGCNN) with a shallow architecture by fully
fusing deep and wide channel features to extract more accurate low-frequency
information in terms of correlations of different channels in single image
super-resolution (SISR). In addition, a signal enhancement operation in the
ESRGCNN inherits more long-distance contextual information to resolve
long-term dependencies. An adaptive up-sampling operation is integrated into
the CNN to obtain an image super-resolution model that handles low-resolution
images of different sizes. Extensive experiments show that our ESRGCNN
surpasses the state of the art in terms of SISR performance, complexity,
execution speed, image quality evaluation, and visual effect. The code is
available at
https://github.com/hellloxiaotian/ESRGCNN
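The abstract above motivates group convolutions as a way to keep the network shallow and cheap. As a rough illustration only (not taken from the ESRGCNN code), the following sketch counts the parameters of a 2-D convolution with and without channel groups. The function name `conv_params` and the 64-channel, 3x3 layer sizes are hypothetical choices for the example.

```python
def conv_params(c_in, c_out, kernel, groups=1, bias=True):
    """Parameter count of a 2-D convolution split into `groups` channel groups.

    Each output channel only sees c_in / groups input channels, so the
    weight count shrinks by a factor of `groups`.
    """
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    weights = (c_in // groups) * c_out * kernel * kernel
    return weights + (c_out if bias else 0)


# Hypothetical layer sizes, chosen only for illustration.
standard = conv_params(64, 64, 3, groups=1)
grouped = conv_params(64, 64, 3, groups=4)
print("standard conv parameters:", standard)
print("4-group conv parameters:", grouped)
```

The weight count scales as 1/groups, which is the usual reason grouped layers are cheaper than standard ones of the same width.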
Feature-domain Adaptive Contrastive Distillation for Efficient Single Image Super-Resolution
Recent CNN-based SISR networks use numerous parameters and incur high
computational cost to achieve better performance, limiting their applicability
to resource-constrained devices such as mobile phones. As one way to make the
network efficient, Knowledge Distillation (KD), which transfers a teacher's
useful knowledge to a student, is currently being studied. More recently, KD for
SISR utilizes Feature Distillation (FD) to minimize the Euclidean distance loss
of feature maps between teacher and student networks, but it does not
sufficiently consider how to effectively and meaningfully deliver knowledge
from the teacher to improve student performance under given network capacity
constraints. In this paper, we propose a feature-domain adaptive contrastive
distillation (FACD) method for efficiently training lightweight student SISR
networks. We show the limitations of the existing FD methods using Euclidean
distance loss, and propose a feature-domain contrastive loss that makes a
student network learn richer information from the teacher's representation in
the feature domain. In addition, we propose an adaptive distillation that
selectively applies distillation depending on the conditions of the training
patches. The experimental results show that the student EDSR and RCAN networks
with the proposed FACD scheme improve not only the PSNR performance across all
benchmark datasets and scales, but also the subjective image quality
compared to the conventional FD approaches.
Comment: Under review
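To make the contrast between Euclidean feature distillation and a contrastive objective concrete, here is a minimal sketch. The abstract does not give the exact form of the FACD loss, so the contrastive term below is a generic InfoNCE-style loss on cosine similarity, used as a stand-in. The function names, the temperature value, and the toy two-dimensional features are all assumptions for illustration.

```python
import math


def l2_feature_loss(student, teacher):
    """Mean squared Euclidean distance between two flattened feature maps."""
    return sum((s - t) ** 2 for s, t in zip(student, teacher)) / len(student)


def cosine(a, b):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def contrastive_loss(student, positive, negatives, temperature=0.1):
    """Generic InfoNCE-style loss (an assumption, not the FACD formula).

    Pulls the student feature toward the positive (teacher) feature and
    pushes it away from the negative features.
    """
    pos = math.exp(cosine(student, positive) / temperature)
    neg = sum(math.exp(cosine(student, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


# Toy features: the teacher feature is the positive, one unrelated negative.
teacher = [1.0, 0.0]
negatives = [[0.0, 1.0]]
print("L2 loss, matching student:", l2_feature_loss([1.0, 0.0], teacher))
print("contrastive, matching student:",
      contrastive_loss([1.0, 0.0], teacher, negatives))
print("contrastive, student near negative:",
      contrastive_loss([0.0, 1.0], teacher, negatives))
```

Unlike the L2 term, the contrastive term depends on where the student sits relative to the negatives, not only on its distance to the teacher.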
Target-adaptive CNN-based pansharpening
We recently proposed a convolutional neural network (CNN) for remote sensing
image pansharpening obtaining a significant performance gain over the state of
the art. In this paper, we explore a number of architectural and training
variations to this baseline, achieving further performance gains with a
lightweight network which trains very fast. Leveraging this latter property,
we propose a target-adaptive usage modality which ensures very good
performance even in the presence of a mismatch w.r.t. the training set, and
even across different sensors. The proposed method, published online as an
off-the-shelf software tool, allows users to perform fast and high-quality
CNN-based pansharpening of their own target images on general-purpose hardware.
Object Detection in 20 Years: A Survey
Object detection, as one of the most fundamental and challenging problems in
computer vision, has received great attention in recent years. Its development
in the past two decades can be regarded as an epitome of computer vision
history. If we think of today's object detection as a technical aesthetic
under the power of deep learning, then turning back the clock 20 years we would
witness the wisdom of the cold weapon era. This paper extensively reviews 400+
papers of object detection in the light of its technical evolution, spanning
over a quarter-century's time (from the 1990s to 2019). A number of topics have
been covered in this paper, including the milestone detectors in history,
detection datasets, metrics, fundamental building blocks of the detection
system, speed-up techniques, and the recent state-of-the-art detection methods.
This paper also reviews some important detection applications, such as
pedestrian detection, face detection, text detection, etc., and makes an in-depth
analysis of their challenges as well as technical improvements in recent years.
Comment: This work has been submitted to the IEEE TPAMI for possible
publication
Rich Feature Distillation with Feature Affinity Module for Efficient Image Dehazing
Single-image haze removal is a long-standing hurdle for computer vision
applications. Several works have focused on transferring advances from
image classification, detection, and segmentation to the niche of image
dehazing, primarily focusing on contrastive learning and knowledge
distillation. However, these approaches prove computationally expensive,
raising concern regarding their applicability to on-the-edge use-cases. This
work introduces a simple, lightweight, and efficient framework for single-image
haze removal, exploiting rich "dark-knowledge" information from a lightweight
pre-trained super-resolution model via the notion of heterogeneous knowledge
distillation. We designed a feature affinity module to maximize the flow of
rich feature semantics from the super-resolution teacher to the student
dehazing network. In order to evaluate the efficacy of our proposed framework,
its performance as a plug-and-play setup to a baseline model is examined. Our
experiments are carried out on the RESIDE-Standard dataset to demonstrate the
robustness of our framework to the synthetic and real-world domains. The
extensive qualitative and quantitative results provided establish the
effectiveness of the framework, achieving gains of up to 15% (PSNR) while
reducing the model size by 20 times.
Comment: Preprint version. Accepted at Opti
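Several abstracts in this listing report results in PSNR. For reference, a minimal sketch of the standard PSNR definition, 10 * log10(MAX^2 / MSE), is given below, along with a helper for a relative gain in percent. Reading the reported PSNR gain as a relative percentage is an assumption, and the pixel values and the 20 dB and 23 dB figures are made up for the example.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two flattened images."""
    mse = sum((r - x) ** 2 for r, x in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def relative_gain_percent(baseline, improved):
    """Relative improvement of `improved` over `baseline`, in percent."""
    return (improved - baseline) / baseline * 100.0


# Made-up two-pixel "images" and made-up dB values, for illustration only.
print("PSNR (dB):", psnr([0, 255], [0, 254]))
print("relative gain (%):", relative_gain_percent(20.0, 23.0))
```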
Convolutional neural network based on sparse graph attention mechanism for MRI super-resolution
Magnetic resonance imaging (MRI) is a valuable clinical tool for displaying
anatomical structures and aiding in accurate diagnosis. Medical image
super-resolution (SR) reconstruction using deep learning techniques can enhance
lesion analysis and assist doctors in improving diagnostic efficiency and
accuracy. However, existing deep learning-based SR methods predominantly rely
on convolutional neural networks (CNNs), which inherently limit the expressive
capabilities of these models and therefore make it challenging to discover
potential relationships between different image features. To overcome this
limitation, we propose an A-network that utilizes multiple convolution operator
feature extraction modules (MCO) for extracting image features using multiple
convolution operators. These extracted features are passed through multiple
sets of cross-feature extraction modules (MSC) to highlight key features
through inter-channel feature interactions, enabling subsequent feature
learning. An attention-based sparse graph neural network module is incorporated
to establish relationships between pixel features, learning which adjacent
pixels have the greatest impact on determining the features to be filled. To
evaluate our model's effectiveness, we conducted experiments using different
models on data generated from multiple datasets with different degradation
factors, and the experimental results show that our method significantly
improves on the current state-of-the-art methods.
Comment: 12 pages, 6 figures