Accurate and lightweight dehazing via multi-receptive-field non-local network and novel contrastive regularization
Recently, deep learning-based methods have dominated the image dehazing domain.
Although very competitive dehazing performance has been achieved with
sophisticated models, effective solutions for extracting useful features are
still under-explored. In addition, the non-local network, which has made a
breakthrough in many vision tasks, has not been appropriately applied to image
dehazing. Thus, a multi-receptive-field non-local network (MRFNLN) consisting
of the multi-stream feature attention block (MSFAB) and cross non-local block
(CNLB) is presented in this paper. We start with extracting richer features for
dehazing. Specifically, we design a multi-stream feature extraction (MSFE)
sub-block, which contains three parallel convolutions with different receptive
fields for extracting multi-scale
features. Following MSFE, we employ an attention sub-block to make the model
adaptively focus on important channels/regions. The MSFE and attention
sub-blocks constitute our MSFAB. Then, we design a cross non-local block
(CNLB), which can capture long-range dependencies beyond the query. Instead of
sharing the input source of the query branch, the key and value branches are
enhanced by fusing more preceding features. CNLB is computation-friendly, leveraging a
spatial pyramid down-sampling (SPDS) strategy to reduce the computation and
memory consumption without sacrificing the performance. Last but not least, a
novel detail-focused contrastive regularization (DFCR) is presented by
emphasizing the low-level details and ignoring the high-level semantic
information in the representation space. Comprehensive experimental results
demonstrate that the proposed MRFNLN model outperforms recent state-of-the-art
dehazing methods with fewer than 1.5 million parameters.
Comment: submitted to IEEE TCYB for possible publication
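The multi-stream idea above can be sketched in a few lines of PyTorch. This is a minimal illustration, not the paper's MSFE: the kernel sizes (1, 3, 5) and the 1x1 fusion convolution are assumptions, since the abstract's kernel sizes were lost in extraction.

```python
import torch
import torch.nn as nn

class MSFESketch(nn.Module):
    """Three parallel convolutions with different receptive fields,
    fused by a 1x1 convolution.  Kernel sizes (1/3/5) and the fusion
    layer are illustrative assumptions, not taken from the paper."""
    def __init__(self, channels: int):
        super().__init__()
        self.streams = nn.ModuleList([
            nn.Conv2d(channels, channels, k, padding=k // 2)
            for k in (1, 3, 5)
        ])
        self.fuse = nn.Conv2d(3 * channels, channels, 1)

    def forward(self, x):
        # Each stream sees the same input at a different receptive field;
        # concatenation keeps all scales for the fusion convolution.
        multi_scale = torch.cat([s(x) for s in self.streams], dim=1)
        return self.fuse(multi_scale)
```

Padding of k // 2 keeps every stream's output at the input resolution, so the three scales can be concatenated channel-wise.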
MixDehazeNet: Mix Structure Block for Image Dehazing Network
Image dehazing is a typical task in the low-level vision field. Previous
studies verified the effectiveness of the large convolutional kernel and
attention mechanism in dehazing. However, there are two drawbacks: the
multi-scale properties of an image are readily ignored when a large
convolutional kernel is introduced, and the standard series connection of an
attention module does not sufficiently consider an uneven hazy distribution. In
this paper, we propose a novel framework named Mix Structure Image Dehazing
Network (MixDehazeNet), which solves two issues mentioned above. Specifically,
it mainly consists of two parts: the multi-scale parallel large convolution
kernel module and the enhanced parallel attention module. Compared with a
single large kernel, parallel large kernels with multi-scale are more capable
of taking partial texture into account during the dehazing phase. In addition,
an enhanced parallel attention module is developed, in which parallel
connections of attention modules perform better at removing unevenly distributed haze.
Extensive experiments on three benchmarks demonstrate the effectiveness of our
proposed methods. For example, compared with the previous state-of-the-art
methods, MixDehazeNet achieves a significant improvement (42.62 dB PSNR) on the
SOTS indoor dataset. The code is released at
https://github.com/AmeryXiong/MixDehazeNet
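One reading of "parallel connections of attention" is channel and pixel attention applied to the same input and summed, rather than stacked in series. The sketch below illustrates that reading only; MixDehazeNet's actual enhanced parallel attention module differs in detail.

```python
import torch
import torch.nn as nn

class ParallelAttentionSketch(nn.Module):
    """Channel attention and spatial (pixel) attention computed in
    parallel and summed -- an illustrative reading of 'parallel
    connections of attention', not the paper's exact module."""
    def __init__(self, channels: int):
        super().__init__()
        self.channel = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),          # global context per channel
            nn.Conv2d(channels, channels, 1),
            nn.Sigmoid(),
        )
        self.pixel = nn.Sequential(
            nn.Conv2d(channels, 1, 1),        # one weight per spatial position
            nn.Sigmoid(),
        )

    def forward(self, x):
        # Both branches see the same input; in a serial stack the second
        # attention would only see already re-weighted features.
        return x * self.channel(x) + x * self.pixel(x)
```

The per-position pixel branch is what lets such a design respond to spatially uneven haze.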
Restoring Vision in Hazy Weather with Hierarchical Contrastive Learning
Image restoration under hazy weather conditions, which is called single image
dehazing, has been of significant interest for various computer vision
applications. In recent years, deep learning-based methods have achieved
success. However, existing image dehazing methods typically neglect the
hierarchy of features in the neural network and fail to exploit their
relationships fully. To this end, we propose an effective image dehazing method
named Hierarchical Contrastive Dehazing (HCD), which is based on feature fusion
and contrastive learning strategies. HCD consists of a hierarchical dehazing
network (HDN) and a novel hierarchical contrastive loss (HCL). Specifically,
the core design in the HDN is a hierarchical interaction module, which utilizes
multi-scale activation to revise the feature responses hierarchically. To
cooperate with the training of HDN, we propose HCL which performs contrastive
learning on hierarchically paired exemplars, facilitating haze removal.
Extensive experiments on public datasets, RESIDE, HazeRD, and DENSE-HAZE,
demonstrate that HCD quantitatively outperforms the state-of-the-art methods in
terms of PSNR and SSIM, and achieves better visual quality.
Comment: 30 pages, 10 figures
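The general shape of contrastive regularization over hierarchically paired features can be sketched as below: at each level, pull restored features toward the clear-image features (positive) and away from the hazy-input features (negative). This is a simplified sketch of the idea; HCL's exact distance, weighting, and feature pairing may differ.

```python
import numpy as np

def hierarchical_contrastive_loss(anchors, positives, negatives, eps=1e-8):
    """Contrastive loss over lists of per-level feature maps.
    Each term is small when the anchor (restored) features are close to
    the positive (clear) and far from the negative (hazy) features.
    A sketch of the general idea, not the paper's exact HCL."""
    loss = 0.0
    for a, p, n in zip(anchors, positives, negatives):
        d_pos = np.abs(a - p).mean()   # L1 distance to the positive
        d_neg = np.abs(a - n).mean()   # L1 distance to the negative
        loss += d_pos / (d_neg + eps)  # ratio form: pull close, push away
    return loss
```

The ratio form means the loss cannot be minimized by merely matching the clear image: the restored features must also move away from the hazy input's features.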
Fully Point-wise Convolutional Neural Network for Modeling Statistical Regularities in Natural Images
Modeling statistical regularity plays an essential role in ill-posed image
processing problems. Recently, deep learning based methods have been presented
to implicitly learn statistical representation of pixel distributions in
natural images and leverage it as a constraint to facilitate subsequent tasks,
such as color constancy and image dehazing. However, the existing CNN
architecture is prone to variability and diversity of pixel intensity within
and between local regions, which may result in inaccurate statistical
representation. To address this problem, this paper presents a novel fully
point-wise CNN architecture for modeling statistical regularities in natural
images. Specifically, we propose to randomly shuffle the pixels in the original
images and leverage the shuffled image as input, making the CNN focus on the
statistical properties. Moreover, since the pixels in the shuffled image
are independent and identically distributed, we can replace all the large
convolution kernels in the CNN with point-wise (1×1) convolution kernels while
maintaining the representation ability. Experimental results on two
applications: color constancy and image dehazing, demonstrate the superiority
of our proposed network over the existing architectures, i.e., using
1/10 to 1/100 of the network parameters and computational cost while achieving
comparable performance.
Comment: 9 pages, 7 figures. To appear in ACM MM 201
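The core observation, that shuffling destroys spatial structure but preserves pixel statistics exactly, and that a 1×1 convolution is just the same linear map applied to every pixel independently, can be demonstrated in a few lines of NumPy. This is an illustration of the principle, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def shuffle_pixels(img, rng):
    """Randomly permute pixel positions of an (H, W, C) image: spatial
    structure is destroyed, but the intensity distribution (the
    statistical regularity to be modeled) is preserved exactly."""
    h, w, c = img.shape
    flat = img.reshape(-1, c)
    return flat[rng.permutation(h * w)].reshape(h, w, c)

def pointwise_conv(img, weight):
    """A 1x1 convolution: the same (C_in -> C_out) linear map applied at
    every pixel independently -- sufficient on shuffled inputs, where
    neighboring pixels carry no joint spatial information."""
    return img @ weight  # (H, W, C_in) @ (C_in, C_out) -> (H, W, C_out)
```

Sorting the flattened values of the original and shuffled images gives identical arrays, which is exactly why per-pixel statistics survive the shuffle.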
Rich Feature Distillation with Feature Affinity Module for Efficient Image Dehazing
Single-image haze removal is a long-standing hurdle for computer vision
applications. Several works have been focused on transferring advances from
image classification, detection, and segmentation to the niche of image
dehazing, primarily focusing on contrastive learning and knowledge
distillation. However, these approaches prove computationally expensive,
raising concern regarding their applicability to on-the-edge use-cases. This
work introduces a simple, lightweight, and efficient framework for single-image
haze removal, exploiting rich "dark-knowledge" information from a lightweight
pre-trained super-resolution model via the notion of heterogeneous knowledge
distillation. We designed a feature affinity module to maximize the flow of
rich feature semantics from the super-resolution teacher to the student
dehazing network. In order to evaluate the efficacy of our proposed framework,
its performance as a plug-and-play setup to a baseline model is examined. Our
experiments are carried out on the RESIDE-Standard dataset to demonstrate the
robustness of our framework to the synthetic and real-world domains. The
extensive qualitative and quantitative results provided establish the
effectiveness of the framework, achieving gains of up to 15% (PSNR) while
reducing the model size by 20 times.
Comment: Preprint version. Accepted at Opti
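One common way to transfer "dark knowledge" between networks whose feature maps have different channel counts is to compare position-to-position affinity (similarity) matrices instead of raw features. The sketch below shows that general recipe; it is an assumption-laden illustration, not the paper's actual feature affinity module.

```python
import numpy as np

def affinity_matrix(feat, eps=1e-8):
    """Pairwise cosine similarity between spatial positions of a
    (C, H, W) feature map, giving an (H*W, H*W) matrix."""
    c = feat.shape[0]
    f = feat.reshape(c, -1).T                              # (HW, C)
    f = f / (np.linalg.norm(f, axis=1, keepdims=True) + eps)
    return f @ f.T

def affinity_distillation_loss(teacher_feat, student_feat):
    """Match the student's affinity matrix to the teacher's.  Because
    only position-position similarities are compared, teacher and
    student may have different channel counts -- one way a heterogeneous
    (super-resolution -> dehazing) pairing can be made shape-compatible.
    A sketch of the general recipe, not the paper's exact loss."""
    diff = affinity_matrix(teacher_feat) - affinity_matrix(student_feat)
    return np.mean(diff ** 2)
```

The student is free to choose its own channel basis; only the geometry of which spatial positions look alike is distilled.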
All-in-one aerial image enhancement network for forest scenes
Drone monitoring plays an irreplaceable and significant role in forest firefighting thanks to its wide-range observation and real-time messaging. However, aerial images are often subject to various degradations before high-level visual tasks, including but not limited to smoke detection, fire classification, and regional localization, can be performed. Most recent image enhancement methods are centered on particular types of degradation, so practical applications must keep a separate model in memory for each scenario. Furthermore, such a paradigm wastes computational and storage resources on determining the type of degradation, making it difficult to meet the real-time and lightweight requirements of real-world scenarios. In this paper, we propose an All-in-one Image Enhancement Network (AIENet) that can restore various degraded images within one network. Specifically, we design a new multi-scale receptive field image enhancement block, which better reconstructs high-resolution details of target regions of different sizes. This plug-and-play module can be embedded in any learning-based model, offering flexibility and generalization in practical applications. Taking three challenging image enhancement tasks encountered in drone monitoring as examples, we conduct task-specific and all-in-one image enhancement experiments on a synthetic forest dataset. The results show that the proposed AIENet outperforms state-of-the-art image enhancement algorithms quantitatively and qualitatively. Furthermore, extra experiments on high-level vision detection also show the promising performance of our method compared with recent baselines.
GridFormer: Residual Dense Transformer with Grid Structure for Image Restoration in Adverse Weather Conditions
Image restoration in adverse weather conditions is a difficult task in
computer vision. In this paper, we propose a novel transformer-based framework
called GridFormer which serves as a backbone for image restoration under
adverse weather conditions. GridFormer is designed in a grid structure using a
residual dense transformer block, and it introduces two core designs. First, it
uses an enhanced attention mechanism in the transformer layer. The mechanism
includes stages of the sampler and compact self-attention to improve
efficiency, and a local enhancement stage to strengthen local information.
Second, we introduce a residual dense transformer block (RDTB) as the final
GridFormer layer. This design further improves the network's ability to learn
effective features from both preceding and current local features. The
GridFormer framework achieves state-of-the-art results on five diverse image
restoration tasks in adverse weather conditions, including image deraining,
dehazing, deraining & dehazing, desnowing, and multi-weather restoration. The
source code and pre-trained models will be released.
Comment: 17 pages, 12 figures
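A standard way to make self-attention cheaper, plausibly what a "sampler + compact self-attention" stage amounts to, is to subsample the keys and values before the attention product, shrinking the score matrix. The NumPy sketch below shows only that generic trick; GridFormer's actual layer differs in detail, and the strided sampler here is an assumption.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def compact_self_attention(x, stride=2):
    """Self-attention over n tokens where keys/values are subsampled
    with the given stride, so the score matrix is (n, n/stride) instead
    of (n, n).  An illustrative sketch of downsampled attention, not
    GridFormer's exact mechanism."""
    n, d = x.shape
    kv = x[::stride]                        # sampled keys/values
    scores = x @ kv.T / np.sqrt(d)          # (n, m) with m ~ n / stride
    return softmax(scores, axis=-1) @ kv    # each query attends to m tokens
```

With stride 2 the attention cost is roughly halved; queries are kept at full resolution, so the output still has one row per input token.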