
    Accurate and lightweight dehazing via multi-receptive-field non-local network and novel contrastive regularization

    Full text link
    Recently, deep learning-based methods have dominated the image dehazing domain. Although very competitive dehazing performance has been achieved with sophisticated models, effective solutions for extracting useful features are still under-explored. In addition, the non-local network, which has achieved breakthroughs in many vision tasks, has not been appropriately applied to image dehazing. Thus, a multi-receptive-field non-local network (MRFNLN), consisting of a multi-stream feature attention block (MSFAB) and a cross non-local block (CNLB), is presented in this paper. We start by extracting richer features for dehazing. Specifically, we design a multi-stream feature extraction (MSFE) sub-block, which contains three parallel convolutions with different receptive fields (i.e., 1×1, 3×3, 5×5) for extracting multi-scale features. Following MSFE, we employ an attention sub-block that lets the model adaptively focus on important channels and regions. The MSFE and attention sub-blocks constitute our MSFAB. We then design the CNLB, which captures long-range dependencies beyond the query. Instead of sharing the query branch's input source, the key and value branches are enhanced by fusing more preceding features. CNLB is computation-friendly thanks to a spatial pyramid down-sampling (SPDS) strategy that reduces computation and memory consumption without sacrificing performance. Finally, a novel detail-focused contrastive regularization (DFCR) is presented, which emphasizes low-level details and ignores high-level semantic information in the representation space. Comprehensive experimental results demonstrate that the proposed MRFNLN model outperforms recent state-of-the-art dehazing methods with fewer than 1.5 million parameters. Comment: submitted to IEEE TCYB for possible publication
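    To make the multi-stream idea concrete, here is a minimal PyTorch sketch of an MSFE-style sub-block: three parallel convolutions with 1×1, 3×3, and 5×5 receptive fields whose outputs are fused by a 1×1 convolution. The class name, fusion scheme, and residual connection are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class MSFESketch(nn.Module):
    """Illustrative multi-stream feature extraction: three parallel
    convolutions with 1x1, 3x3, and 5x5 receptive fields, fused by a
    1x1 convolution (names and wiring are hypothetical)."""
    def __init__(self, channels: int):
        super().__init__()
        self.branch1 = nn.Conv2d(channels, channels, kernel_size=1)
        self.branch3 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.branch5 = nn.Conv2d(channels, channels, kernel_size=5, padding=2)
        self.fuse = nn.Conv2d(3 * channels, channels, kernel_size=1)

    def forward(self, x):
        # Extract multi-scale features in parallel, then fuse.
        y = torch.cat([self.branch1(x), self.branch3(x), self.branch5(x)], dim=1)
        return x + self.fuse(y)  # the residual connection is an assumption

x = torch.randn(1, 64, 32, 32)
print(MSFESketch(64)(x).shape)  # torch.Size([1, 64, 32, 32])
```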

    MixDehazeNet: Mix Structure Block For Image Dehazing Network

    Full text link
    Image dehazing is a typical task in the low-level vision field. Previous studies have verified the effectiveness of large convolutional kernels and attention mechanisms in dehazing. However, there are two drawbacks: the multi-scale properties of an image are readily ignored when a single large convolutional kernel is introduced, and the standard series connection of attention modules does not sufficiently consider an uneven haze distribution. In this paper, we propose a novel framework named Mix Structure Image Dehazing Network (MixDehazeNet), which solves the two issues mentioned above. Specifically, it mainly consists of two parts: a multi-scale parallel large convolution kernel module and an enhanced parallel attention module. Compared with a single large kernel, parallel multi-scale large kernels are more capable of taking local texture into account during the dehazing phase. In addition, an enhanced parallel attention module is developed, in which parallel attention connections perform better at dehazing uneven haze distributions. Extensive experiments on three benchmarks demonstrate the effectiveness of our proposed methods. For example, compared with previous state-of-the-art methods, MixDehazeNet achieves a significant improvement (42.62 dB PSNR) on the SOTS indoor dataset. The code is released at https://github.com/AmeryXiong/MixDehazeNet
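    The multi-scale parallel large-kernel idea might be sketched as below: several depthwise convolutions with different large kernels run in parallel, and their responses are summed before a point-wise channel mix. The kernel sizes, depthwise choice, and residual form are assumptions for illustration, not MixDehazeNet's exact module.

```python
import torch
import torch.nn as nn

class ParallelLargeKernelSketch(nn.Module):
    """Hypothetical multi-scale parallel large-kernel convolution:
    depthwise convolutions of increasing kernel size run in parallel
    and are summed (kernel sizes chosen for illustration only)."""
    def __init__(self, channels: int, kernel_sizes=(5, 7, 11)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, k, padding=k // 2, groups=channels)
            for k in kernel_sizes
        )
        self.pointwise = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x):
        # Sum the multi-scale depthwise responses, then mix channels.
        y = sum(branch(x) for branch in self.branches)
        return x + self.pointwise(y)

x = torch.randn(1, 32, 64, 64)
print(ParallelLargeKernelSketch(32)(x).shape)  # torch.Size([1, 32, 64, 64])
```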

    Restoring Vision in Hazy Weather with Hierarchical Contrastive Learning

    Full text link
    Image restoration under hazy weather conditions, known as single image dehazing, has been of significant interest for various computer vision applications. In recent years, deep learning-based methods have achieved success. However, existing image dehazing methods typically neglect the hierarchy of features in the neural network and fail to exploit their relationships fully. To this end, we propose an effective image dehazing method named Hierarchical Contrastive Dehazing (HCD), which is based on feature fusion and contrastive learning strategies. HCD consists of a hierarchical dehazing network (HDN) and a novel hierarchical contrastive loss (HCL). Specifically, the core design in the HDN is a hierarchical interaction module, which utilizes multi-scale activations to revise the feature responses hierarchically. To complement the training of HDN, we propose HCL, which performs contrastive learning on hierarchically paired exemplars, facilitating haze removal. Extensive experiments on the public datasets RESIDE, HazeRD, and DENSE-HAZE demonstrate that HCD quantitatively outperforms state-of-the-art methods in terms of PSNR and SSIM and achieves better visual quality. Comment: 30 pages, 10 figures
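    One plausible reading of a hierarchical contrastive loss is sketched below: at every feature level, the restored features are pulled toward the clean positives and pushed away from the hazy negatives via an L1 ratio, with per-level weights. The exact formulation of HCL may well differ; the function name, ratio form, and weighting are assumptions.

```python
import torch
import torch.nn.functional as F

def hierarchical_contrastive_sketch(restored_feats, clean_feats, hazy_feats,
                                    weights=None):
    """Illustrative hierarchical contrastive loss over lists of feature
    maps, one per hierarchy level: minimize distance to the clean
    (positive) features relative to distance from the hazy (negative)
    features. An assumed formulation, not the paper's exact HCL."""
    weights = weights or [1.0] * len(restored_feats)
    loss = 0.0
    for w, r, p, n in zip(weights, restored_feats, clean_feats, hazy_feats):
        loss = loss + w * F.l1_loss(r, p) / (F.l1_loss(r, n) + 1e-7)
    return loss

# Toy usage with two hierarchy levels of features.
feats = lambda: [torch.randn(1, 16, 32, 32), torch.randn(1, 32, 16, 16)]
print(hierarchical_contrastive_sketch(feats(), feats(), feats()))
```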

    Fully Point-wise Convolutional Neural Network for Modeling Statistical Regularities in Natural Images

    Full text link
    Modeling statistical regularity plays an essential role in ill-posed image processing problems. Recently, deep learning-based methods have been presented that implicitly learn a statistical representation of pixel distributions in natural images and leverage it as a constraint to facilitate subsequent tasks, such as color constancy and image dehazing. However, existing CNN architectures are prone to the variability and diversity of pixel intensities within and between local regions, which may result in inaccurate statistical representations. To address this problem, this paper presents a novel fully point-wise CNN architecture for modeling statistical regularities in natural images. Specifically, we propose to randomly shuffle the pixels in the original images and use the shuffled images as input to make the CNN concentrate on statistical properties. Moreover, since the pixels in a shuffled image are independent and identically distributed, we can replace all the large convolution kernels in the CNN with point-wise (1×1) convolution kernels while maintaining the representation ability. Experimental results on two applications, color constancy and image dehazing, demonstrate the superiority of the proposed network over existing architectures, i.e., using 1/10 to 1/100 of the network parameters and computational cost while achieving comparable performance. Comment: 9 pages, 7 figures. To appear in ACM MM 2018
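    The shuffle-then-point-wise recipe is easy to sketch: randomly permute spatial positions (destroying structure while preserving the intensity distribution), then process with 1×1 convolutions only. The network depth and width below are illustrative, not the paper's configuration.

```python
import torch
import torch.nn as nn

def shuffle_pixels(x):
    """Randomly permute the spatial positions of each image so that only
    pixel-intensity statistics, not spatial structure, remain."""
    b, c, h, w = x.shape
    idx = torch.randperm(h * w, device=x.device)
    return x.flatten(2)[:, :, idx].view(b, c, h, w)

# With (approximately) i.i.d. shuffled pixels, large kernels add nothing,
# so the network can consist purely of point-wise (1x1) convolutions.
# Depth and width here are assumptions for illustration.
pointwise_net = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=1), nn.ReLU(inplace=True),
    nn.Conv2d(64, 64, kernel_size=1), nn.ReLU(inplace=True),
    nn.Conv2d(64, 3, kernel_size=1),
)

out = pointwise_net(shuffle_pixels(torch.randn(2, 3, 32, 32)))
print(out.shape)  # torch.Size([2, 3, 32, 32])
```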

    Rich Feature Distillation with Feature Affinity Module for Efficient Image Dehazing

    Full text link
    Single-image haze removal is a long-standing hurdle for computer vision applications. Several works have focused on transferring advances from image classification, detection, and segmentation to the niche of image dehazing, primarily via contrastive learning and knowledge distillation. However, these approaches prove computationally expensive, raising concerns about their applicability to on-the-edge use-cases. This work introduces a simple, lightweight, and efficient framework for single-image haze removal, exploiting rich "dark-knowledge" information from a lightweight pre-trained super-resolution model via heterogeneous knowledge distillation. We design a feature affinity module to maximize the flow of rich feature semantics from the super-resolution teacher to the student dehazing network. To evaluate the efficacy of the proposed framework, we examine its performance as a plug-and-play addition to a baseline model. Our experiments are carried out on the RESIDE-Standard dataset to demonstrate the robustness of our framework to both synthetic and real-world domains. The extensive qualitative and quantitative results establish the effectiveness of the framework, achieving gains of up to 15% (PSNR) while reducing the model size by about 20 times. Comment: Preprint version. Accepted at Optik
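    A feature affinity module is commonly realized by matching pairwise spatial affinity matrices, which conveniently sidesteps the channel mismatch between a super-resolution teacher and a dehazing student. The sketch below assumes this standard formulation rather than reproducing the paper's exact module.

```python
import torch
import torch.nn.functional as F

def feature_affinity_loss(student_feat, teacher_feat):
    """Sketch of affinity-based distillation: build pairwise spatial
    affinity matrices from student and teacher feature maps and match
    them with MSE. A common formulation, assumed here, not the paper's."""
    def affinity(f):
        b, c, h, w = f.shape
        f = F.normalize(f.flatten(2), dim=1)      # (B, C, HW), unit channel norm
        return torch.bmm(f.transpose(1, 2), f)    # (B, HW, HW) cosine affinities

    # Resize teacher features so both affinity matrices match in size.
    teacher_feat = F.interpolate(teacher_feat, size=student_feat.shape[-2:],
                                 mode='bilinear', align_corners=False)
    return F.mse_loss(affinity(student_feat), affinity(teacher_feat))

# Channel counts may differ (16 vs. 64): the HWxHW affinities still align,
# which is what makes heterogeneous teacher/student distillation possible.
loss = feature_affinity_loss(torch.randn(1, 16, 32, 32),
                             torch.randn(1, 64, 64, 64))
print(loss)
```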

    All-in-one aerial image enhancement network for forest scenes

    Get PDF
    Drone monitoring plays an irreplaceable role in forest firefighting thanks to its wide-range observation and real-time messaging. However, aerial images are often degraded before high-level visual tasks, including but not limited to smoke detection, fire classification, and regional localization, can be performed. Most recent image enhancement methods are centered on particular types of degradation, so practical deployments must keep a separate model in memory for each scenario. Such a paradigm also wastes computational and storage resources on determining the degradation type, making it difficult to meet the real-time and lightweight requirements of real-world scenarios. In this paper, we propose an All-in-one Image Enhancement Network (AIENet) that can restore various degraded images in one network. Specifically, we design a new multi-scale receptive-field image enhancement block, which can better reconstruct high-resolution details in target regions of different sizes. This plug-and-play module can be embedded in any learning-based model, giving it better flexibility and generalization in practical applications. Taking three challenging image enhancement tasks encountered in drone monitoring as examples, we conduct task-specific and all-in-one image enhancement experiments on a synthetic forest dataset. The results show that the proposed AIENet outperforms state-of-the-art image enhancement algorithms quantitatively and qualitatively. Furthermore, extra experiments on high-level vision detection also show the promising performance of our method compared with recent baselines.
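    As a rough illustration of a plug-and-play multi-scale receptive-field block, the sketch below uses parallel dilated 3×3 convolutions to cover target regions of different sizes and fuses them residually. The dilation rates and fusion scheme are assumptions, not AIENet's actual design.

```python
import torch
import torch.nn as nn

class MultiReceptiveFieldBlock(nn.Module):
    """Hypothetical plug-and-play enhancement block: parallel dilated
    convolutions provide several receptive fields so that regions of
    different sizes are reconstructed together. Rates are illustrative."""
    def __init__(self, channels, dilations=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, padding=d, dilation=d)
            for d in dilations
        )
        self.fuse = nn.Conv2d(len(dilations) * channels, channels, 1)

    def forward(self, x):
        y = torch.cat([b(x) for b in self.branches], dim=1)
        return x + self.fuse(y)

# Because the block is residual with matching channel counts, it can be
# dropped into an intermediate stage of an existing backbone unchanged.
x = torch.randn(1, 48, 64, 64)
print(MultiReceptiveFieldBlock(48)(x).shape)  # torch.Size([1, 48, 64, 64])
```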

    GridFormer: Residual Dense Transformer with Grid Structure for Image Restoration in Adverse Weather Conditions

    Full text link
    Image restoration in adverse weather conditions is a difficult task in computer vision. In this paper, we propose a novel transformer-based framework called GridFormer, which serves as a backbone for image restoration under adverse weather conditions. GridFormer is designed in a grid structure using a residual dense transformer block, and it introduces two core designs. First, it uses an enhanced attention mechanism in the transformer layer, comprising a sampler stage and compact self-attention to improve efficiency, and a local enhancement stage to strengthen local information. Second, we introduce a residual dense transformer block (RDTB) as the final GridFormer layer. This design further improves the network's ability to learn effective features from both preceding and current local features. The GridFormer framework achieves state-of-the-art results on five diverse image restoration tasks in adverse weather conditions, including image deraining, dehazing, deraining & dehazing, desnowing, and multi-weather restoration. The source code and pre-trained models will be released. Comment: 17 pages, 12 figures
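    The sampler-plus-compact-attention idea can be sketched as down-sampling keys and values before multi-head attention, with a depthwise convolution standing in for the local enhancement stage. All specifics below (the pooling sampler, head count, residual wiring) are assumptions rather than GridFormer's exact layers.

```python
import torch
import torch.nn as nn

class CompactAttentionSketch(nn.Module):
    """Illustrative efficient attention: keys/values are spatially
    down-sampled (a 'sampler' stage) before multi-head attention, and a
    depthwise convolution adds local enhancement. Assumed design."""
    def __init__(self, dim, heads=4, sample=2):
        super().__init__()
        self.sampler = nn.AvgPool2d(sample)            # shrink the K/V token set
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.local = nn.Conv2d(dim, dim, 3, padding=1, groups=dim)

    def forward(self, x):                              # x: (B, C, H, W)
        b, c, h, w = x.shape
        q = x.flatten(2).transpose(1, 2)               # (B, HW, C) queries
        kv = self.sampler(x).flatten(2).transpose(1, 2)  # fewer K/V tokens
        y, _ = self.attn(q, kv, kv)
        y = y.transpose(1, 2).view(b, c, h, w)
        return x + y + self.local(x)                   # global + local paths

x = torch.randn(1, 64, 32, 32)
print(CompactAttentionSketch(64)(x).shape)  # torch.Size([1, 64, 32, 32])
```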