Holistic Attention-Fusion Adversarial Network for Single Image Defogging
Adversarial learning-based image defogging methods have been extensively
studied in computer vision due to their remarkable performance. However, most
existing methods have limited defogging capabilities for real cases because
they are trained on the paired clear and synthesized foggy images of the same
scenes. In addition, they have limitations in preserving vivid color and rich
texture details in defogging. To address these issues, we develop a novel
generative adversarial network, called holistic attention-fusion adversarial
network (HAAN), for single image defogging. HAAN consists of a Fog2Fogfree
block and a Fogfree2Fog block. In each block, there are three learning-based
modules, namely, fog removal, color-texture recovery, and fog synthetic, that
constrain each other to generate high quality images. HAAN is designed to
exploit the self-similarity of texture and structure information by learning
the holistic channel-spatial feature correlations between the foggy image and
its several derived images. Moreover, in the fog synthetic module, we utilize
the atmospheric scattering model to guide it to improve the generative quality
by focusing on an atmospheric light optimization with a novel sky segmentation
network. Extensive experiments on both synthetic and real-world datasets show
that HAAN outperforms state-of-the-art defogging methods in terms of
quantitative accuracy and subjective visual quality.
Comment: 13 pages, 10 figure
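The abstract above refers to the atmospheric scattering model without stating it. A minimal sketch of the commonly used form, I = J * t + A * (1 - t) with transmission t = exp(-beta * depth), is given below. The function names, the scalar atmospheric light `airlight`, the scattering coefficient `beta`, and the clamp `t_min` are assumptions made for illustration; this is not HAAN's fog synthetic module, which the abstract says additionally optimizes the atmospheric light with a sky segmentation network.

```python
import numpy as np

def synthesize_fog(clear, depth, beta=1.0, airlight=0.9):
    """Render fog with I = J * t + A * (1 - t), where t = exp(-beta * depth).

    clear: H x W x 3 image in [0, 1]; depth: H x W scene depth (arbitrary units).
    beta and airlight are illustrative defaults, not values from the paper.
    """
    t = np.exp(-beta * depth)[..., None]  # transmission, broadcast over channels
    return clear * t + airlight * (1.0 - t)

def remove_fog(foggy, depth, beta=1.0, airlight=0.9, t_min=0.1):
    """Invert the same model; t is clamped to avoid dividing by values near zero."""
    t = np.maximum(np.exp(-beta * depth), t_min)[..., None]
    return (foggy - airlight) / t + airlight
```

With a known depth map and atmospheric light the inversion is exact wherever the transmission stays above `t_min`; in practice both quantities are unknown, which is why learning-based methods estimate them or bypass them.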
All-in-one aerial image enhancement network for forest scenes
Drone monitoring plays an irreplaceable role in forest firefighting thanks to its wide-range observation and real-time messaging. However, aerial images often suffer from various kinds of degradation before high-level visual tasks are performed, including but not limited to smoke detection, fire classification, and regional localization. Most recent image enhancement methods target a particular type of degradation, so in practical applications the memory unit must accommodate a different model for each scenario. Furthermore, such a paradigm wastes computational and storage resources on determining the type of degradation, making it difficult to meet the real-time and lightweight requirements of real-world scenarios. In this paper, we propose an All-in-one Image Enhancement Network (AIENet) that can restore various degraded images in one network. Specifically, we design a new multi-scale receptive field image enhancement block, which can better reconstruct high-resolution details of target regions of different sizes. This plug-and-play module can be embedded in any learning-based model, which gives it better flexibility and generalization in practical applications. This paper takes three challenging image enhancement tasks encountered in drone monitoring as examples, on which we conduct task-specific and all-in-one image enhancement experiments on a synthetic forest dataset. The results show that the proposed AIENet outperforms state-of-the-art image enhancement algorithms quantitatively and qualitatively. Furthermore, extra experiments on high-level vision detection also show the promising performance of our method compared with some recent baselines.
Award-winning. Postprint (published version
DADFNet: Dual Attention and Dual Frequency-Guided Dehazing Network for Video-Empowered Intelligent Transportation
Visual surveillance technology is an indispensable functional component of
advanced traffic management systems. It has been applied to perform traffic
supervision tasks, such as object detection, tracking and recognition. However,
adverse weather conditions, e.g., fog, haze and mist, pose severe challenges
for video-based transportation surveillance. To eliminate the influences of
adverse weather conditions, we propose a dual attention and dual
frequency-guided dehazing network (termed DADFNet) for real-time visibility
enhancement. It consists of a dual attention module (DAM) and a high-low
frequency-guided sub-net (HLFN) to jointly consider the attention and frequency
mapping to guide haze-free scene reconstruction. Extensive experiments on both
synthetic and real-world images demonstrate the superiority of DADFNet over
state-of-the-art methods in terms of visibility enhancement and improvement in
detection accuracy. Furthermore, DADFNet only takes ms to process a 1,920
× 1,080 image on the 2080 Ti GPU, making it highly efficient for deployment in
intelligent transportation systems.
Comment: This paper is accepted by AAAI 2022 Workshop: AI for Transportatio
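To make the idea of high- and low-frequency guidance concrete, the sketch below splits a grayscale image into a low-frequency part (a Gaussian blur) and a high-frequency part (the remainder). This is only an assumed, classical decomposition for illustration: the abstract does not say how the HLFN sub-net obtains its frequency maps, and `gaussian_kernel`, `split_frequencies`, and `sigma` are hypothetical names and parameters.

```python
import numpy as np

def gaussian_kernel(sigma, radius):
    """Normalized 1-D Gaussian kernel with 2 * radius + 1 taps."""
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

def split_frequencies(image, sigma=2.0):
    """Split an H x W image into low and high frequencies with low + high == image.

    The image must be larger than the kernel radius (3 * sigma) in both dimensions.
    """
    radius = int(3 * sigma)
    k = gaussian_kernel(sigma, radius)
    padded = np.pad(image, radius, mode="reflect")
    # Separable blur: filter along rows first, then along columns.
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, padded)
    low = np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, rows)
    high = image - low  # edges and fine texture
    return low, high
```

Haze mostly alters the smooth, low-frequency component (contrast and color cast), while edges and texture live in the high-frequency component, which is one plausible motivation for treating the two separately.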
DEEP LEARNING FOR IMAGE RESTORATION AND ROBOTIC VISION
A traditional model-based approach requires formulating a mathematical model, and such a model often has limited performance. The quality of an image may degrade for a variety of reasons: the scene may be affected by weather conditions such as haze, rain, and snow, or noise may be introduced during image processing or transmission (e.g., artifacts generated during compression). The goal of image restoration is to restore the image to a desirable quality, both subjectively and objectively. Agricultural robotics is gaining interest because most agricultural work is lengthy and repetitive. Computer vision is crucial to robots, especially autonomous ones. However, it is challenging to find a precise mathematical model that describes the aforementioned problems. Compared with the traditional approach, a learning-based approach has an edge since it does not require any model to describe the problem. Moreover, learning-based approaches now have best-in-class performance on most vision problems, such as image dehazing, super-resolution, and image recognition.
In this dissertation, we address the problems of image restoration and robotic vision with deep learning. These two problems are closely related from a network architecture perspective: it is essential to select an appropriate network for each problem. Specifically, we solve the problems of single image dehazing, High Efficiency Video Coding (HEVC) loop filtering and super-resolution, and computer vision for an autonomous robot. Our technical contributions are threefold. First, we propose to reformulate haze as a signal-dependent noise, which allows us to uncover it by learning a structural residual. Based on this reformulation, we solve dehazing with a recursive deep residual network and a generative adversarial network, which emphasize objective and perceptual quality, respectively. Second, we replace traditional filters in HEVC with a Convolutional Neural Network (CNN) filter. We show that our CNN filter can achieve a 7% BD-rate saving compared with traditional filters such as the bilateral and deblocking filters. We also propose to incorporate a multi-scale CNN super-resolution module into HEVC. Such a post-processing module can improve visual quality under extremely low bandwidth. Third, a transfer learning technique is implemented to support the vision and autonomous decision making of a precision pollination robot. Good experimental results are reported with real-world data.
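One way to read the "signal-dependent noise" reformulation, assuming the standard atmospheric scattering model I = J * t + A * (1 - t) (the abstract does not state which model the dissertation uses), is to rearrange it as I = J + (A - J) * (1 - t). The haze then appears as an additive residual that depends on the clear signal J itself. The sketch below computes that residual analytically; the names `haze_residual` and `dehaze_with_residual` and the scalar `airlight` are hypothetical, and a network in the dissertation's setting would have to predict the residual from the hazy image alone.

```python
import numpy as np

def haze_residual(clear, transmission, airlight=0.9):
    """Residual R = I - J = (A - J) * (1 - t) under the assumed scattering model.

    R depends on the clear image J, which is what makes the "noise" signal-dependent.
    """
    return (airlight - clear) * (1.0 - transmission)

def dehaze_with_residual(hazy, residual):
    """Recover the clear image by subtracting the (known or predicted) residual."""
    return hazy - residual
```

Subtracting the exact residual recovers the clear image; the learning problem is to estimate that residual without access to J, t, or A.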