Project RISE: Recognizing Industrial Smoke Emissions
Industrial smoke emissions pose a significant concern to human health. Prior
works have shown that using Computer Vision (CV) techniques to identify smoke
as visual evidence can influence the attitude of regulators and empower
citizens to pursue environmental justice. However, existing datasets are not of
sufficient quality nor quantity to train the robust CV models needed to support
air quality advocacy. We introduce RISE, the first large-scale video dataset
for Recognizing Industrial Smoke Emissions. We adopted a citizen science
approach to collaborate with local community members to annotate whether a
video clip has smoke emissions. Our dataset contains 12,567 clips from 19
distinct views from cameras that monitored three industrial facilities. These
daytime clips span 30 days over two years, including all four seasons. We ran
experiments using deep neural networks to establish a strong performance
baseline and reveal smoke recognition challenges. Our survey study discussed
community feedback, and our data analysis revealed opportunities for
integrating citizen scientists and crowd workers into the application of
Artificial Intelligence for social good.

Comment: Technical report
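The citizen-science annotation described above implies aggregating labels from multiple community volunteers per video clip. A minimal sketch of one common aggregation scheme (simple majority voting with an abstain on ties and on under-voted clips); the function, thresholds, and clip names are illustrative, not the paper's actual pipeline:

```python
from collections import Counter

def aggregate_votes(votes, min_votes=3):
    """Majority-vote a list of binary labels (1 = smoke, 0 = no smoke).
    Returns None when there are too few votes or no majority."""
    if len(votes) < min_votes:
        return None
    counts = Counter(votes)
    if counts[1] == counts[0]:
        return None  # tie: send the clip back for more review
    return 1 if counts[1] > counts[0] else 0

# hypothetical clip IDs with volunteer votes
labels = {
    "clip_001": [1, 1, 0, 1],  # clear majority: smoke
    "clip_002": [0, 0],        # too few votes to decide
    "clip_003": [1, 0, 1, 0],  # tie: undecided
}
decided = {k: aggregate_votes(v) for k, v in labels.items()}
```

Abstaining on ties and sparse clips trades coverage for label quality, which matters when the labels train downstream CV models.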
Recurrent 3D Pose Sequence Machines
3D human articulated pose recovery from monocular image sequences is very
challenging due to diverse appearances, viewpoints, and occlusions, and
because 3D human pose is inherently ambiguous from monocular imagery. It is
thus critical to exploit rich spatial and temporal long-range dependencies
among body joints for accurate 3D pose sequence prediction. Existing approaches
usually manually design some elaborate prior terms and human body kinematic
constraints for capturing structures, which are often insufficient to exploit
all intrinsic structures and not scalable for all scenarios. In contrast, this
paper presents a Recurrent 3D Pose Sequence Machine (RPSM) to automatically
learn the image-dependent structural constraint and sequence-dependent temporal
context by using a multi-stage sequential refinement. At each stage, our RPSM
is composed of three modules to predict the 3D pose sequences based on the
previously learned 2D pose representations and 3D poses: (i) a 2D pose module
extracting the image-dependent pose representations, (ii) a 3D pose recurrent
module regressing 3D poses, and (iii) a feature adaption module serving as a
bridge between modules (i) and (ii) to enable the representation transformation
from the 2D to the 3D domain. These three modules are then assembled into a sequential
prediction framework to refine the predicted poses with multiple recurrent
stages. Extensive evaluations on the Human3.6M dataset and HumanEva-I dataset
show that our RPSM outperforms all state-of-the-art approaches for 3D pose
estimation.

Comment: Published in CVPR 201
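The control flow of the multi-stage sequential refinement described above can be sketched with stand-in modules; this is a toy illustration of the staging and recurrence only (the real modules are deep networks), with all functions and dimensions being placeholder assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def pose_2d_module(frame_feat):
    # stand-in for module (i): image features -> 2D pose representation
    return np.tanh(frame_feat)

def feature_adaption(feat_2d):
    # stand-in for module (iii): lifts 2D pose features toward the 3D domain
    return 0.5 * feat_2d

def recurrent_3d_module(feat_3d, prev_pose, hidden):
    # stand-in for module (ii): combines adapted features, the previous
    # 3D estimate, and a hidden state carried across frames
    hidden = 0.9 * hidden + 0.1 * feat_3d
    pose = prev_pose + hidden  # residual refinement of the 3D pose
    return pose, hidden

def rpsm_sketch(frames, n_stages=3):
    T, D = frames.shape
    poses = np.zeros((T, D))
    for _ in range(n_stages):          # multi-stage refinement
        hidden = np.zeros(D)
        for t in range(T):             # temporal recurrence over the sequence
            feat = feature_adaption(pose_2d_module(frames[t]))
            poses[t], hidden = recurrent_3d_module(feat, poses[t], hidden)
    return poses

frames = rng.normal(size=(4, 6))       # 4 frames, 6-dim toy features
poses = rpsm_sketch(frames)
```

The key idea the sketch preserves is that each stage refines the previous stage's 3D estimates rather than predicting from scratch, while the hidden state propagates temporal context across frames.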
Semi-supervised wildfire smoke detection based on smoke-aware consistency
The semi-transparency of smoke causes it to blend with the background contextual information in the image, which results in great visual differences across different areas. In addition, the limited annotation of smoke images from real forest scenarios brings more challenges for model training. In this paper, we design a semi-supervised learning strategy, named smoke-aware consistency (SAC), to maintain pixel and context perceptual consistency across different backgrounds. Furthermore, we propose a smoke detection strategy with triple classification assistance for discriminating smoke from smoke-like objects. Finally, we simplify the LFNet fire-smoke detection network to LFNet-v2, since the proposed SAC and triple classification assistance can perform the functions of some of its modules. Extensive experiments validate that the proposed method significantly outperforms state-of-the-art object detection algorithms on wildfire smoke datasets and achieves satisfactory performance under challenging weather conditions.

Peer Reviewed. Postprint (published version).
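Consistency-based semi-supervision of the kind named above generally penalizes disagreement between a model's predictions on an image and on a perturbed view of it, requiring no labels. A minimal pixel-level sketch under toy assumptions (the `predict` head, noise perturbation, and weight are illustrative, not the SAC formulation):

```python
import numpy as np

rng = np.random.default_rng(42)

def predict(image, w):
    # stand-in segmentation head: per-pixel smoke probability (sigmoid)
    return 1.0 / (1.0 + np.exp(-(image * w)))

def consistency_loss(image, w, noise_scale=0.05):
    """Pixel-level consistency: predictions on an unlabeled image and on a
    perturbed view of it should agree."""
    perturbed = image + rng.normal(scale=noise_scale, size=image.shape)
    p_clean = predict(image, w)
    p_pert = predict(perturbed, w)
    return float(np.mean((p_clean - p_pert) ** 2))

unlabeled = rng.normal(size=(8, 8))   # one unlabeled toy "image"
loss = consistency_loss(unlabeled, w=1.5)
```

In training, such a loss is added to the supervised loss on the few labeled images, letting the abundant unlabeled forest imagery shape the decision boundary.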
All-in-one aerial image enhancement network for forest scenes
Drone monitoring plays an irreplaceable and significant role in forest firefighting thanks to its wide-range observation and real-time messaging. However, aerial images are often subject to various degradation problems before high-level visual tasks can be performed, including but not limited to smoke detection, fire classification, and regional localization. Most recent image enhancement methods are centered on particular types of degradation, so practical applications must keep a separate model in memory for each scenario. Furthermore, such a paradigm wastes computational and storage resources on determining the type of degradation, making it difficult to meet the real-time and lightweight requirements of real-world scenarios. In this paper, we propose an All-in-one Image Enhancement Network (AIENet) that can restore various degraded images in one network. Specifically, we design a new multi-scale receptive field image enhancement block, which can better reconstruct high-resolution details of target regions of different sizes. This plug-and-play module can be embedded in any learning-based model, giving it good flexibility and generalization in practical applications. Taking three challenging image enhancement tasks encountered in drone monitoring as examples, we conduct task-specific and all-in-one image enhancement experiments on a synthetic forest dataset. The results show that the proposed AIENet outperforms state-of-the-art image enhancement algorithms both quantitatively and qualitatively. Furthermore, extra experiments on high-level vision detection also show the promising performance of our method compared with some recent baselines.

Award-winning. Postprint (published version).
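The multi-scale receptive-field idea above can be illustrated crudely: run parallel filters with different window sizes over the same input and fuse their responses, so both small and large degraded regions are covered. This is a toy numpy sketch with mean filters and averaging fusion, not the learned convolutional block AIENet actually uses:

```python
import numpy as np

def box_filter(img, k):
    # mean filter with odd window k, zero-padded: a crude stand-in for a
    # convolution with a k x k receptive field
    pad = k // 2
    padded = np.pad(img, pad)
    out = np.zeros_like(img, dtype=float)
    H, W = img.shape
    for i in range(H):
        for j in range(W):
            out[i, j] = padded[i:i + k, j:j + k].mean()
    return out

def multi_scale_block(img, scales=(1, 3, 5)):
    """Fuse responses from several receptive-field sizes: larger windows
    cover bigger degraded regions, smaller ones keep fine detail."""
    responses = [box_filter(img, k) for k in scales]
    return np.mean(responses, axis=0)   # toy fusion by averaging

img = np.eye(6)                         # toy 6x6 "aerial image"
enhanced = multi_scale_block(img)
```

In a learned block, the fixed mean filters become trainable convolutions and the averaging becomes a learned fusion, which is what makes the module plug-and-play.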
Registration-Free Hybrid Learning Empowers Simple Multimodal Imaging System for High-quality Fusion Detection
Multimodal fusion detection always places high demands on the imaging system
and image pre-processing, while either a high-quality pre-registration system
or image registration processing is costly. Unfortunately, existing fusion
methods are designed for registered source images and cannot achieve
satisfactory performance when fusing inhomogeneous features, i.e., pairs of
features at the same spatial location that express different semantic
information. We therefore propose IA-VFDnet,
a CNN-Transformer hybrid learning framework with a unified high-quality
multimodal feature matching module (AKM) and a fusion module (WDAF), in which
AKM and WDAF work in synergy to perform high-quality infrared-aware visible
fusion detection, applicable to smoke and wildfire detection.
Furthermore, experiments on the M3FD dataset validate the superiority of the
proposed method, with IA-VFDnet achieving better detection performance than
other state-of-the-art methods under conventional registered conditions. In
addition, the first unregistered multimodal smoke and wildfire detection
benchmark is made openly available in this letter.
Fire detection using deep learning methods
Fire detection is an important task in the field of safety and emergency prevention. In recent years, deep learning methods have shown high efficiency in solving various computer vision problems, including detecting objects in images. This paper considers monitoring wildfires with deep learning methods, which allows responders to react to fires quickly and prevent their spread. For the experiment, satellite images and images from the FireWatch sensor were taken as initial data. The deep learning algorithms you only look once (YOLO), convolutional neural network (CNN), and fast recurrent neural network (FastRNN) were considered, making it possible to compare their accuracy on natural fires. As a result of the experiments, an automated fire recognition algorithm using the YOLOv4 deep learning method was created. It is expected that the results of the study will show that deep learning methods can be successfully applied to detect fire in images. This may lead to the development of automated monitoring systems capable of quickly and reliably detecting fire situations, which will help improve safety and reduce the risk of fires.
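YOLO-style detectors like the YOLOv4 pipeline mentioned above emit many overlapping candidate boxes per fire region and rely on non-maximum suppression (NMS) to keep one detection each. A self-contained sketch of greedy NMS (the boxes, scores, and threshold are toy values, not data from the study):

```python
import numpy as np

def iou(a, b):
    # boxes as [x1, y1, x2, y2]
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, thresh=0.5):
    """Greedy non-maximum suppression: keep the highest-scoring box,
    drop remaining boxes that overlap it above `thresh`, repeat."""
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        order = np.array([j for j in rest if iou(boxes[i], boxes[j]) < thresh])
    return keep

# two near-duplicate fire candidates plus one distinct detection
boxes = np.array([[0, 0, 10, 10], [1, 1, 10, 10], [20, 20, 30, 30]], float)
scores = np.array([0.9, 0.8, 0.7])
kept = nms(boxes, scores)   # the overlapping lower-scored box is suppressed
```

Tuning the IoU threshold trades duplicate suppression against missing adjacent fires, which matters for wide-area satellite imagery.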