Asynchronous Interaction Aggregation for Action Detection
Understanding interaction is an essential part of video action detection. We
propose the Asynchronous Interaction Aggregation network (AIA) that leverages
different interactions to boost action detection. It has two key designs:
one is the Interaction Aggregation structure (IA), which adopts a uniform
paradigm to model and integrate multiple types of interaction; the other is the
Asynchronous Memory Update algorithm (AMU) that enables us to achieve better
performance by modeling very long-term interaction dynamically without huge
computation cost. We provide empirical evidence to show that our network can
gain notable accuracy from the integrative interactions and is easy to train
end-to-end. Our method reports new state-of-the-art performance on the AVA
dataset, with a 3.7 mAP gain (12.6% relative improvement) on the validation
split compared to our strong baseline. Results on the UCF101-24 and
EPIC-Kitchens datasets further illustrate the effectiveness of our approach.
Source code will be made public at: https://github.com/MVIG-SJTU/AlphAction
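The two figures quoted in the abstract, a 3.7 mAP absolute gain and a 12.6% relative improvement, together imply an approximate baseline score. The sketch below only works through that arithmetic. The function name `implied_baseline` is hypothetical, and the derived values are estimates limited by the rounding of the quoted figures, not numbers reported in the abstract.

```python
def implied_baseline(abs_gain: float, rel_gain: float) -> float:
    """Return the baseline score implied by an absolute and a relative gain.

    relative gain = absolute gain / baseline, so baseline = absolute / relative.
    """
    return abs_gain / rel_gain


abs_gain = 3.7    # mAP points, as quoted in the abstract
rel_gain = 0.126  # 12.6% relative improvement, as quoted in the abstract

baseline = implied_baseline(abs_gain, rel_gain)  # estimate only
improved = baseline + abs_gain                   # estimate only

print(f"implied baseline: about {baseline:.1f} mAP")
print(f"implied result:   about {improved:.1f} mAP")
```

Because both inputs are rounded, the implied values should be read as approximate.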
Project RISE: Recognizing Industrial Smoke Emissions
Industrial smoke emissions pose a significant concern to human health. Prior
works have shown that using Computer Vision (CV) techniques to identify smoke
as visual evidence can influence the attitude of regulators and empower
citizens to pursue environmental justice. However, existing datasets offer
neither the quality nor the quantity required to train the robust CV models needed to support
air quality advocacy. We introduce RISE, the first large-scale video dataset
for Recognizing Industrial Smoke Emissions. We adopted a citizen science
approach to collaborate with local community members to annotate whether a
video clip has smoke emissions. Our dataset contains 12,567 clips from 19
distinct views from cameras that monitored three industrial facilities. These
daytime clips span 30 days over two years, including all four seasons. We ran
experiments using deep neural networks to establish a strong performance
baseline and reveal smoke recognition challenges. Our survey study discussed
community feedback, and our data analysis displayed opportunities for
integrating citizen scientists and crowd workers into the application of
Artificial Intelligence for social good.
Comment: Technical repor