Abnormal Event Detection in Videos using Spatiotemporal Autoencoder
We present an efficient method for detecting anomalies in videos. Recent
applications of convolutional neural networks have shown the promise of
convolutional layers for object detection and recognition, especially in
images. However, convolutional neural networks are trained with supervision
and require labels as learning signals. We propose a spatiotemporal architecture for
anomaly detection in videos including crowded scenes. Our architecture includes
two main components, one for spatial feature representation, and one for
learning the temporal evolution of the spatial features. Experimental results
on Avenue, Subway and UCSD benchmarks confirm that the detection accuracy of
our method is comparable to state-of-the-art methods at a considerable speed of
up to 140 fps.
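The abstract does not spell out how frame-level reconstruction errors become detections. As a minimal illustration of the standard post-processing used by reconstruction-based detectors of this kind (function names and the threshold are hypothetical, not from the paper), the per-frame error can be mapped to a regularity score and thresholded:

```python
import numpy as np

def regularity_scores(errors):
    """Map per-frame reconstruction errors e(t) to regularity scores in [0, 1]:
    s(t) = 1 - (e(t) - min_e) / (max_e - min_e). Low scores mark likely anomalies."""
    errors = np.asarray(errors, dtype=float)
    e_min, e_max = errors.min(), errors.max()
    return 1.0 - (errors - e_min) / (e_max - e_min)

def detect_anomalies(errors, threshold=0.5):
    """Flag frames whose regularity score falls below the (assumed) threshold."""
    return regularity_scores(errors) < threshold

# Toy example: frames 3-4 reconstruct poorly (high error), so they are flagged.
errors = [0.10, 0.12, 0.11, 0.90, 0.85, 0.13]
print(detect_anomalies(errors).tolist())
# → [False, False, False, True, True, False]
```

Because an autoencoder trained only on normal clips reconstructs normal motion well, frames with unusually high reconstruction error stand out under this scoring.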
Anomaly Detection in Aerial Videos with Transformers
Unmanned aerial vehicles (UAVs) are widely applied for purposes of
inspection, search, and rescue operations by virtue of their low-cost,
large-coverage, real-time, and high-resolution data acquisition capabilities.
Massive volumes of aerial videos are produced in these processes, in which
normal events often account for an overwhelming proportion. It is extremely
difficult to localize and extract abnormal events containing potentially
valuable information from long video streams manually. Therefore, we are
dedicated to developing anomaly detection methods to solve this issue. In this
paper, we create a new dataset, named Drone-Anomaly, for anomaly detection in
aerial videos. This dataset provides 37 training video sequences and 22 testing
video sequences from 7 different realistic scenes with various anomalous
events. There are 87,488 color video frames (51,635 for training and 35,853 for
testing) with the size of at 30 frames per second. Based on
this dataset, we evaluate existing methods and offer a benchmark for this task.
Furthermore, we present a new baseline model, ANomaly Detection with
Transformers (ANDT), which treats consecutive video frames as a sequence of
tubelets, utilizes a Transformer encoder to learn feature representations from
the sequence, and leverages a decoder to predict the next frame. Our network
models normality in the training phase and identifies an event with
unpredictable temporal dynamics as an anomaly in the test phase. Moreover, to
comprehensively evaluate the performance of our proposed method, we use not
only our Drone-Anomaly dataset but also another dataset. We make our
dataset and code publicly available. A demo video is available at
https://youtu.be/ancczYryOBY.
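The abstract says ANDT treats consecutive frames as a sequence of tubelets before the Transformer encoder. A hypothetical NumPy sketch of that tokenization step (the tubelet size t×p×p and function name are assumptions for illustration, not taken from the paper) could look like:

```python
import numpy as np

def to_tubelets(clip, t=2, p=4):
    """Split a video clip of shape (T, H, W, C) into non-overlapping
    spatiotemporal tubelets of shape (t, p, p, C), each flattened into
    one token vector for a Transformer encoder."""
    T, H, W, C = clip.shape
    assert T % t == 0 and H % p == 0 and W % p == 0, "clip must tile evenly"
    x = clip.reshape(T // t, t, H // p, p, W // p, p, C)
    x = x.transpose(0, 2, 4, 1, 3, 5, 6)   # group axes per tubelet
    return x.reshape(-1, t * p * p * C)    # one row per tubelet token

# Toy clip: 4 frames of 8x8 RGB -> (4/2)*(8/4)*(8/4) = 8 tokens of dim 2*4*4*3 = 96.
clip = np.zeros((4, 8, 8, 3))
print(to_tubelets(clip).shape)
# → (8, 96)
```

Each token thus carries a small space-time volume, which lets the encoder attend jointly over spatial appearance and short-range temporal dynamics before the decoder predicts the next frame.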