A Survey of Deep Learning Solutions for Anomaly Detection in Surveillance Videos
Deep learning has proven to be a landmark computing approach in the computer vision domain. Hence, it has been widely applied to complex cognitive tasks such as the detection of anomalies in surveillance videos. Anomaly detection in this case is the identification of abnormal events in surveillance videos that can be deemed security incidents or threats. Deep learning solutions for anomaly detection have outperformed traditional machine learning solutions. This review attempts to provide holistic benchmarking of the deep learning solutions for video anomaly detection published since 2016. The paper identifies the learning technique, the datasets used, and the overall model accuracy. Reviewed papers were organised into five deep learning methods, namely autoencoders, continual learning, transfer learning, reinforcement learning, and ensemble learning. Current and emerging trends are discussed as well.
Towards Visually Explaining Variational Autoencoders
Recent advances in Convolutional Neural Network (CNN) model interpretability
have led to impressive progress in visualizing and understanding model
predictions. In particular, gradient-based visual attention methods have driven
much recent effort in using visual attention maps as a means for visual
explanations. A key problem, however, is that these methods are designed for
classification and categorization tasks, and their extension to explaining
generative models, e.g., variational autoencoders (VAEs), is not trivial. In this
work, we take a step towards bridging this crucial gap, proposing the first
technique to visually explain VAEs by means of gradient-based attention. We
present methods to generate visual attention from the learned latent space, and
also demonstrate such attention explanations serve more than just explaining
VAE predictions. We show how these attention maps can be used to localize
anomalies in images, demonstrating state-of-the-art performance on the MVTec-AD
dataset. We also show how they can be infused into model training, helping
bootstrap the VAE into learning improved latent space disentanglement,
demonstrated on the dSprites dataset.
Reducing Redundancy in the Bottleneck Representation of the Autoencoders
Autoencoders are a type of unsupervised neural network, which can be used to
solve various tasks, e.g., dimensionality reduction, image compression, and
image denoising. An AE has two goals: (i) compress the original input to a
low-dimensional space at the bottleneck of the network topology using an
encoder, (ii) reconstruct the input from the representation at the bottleneck
using a decoder. Both encoder and decoder are optimized jointly by minimizing a
distortion-based loss which implicitly forces the model to keep only those
variations of input data that are required to reconstruct the input and to reduce
redundancies. In this paper, we propose a scheme to explicitly penalize feature
redundancies in the bottleneck representation. To this end, we propose an
additional loss term, based on the pair-wise correlation of the neurons, which
complements the standard reconstruction loss forcing the encoder to learn a
more diverse and richer representation of the input. We tested our approach
across different tasks: dimensionality reduction using three different datasets,
image compression using the MNIST dataset, and image denoising using Fashion
MNIST. The experimental results show that the proposed loss leads consistently
to superior performance compared to the standard AE loss.
Comment: 6 pages, 4 figures. The paper is under consideration at Pattern
Recognition Letters
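The idea of adding a pair-wise correlation penalty to the reconstruction loss can be illustrated with a minimal NumPy sketch. The abstract does not give the exact formula, so the function names (correlation_penalty, total_loss), the squared off-diagonal form of the penalty, and the weight value are assumptions made for illustration, not the authors' definition.

```python
import numpy as np

def correlation_penalty(z, eps=1e-8):
    """Sum of squared pair-wise correlations between bottleneck units.

    z has shape (batch, features). The squared off-diagonal form is an
    assumption; the abstract only says the term is based on pair-wise
    correlation of the neurons.
    """
    zc = z - z.mean(axis=0, keepdims=True)          # center each unit
    std = np.sqrt((zc ** 2).mean(axis=0)) + eps     # per-unit standard deviation
    zn = zc / std                                   # standardized activations
    corr = zn.T @ zn / z.shape[0]                   # (features, features) correlations
    off_diag = corr - np.diag(np.diag(corr))        # ignore self-correlation
    return float((off_diag ** 2).sum())

def total_loss(x, x_hat, z, weight=0.1):
    """Reconstruction MSE plus the weighted redundancy penalty.

    `weight` is a hypothetical hyperparameter, not a value from the paper.
    """
    recon = float(((x - x_hat) ** 2).mean())
    return recon + weight * correlation_penalty(z)
```

With this form, two bottleneck units that carry identical activations contribute the maximum penalty, while uncorrelated units contribute nothing, which matches the stated goal of discouraging redundant features.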
Pattern Anomaly Detection based on Sequence-to-Sequence Regularity Learning
Anomaly detection in traffic surveillance videos is a challenging task due to the ambiguity of the definition of an anomaly and the complexity of scenes. In this paper, we propose to detect anomalous trajectories for vehicle behavior analysis by learning regularities in data. First, we train a sequence-to-sequence model under the autoencoder architecture and propose a new reconstruction error function for model optimization and anomaly evaluation. As such, the model is forced to learn the regular trajectory patterns in an unsupervised manner. Then, at the inference stage, we use the learned model to encode the test trajectory sample into a compact representation and generate a new trajectory sequence in the learned regular pattern. An anomaly score is computed based on the deviation of the generated trajectory from the test sample. Finally, we identify the anomalous trajectories with an adaptive threshold. We evaluate the proposed method on two real-world traffic datasets, and the experiments show favorable results against state-of-the-art algorithms. This paper's research on sequence-to-sequence regularity learning can provide theoretical and practical support for pattern anomaly detection.
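The scoring and thresholding steps described above can be sketched as follows. The abstract specifies neither the reconstruction error function nor the form of the adaptive threshold, so the mean Euclidean deviation and the mean-plus-k-standard-deviations rule used here are stand-in assumptions, and all function names are hypothetical.

```python
import numpy as np

def anomaly_scores(trajectories, reconstructions):
    """Mean point-wise Euclidean deviation per trajectory.

    Both arrays have shape (n_trajectories, n_steps, 2). This is an assumed
    stand-in for the paper's reconstruction error function.
    """
    deviations = np.linalg.norm(trajectories - reconstructions, axis=-1)
    return deviations.mean(axis=1)

def adaptive_threshold(reference_scores, k=3.0):
    """Threshold derived from scores on regular data (assumed rule: mean + k * std)."""
    return float(reference_scores.mean() + k * reference_scores.std())

def flag_anomalies(scores, threshold):
    """Boolean mask that is True where a trajectory's score exceeds the threshold."""
    return scores > threshold
```

In use, the reconstructions would come from the trained sequence-to-sequence model, and the reference scores from trajectories known or assumed to be regular.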
Self-Supervised Video Representation Learning with Space-Time Cubic Puzzles
Self-supervised tasks such as colorization, inpainting, and jigsaw puzzles have
been utilized for visual representation learning for still images when the
number of labeled images is limited or absent altogether. Recently, this worthwhile
stream of study has extended to the video domain, where the cost of human labeling is
even higher. However, most existing methods are still based on
2D CNN architectures that cannot directly capture spatio-temporal information
for video applications. In this paper, we introduce a new self-supervised task
called Space-Time Cubic Puzzles to train 3D CNNs using a large-scale
video dataset. This task requires a network to arrange permuted 3D
spatio-temporal crops. By completing Space-Time Cubic Puzzles, the
network learns both spatial appearance and temporal relation of video frames,
which is our final goal. In experiments, we demonstrate that our learned 3D
representation is well transferred to action recognition tasks, and outperforms
state-of-the-art 2D CNN-based competitors on the UCF101 and HMDB51 datasets.
Comment: Accepted to AAAI 201
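The pretext task of arranging permuted 3D crops can be illustrated with a simplified data-preparation sketch. It is not the authors' sampling scheme: here a clip is simply cut into four spatial quadrants spanning all frames, the quadrants are shuffled, and the index of the permutation serves as the classification label. The names PERMS and make_puzzle are hypothetical.

```python
import itertools
import numpy as np

# All 24 orderings of four crops; the label is an index into this list.
PERMS = list(itertools.permutations(range(4)))

def make_puzzle(video, rng):
    """Build one puzzle sample from a clip of shape (T, H, W, C).

    Simplifying assumption: the four crops are the spatial quadrants of the
    clip, each spanning every frame. Returns the shuffled crops with shape
    (4, T, H // 2, W // 2, C) and the permutation label to predict.
    """
    _, h, w, _ = video.shape
    hh, hw = h // 2, w // 2
    crops = [video[:, i * hh:(i + 1) * hh, j * hw:(j + 1) * hw]
             for i in range(2) for j in range(2)]
    label = int(rng.integers(len(PERMS)))
    shuffled = np.stack([crops[k] for k in PERMS[label]])
    return shuffled, label
```

A 3D CNN trained on such samples would take the shuffled crops as input and predict the label, so it must use both appearance and spatio-temporal context to recover the original arrangement.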