7,566 research outputs found
Abnormal Event Detection in Videos using Spatiotemporal Autoencoder
We present an efficient method for detecting anomalies in videos. Recent
applications of convolutional neural networks have shown promises of
convolutional layers for object detection and recognition, especially in
images. However, convolutional neural networks are supervised and require
labels as learning signals. We propose a spatiotemporal architecture for
anomaly detection in videos including crowded scenes. Our architecture includes
two main components, one for spatial feature representation, and one for
learning the temporal evolution of the spatial features. Experimental results
on Avenue, Subway and UCSD benchmarks confirm that the detection accuracy of
our method is comparable to state-of-the-art methods at a considerable speed of
up to 140 fps
Concurrence-Aware Long Short-Term Sub-Memories for Person-Person Action Recognition
Recently, Long Short-Term Memory (LSTM) has become a popular choice to model
individual dynamics for single-person action recognition due to its ability of
modeling the temporal information in various ranges of dynamic contexts.
However, existing RNN models only focus on capturing the temporal dynamics of
the person-person interactions by naively combining the activity dynamics of
individuals or modeling them as a whole. This neglects the inter-related
dynamics of how person-person interactions change over time. To this end, we
propose a novel Concurrence-Aware Long Short-Term Sub-Memories (Co-LSTSM) to
model the long-term inter-related dynamics between two interacting people on
the bounding boxes covering people. Specifically, for each frame, two
sub-memory units store individual motion information, while a concurrent LSTM
unit selectively integrates and stores inter-related motion information between
interacting people from these two sub-memory units via a new co-memory cell.
Experimental results on the BIT and UT datasets show the superiority of
Co-LSTSM compared with the state-of-the-art methods
An original framework for understanding human actions and body language by using deep neural networks
The evolution of both fields of Computer Vision (CV) and Artificial Neural Networks (ANNs) has allowed the development of efficient automatic systems for the analysis of people's behaviour.
By studying hand movements it is possible to recognize gestures, often used by people to communicate information in a non-verbal way.
These gestures can also be used to control or interact with devices without physically touching them. In particular, sign language and semaphoric hand gestures are the two foremost areas of interest due to their importance in Human-Human Communication (HHC) and Human-Computer Interaction (HCI), respectively.
While the processing of body movements play a key role in the action recognition and affective computing fields. The former is essential to understand how people act in an environment, while the latter tries to interpret people's emotions based on their poses and movements;
both are essential tasks in many computer vision applications, including event recognition, and video surveillance.
In this Ph.D. thesis, an original framework for understanding Actions and body language is presented. The framework is composed of three main modules: in the first one, a Long Short Term Memory Recurrent Neural Networks (LSTM-RNNs) based method for the Recognition of Sign Language and Semaphoric Hand Gestures is proposed; the second module presents a solution based on 2D skeleton and two-branch stacked LSTM-RNNs for action recognition in video sequences; finally, in the last module, a solution for basic non-acted emotion recognition by using 3D skeleton and Deep Neural Networks (DNNs) is provided.
The performances of RNN-LSTMs are explored in depth, due to their ability to model the long term contextual information of temporal sequences, making them suitable for analysing body movements.
All the modules were tested by using challenging datasets, well known in the state of the art, showing remarkable results compared to the current literature methods
- …