Learning to Detect Violent Videos using Convolutional Long Short-Term Memory

Lanz, Oswald; Sudhakaran, Swathikiran

Authors: Oswald Lanz, Swathikiran Sudhakaran
Publication date: 19 September 2017

Abstract

Automatically analyzing surveillance videos to detect violence is of broad interest. In this work, we propose a deep neural network for recognizing violent videos. A convolutional neural network extracts frame-level features from a video. These features are then aggregated by a variant of the long short-term memory that uses convolutional gates. Together, the convolutional neural network and the convolutional long short-term memory capture localized spatio-temporal features, which enables the analysis of local motion taking place in the video. We also propose using adjacent frame differences as the input to the model, thereby forcing it to encode the changes occurring in the video. The proposed feature extraction pipeline is evaluated on three standard benchmark datasets in terms of recognition accuracy. Comparison with state-of-the-art techniques shows the promising capability of the proposed method in recognizing violent videos.

Comment: Accepted at the International Conference on Advanced Video and Signal Based Surveillance (AVSS 2017).
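The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the CNN feature extractor is omitted and frame differences are fed straight into the recurrent cell. The cell uses a common ConvLSTM formulation without peephole terms, which may differ from the paper's exact variant. All function names, kernel sizes, and channel counts (`frame_differences`, `conv2d_same`, `convlstm_step`, `encode_video`, 3x3 kernels, 2 hidden channels) are assumptions chosen for illustration.

```python
import numpy as np

def frame_differences(frames):
    """Adjacent frame differences: (T, H, W, C) -> (T-1, H, W, C).
    Casting to float first avoids wrap-around on uint8 frames."""
    frames = np.asarray(frames, dtype=np.float64)
    return frames[1:] - frames[:-1]

def conv2d_same(x, w):
    """Zero-padded 'same' 2D cross-correlation.
    x: (H, W, Cin), w: (k, k, Cin, Cout) with odd k -> (H, W, Cout)."""
    k = w.shape[0]
    p = k // 2
    height, width, _ = x.shape
    xp = np.pad(x, ((p, p), (p, p), (0, 0)))
    out = np.zeros((height, width, w.shape[3]))
    for i in range(k):
        for j in range(k):
            # Each kernel tap multiplies a shifted window of the padded input.
            out += xp[i:i + height, j:j + width, :] @ w[i, j]
    return out

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def convlstm_step(x, h, c, w_x, w_h, b):
    """One ConvLSTM step: the gates come from convolutions rather than
    the matrix products of a standard LSTM, so h and c stay spatial maps.
    w_x: (k, k, Cin, 4*Ch), w_h: (k, k, Ch, 4*Ch), b: (4*Ch,)."""
    gates = conv2d_same(x, w_x) + conv2d_same(h, w_h) + b
    i, f, o, g = np.split(gates, 4, axis=-1)
    c_new = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
    h_new = sigmoid(o) * np.tanh(c_new)
    return h_new, c_new

def encode_video(frames, w_x, w_h, b):
    """Run the ConvLSTM over frame differences and return the final hidden map."""
    diffs = frame_differences(frames)
    height, width = diffs.shape[1:3]
    hidden_channels = w_h.shape[2]
    h = np.zeros((height, width, hidden_channels))
    c = np.zeros((height, width, hidden_channels))
    for x in diffs:
        h, c = convlstm_step(x, h, c, w_x, w_h, b)
    return h

# Hypothetical toy input: 5 RGB frames of 8x8 pixels with values in [0, 1).
rng = np.random.default_rng(0)
frames = rng.random((5, 8, 8, 3))
w_x = rng.normal(scale=0.1, size=(3, 3, 3, 4 * 2))
w_h = rng.normal(scale=0.1, size=(3, 3, 2, 4 * 2))
b = np.zeros(4 * 2)
state = encode_video(frames, w_x, w_h, b)
print(state.shape)  # (8, 8, 2)
```

Because the hidden state keeps its spatial layout, each location in `state` summarizes the motion observed around that position. With zero biases, as here, a completely static video produces all-zero differences and therefore an all-zero state. In a full model the final state would be passed to a classifier, which this sketch leaves out.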

Available Versions

Crossref

info:doi/10.1109%2Favss.2017.8...

Last time updated on 03/01/2020

Archivio della ricerca - Fondazione Bruno Kessler

oai:cris.fbk.eu:11582/310261

Last time updated on 03/09/2019