Learning to Detect Violent Videos using Convolutional Long Short-Term Memory
Developing a technique for the automatic analysis of surveillance videos in
order to identify the presence of violence is of broad interest. In this work,
we propose a deep neural network for the purpose of recognizing violent videos.
A convolutional neural network is used to extract frame-level features from a
video. The frame-level features are then aggregated using a variant of the long
short-term memory (LSTM) that uses convolutional gates. The convolutional
neural network together with the convolutional LSTM can capture localized
spatio-temporal features, which enables the analysis of local motion taking
place in the video. We also propose using adjacent frame differences as the
input to the model, thereby forcing it to encode the changes occurring in the
video. The performance of the proposed feature extraction
pipeline is evaluated on three standard benchmark datasets in terms of
recognition accuracy. Comparison with state-of-the-art techniques reveals the
promising capability of the proposed method in recognizing violent videos.
Comment: Accepted at the International Conference on Advanced Video and Signal
Based Surveillance (AVSS 2017).
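The pipeline described above — frame differences fed through convolutional LSTM gates so the recurrent state keeps a spatial layout — can be sketched minimally in numpy. This is a single-channel toy, not the paper's architecture: the weights are random, the 8×8 frames and 3×3 kernels are illustrative, and no CNN feature extractor precedes the cell.

```python
import numpy as np

def conv2d(x, k):
    """'Same' 2-D cross-correlation with zero padding (single channel)."""
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.zeros_like(x)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * k)
    return out

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def convlstm_step(x, h, c, W):
    """One ConvLSTM step: the usual LSTM gates, but each matrix product is
    replaced by a convolution, so the hidden state stays spatial and can
    track localized motion."""
    i = sigmoid(conv2d(x, W["xi"]) + conv2d(h, W["hi"]))
    f = sigmoid(conv2d(x, W["xf"]) + conv2d(h, W["hf"]))
    o = sigmoid(conv2d(x, W["xo"]) + conv2d(h, W["ho"]))
    g = np.tanh(conv2d(x, W["xg"]) + conv2d(h, W["hg"]))
    c = f * c + i * g
    h = o * np.tanh(c)
    return h, c

# Adjacent-frame differences as input, as the abstract proposes.
rng = np.random.default_rng(0)
frames = rng.random((5, 8, 8))           # 5 toy grayscale frames
diffs = frames[1:] - frames[:-1]         # encode only the changes
W = {k: rng.standard_normal((3, 3)) * 0.1
     for k in ["xi", "hi", "xf", "hf", "xo", "ho", "xg", "hg"]}
h = np.zeros((8, 8))
c = np.zeros((8, 8))
for d in diffs:
    h, c = convlstm_step(d, h, c, W)
print(h.shape)   # aggregated spatio-temporal feature map
```

In a real system the input to each step would be a multi-channel CNN feature map rather than a raw difference image, and the final hidden state would feed a classifier.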
Methods and systems for extracting venous pulsation and respiratory information from photoplethysmographs
A system and method for separating a venous component and an arterial component from the red and infrared signals of a PPG sensor is provided. The method uses the second-order statistics of the venous and arterial signals to separate them. After reliable separation of the venous and arterial component signals, the component signals can be used for different purposes. In a preferred embodiment, the respiratory signal, pattern, and rate are extracted from the separated venous component, and a reliable "ratio of ratios" is extracted for SpO2 using only the arterial component of the PPG signals. The disclosed embodiments enable real-time continuous monitoring of respiration pattern/rate and site-independent arterial oxygen saturation. Board of Regents, University of Texas System
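The patent text does not spell out the algorithm, but separating two mixed channels using only second-order statistics can be sketched with an AMUSE-style method: whiten the mixtures, then diagonalise a time-lagged covariance matrix. The synthetic signals, mixing gains, frequencies, and lag below are invented for illustration and are not from the patent.

```python
import numpy as np

def amuse(x, lag=1):
    """Second-order blind source separation (AMUSE-style):
    whiten the mixtures, then eigendecompose a lagged covariance."""
    x = x - x.mean(axis=1, keepdims=True)
    c0 = x @ x.T / x.shape[1]                 # zero-lag covariance
    d, e = np.linalg.eigh(c0)
    wht = e @ np.diag(d ** -0.5) @ e.T        # whitening matrix
    z = wht @ x
    c1 = z[:, lag:] @ z[:, :-lag].T / (z.shape[1] - lag)
    c1 = (c1 + c1.T) / 2                      # symmetrised lagged covariance
    _, u = np.linalg.eigh(c1)
    return u.T @ z                            # recovered component signals

# Toy red/IR PPG: a fast 'arterial' pulse and a slow 'venous'/respiratory
# wave, mixed with different gains into the two optical channels.
t = np.arange(0, 30, 0.01)
arterial = np.sin(2 * np.pi * 1.2 * t)        # ~72 bpm cardiac component
venous = np.sin(2 * np.pi * 0.25 * t)         # ~15 breaths/min component
red = 0.9 * arterial + 0.4 * venous
ir = 0.5 * arterial + 0.8 * venous
sources = amuse(np.vstack([red, ir]), lag=25)
# Each recovered row should correlate strongly with one true component.
corr = np.abs(np.corrcoef(np.vstack([sources, arterial, venous]))[:2, 2:])
print(corr.round(2))
```

Because the two components have different autocorrelations at the chosen lag, the lagged-covariance eigenvectors recover them up to sign and ordering; the respiratory rate could then be read off the venous component and the "ratio of ratios" computed from the arterial one.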
Splicing of concurrent upper-body motion spaces with locomotion
In this paper, we present a motion splicing technique for generating concurrent upper-body actions that occur simultaneously with the evolution of a lower-body locomotion sequence. Specifically, we show that a layered interpolation motion model generates upper-body poses while assigning a different action to each upper-body part. Hence, the proposed motion splicing approach makes it possible to increase both the number of generated motions and the number of desired actions that virtual characters can perform. Additionally, we propose an iterative motion blending solution, inverse pseudo-blending, to maintain smooth and natural interaction between the virtual character and the virtual environment; inverse pseudo-blending is a constraint-based motion editing technique that blends the motions enclosed in a tetrahedron by minimising the distances between the end-effector positions of the actual and blended motions. To evaluate the proposed solution, we implemented an example-based application for interactive motion splicing driven by specified constraints. Finally, the generated results show that the proposed solution can be beneficially applied to interactive applications where concurrent upper-body actions are desired.
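The core of blending motions enclosed in a tetrahedron can be illustrated with barycentric blend weights: given four example end-effector positions (the tetrahedron vertices), solve for weights summing to one so the blended position matches a target, driving the end-effector error to zero. This is only the innermost step; the paper's inverse pseudo-blending is iterative and constraint-based, and the vertex and target coordinates below are made up.

```python
import numpy as np

def blend_weights(vertices, target):
    """Barycentric blend weights inside a tetrahedron: find w with
    sum(w) == 1 so the weighted combination of the four example
    end-effector positions reproduces the target position."""
    v = np.asarray(vertices, dtype=float)      # 4 x 3 end-effector positions
    t = np.asarray(target, dtype=float)
    # Solve [v0-v3 | v1-v3 | v2-v3] @ w[:3] = t - v3, then w3 = 1 - sum(w[:3]).
    A = (v[:3] - v[3]).T
    w3 = np.linalg.solve(A, t - v[3])
    return np.append(w3, 1.0 - w3.sum())

verts = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]   # hypothetical examples
target = (0.2, 0.3, 0.1)                               # desired end-effector
w = blend_weights(verts, target)
blended = w @ np.asarray(verts, float)
print(w, blended)
```

Applying the same weights to the full example poses (not just the end-effector positions) yields the blended upper-body pose; an iterative scheme would then re-solve as constraints change.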
Indirect Match Highlights Detection with Deep Convolutional Neural Networks
Highlights in a sport video are usually understood as actions that stimulate
excitement or attract the attention of the audience. Considerable effort is
spent designing techniques that find highlights automatically, in order to
automate the otherwise manual editing process. Most state-of-the-art
approaches try to solve the problem by training a classifier on information
extracted from the TV-style framing of players on the game pitch, learning to
detect game actions labeled by human observers according to their perception
of a highlight. Obviously, this is slow and expensive work. In this paper, we
reverse the paradigm: instead of looking at the gameplay and inferring what
could be exciting for the audience, we directly analyze the audience behavior,
which we assume is triggered by events happening during the game. We apply a
deep 3D Convolutional Neural Network (3D-CNN) to extract visual features from
cropped video recordings of the supporters attending the event. Outputs of the
crops belonging to the same frame are then accumulated to produce a value
indicating the Highlight Likelihood (HL), which is then used to discriminate
between positive samples (i.e., when a highlight occurs) and negative samples
(i.e., standard play or time-outs). Experimental results on a public dataset
of ice-hockey matches demonstrate the effectiveness of our method and promote
further research in this new exciting direction. Comment: "Social Signal
Processing and Beyond" workshop, in conjunction with ICIAP 2017.
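The per-frame decision stage described above can be sketched as follows. The abstract says crop outputs are "accumulated" without specifying how; the mean used here, the threshold value, and the example scores are all assumptions for illustration.

```python
import numpy as np

def highlight_likelihood(crop_scores, threshold=0.5):
    """Accumulate per-crop scores of one frame into a single Highlight
    Likelihood (HL), then threshold it into a positive/negative decision.
    Mean accumulation and the 0.5 threshold are assumptions, not the
    paper's stated choices."""
    hl = float(np.mean(crop_scores))
    return hl, hl >= threshold

# Hypothetical per-crop outputs (e.g. sigmoid scores of a 3D-CNN head)
# for two frames of supporter footage.
calm_frame = [0.1, 0.2, 0.15, 0.05]       # standard play
loud_frame = [0.9, 0.8, 0.95, 0.7]        # crowd reacting to a goal
hl0, pos0 = highlight_likelihood(calm_frame)
hl1, pos1 = highlight_likelihood(loud_frame)
print(hl0, pos0, hl1, pos1)
```

In practice the crop scores would come from the 3D-CNN applied to spatio-temporal crops of the same frame window, and the threshold would be tuned on validation data.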