Log-Euclidean Bag of Words for Human Action Recognition

Author: Bhatia R.
Conrad Sanderson
Lazebnik S.
Masoud Faraki
Maziar Palhang
Wong Y.
Publication venue: 'Institution of Engineering and Technology (IET)'
Publication date: 01/01/2015

Representing videos by densely extracted local space-time features has recently become a popular approach for analysing actions. In this paper, we tackle the problem of categorising human actions by devising Bag of Words (BoW) models based on covariance matrices of spatio-temporal features, with the features formed from histograms of optical flow. Since covariance matrices form a special type of Riemannian manifold, the space of Symmetric Positive Definite (SPD) matrices, non-Euclidean geometry should be taken into account when discriminating between covariance matrices. To this end, we propose to embed SPD manifolds into Euclidean spaces via a diffeomorphism and extend the BoW approach to its Riemannian version. The proposed BoW approach takes into account the manifold geometry of SPD matrices during the generation of the codebook and histograms. Experiments on challenging human action datasets show that the proposed method obtains notable improvements in discrimination accuracy in comparison to several state-of-the-art methods.
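
The embedding step described in the abstract can be illustrated with a minimal sketch. It assumes the diffeomorphism is the matrix logarithm (as the "Log-Euclidean" in the title suggests), computed here by eigendecomposition, and it assumes a codebook is already available, for example from k-means on the mapped vectors. The function names `log_euclidean_vector` and `bow_histogram` are hypothetical and are not taken from the paper.

```python
import numpy as np


def log_euclidean_vector(spd):
    """Map an SPD matrix to a Euclidean vector via the matrix logarithm.

    Assumption: the log is taken through an eigendecomposition, which is
    valid because an SPD matrix has strictly positive eigenvalues.
    """
    eigvals, eigvecs = np.linalg.eigh(spd)
    # V * diag(log(lambda)) * V^T
    log_m = (eigvecs * np.log(eigvals)) @ eigvecs.T
    # Keep the upper triangle only; weight off-diagonal entries by sqrt(2)
    # so the vector norm equals the Frobenius norm of the log matrix.
    rows, cols = np.triu_indices(spd.shape[0])
    weights = np.where(rows == cols, 1.0, np.sqrt(2.0))
    return weights * log_m[rows, cols]


def bow_histogram(vectors, codebook):
    """Hard-assign each vector to its nearest codeword and count.

    vectors:  array of shape (n, k), one mapped descriptor per row.
    codebook: array of shape (m, k), assumed to be given.
    Returns a histogram of length m that sums to 1.
    """
    dists = np.linalg.norm(vectors[:, None, :] - codebook[None, :, :], axis=2)
    nearest = dists.argmin(axis=1)
    hist = np.bincount(nearest, minlength=len(codebook)).astype(float)
    return hist / hist.sum()
```

Once the covariance descriptors are mapped this way, ordinary Euclidean tools (distances, clustering, histograms) can be applied to the vectors. The paper's actual codebook construction and encoding may differ from this hard-assignment sketch.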

arXiv.org e-Print Archive

Crossref

Directory of Open Access Journals

Queensland University of Technology ePrints Archive

University of Queensland eSpace

LCrowdV: Generating Labeled Videos for Simulation-based Crowd Behavior Learning

Author: A Chan
A. Bruderlin
B Solmaz
B Ulicny
B Zhou
D Helbing
F Lamarche
F Zhu
G Antonini
G Le Bon
Hans J. Eysenck
J Barraquand
J James
J van den Berg
J Xu
K Zhang
KK Reddy
L Pervin
Mehdi Moussaïd
R Geraerts
S Ali
S Curtis
T Li
X Song
X Wang
Y Tsuduki
Publication date: 04/07/2016

We present a novel procedural framework to generate an arbitrary number of labeled crowd videos (LCrowdV). The resulting crowd video datasets are used to design accurate algorithms or training models for crowded scene understanding. Our overall approach is composed of two components: a procedural simulation framework for generating crowd movements and behaviors, and a procedural rendering framework to generate different videos or images. Each video or image is automatically labeled based on the environment, number of pedestrians, density, behavior, flow, lighting conditions, viewpoint, noise, etc. Furthermore, we can increase the realism by combining synthetically generated behaviors with real-world background videos. We demonstrate the benefits of LCrowdV over prior labeled crowd datasets by improving the accuracy of pedestrian detection and crowd behavior classification algorithms. LCrowdV will be released on the WWW.

arXiv.org e-Print Archive

Crossref

Geodesic Distance Histogram Feature for Video Segmentation

Author: A Kundu
EH Taralova
F Galasso
P Krähenbühl
T Brox
T Leung
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 31/03/2017

This paper proposes a geodesic-distance-based feature that encodes global information for improved video segmentation algorithms. The feature is a joint histogram of intensity and geodesic distances, where the geodesic distances are computed as the shortest paths between superpixels via their boundaries. We also incorporate adaptive voting weights and spatial pyramid configurations to include spatial information in the geodesic histogram feature and show that this further improves results. The feature is generic and can be used as part of various algorithms. In experiments, we test the geodesic histogram feature by incorporating it into two existing video segmentation frameworks. This leads to significantly better performance in 3D video segmentation benchmarks on two datasets.
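
The core of the feature described above, a joint histogram of intensity and geodesic distance, can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: superpixels are assumed to be nodes of a graph whose edge costs stand in for boundary strength, shortest paths are computed with Dijkstra's algorithm, intensities are assumed to lie in [0, 1], and the adaptive voting weights and spatial pyramid are omitted. The names `geodesic_distances` and `geodesic_histogram` and all parameters are hypothetical.

```python
import heapq

import numpy as np


def geodesic_distances(adjacency, source):
    """Shortest-path cost from `source` to every superpixel (Dijkstra).

    adjacency: dict mapping node -> list of (neighbour, cost) pairs, where
    the non-negative cost is an assumed stand-in for boundary strength.
    """
    dist = {node: float("inf") for node in adjacency}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale heap entry
        for neighbour, cost in adjacency[node]:
            candidate = d + cost
            if candidate < dist[neighbour]:
                dist[neighbour] = candidate
                heapq.heappush(heap, (candidate, neighbour))
    return dist


def geodesic_histogram(adjacency, intensity, source,
                       n_int_bins=4, n_dist_bins=4, max_dist=1.0):
    """Joint (intensity, geodesic distance) histogram for one superpixel.

    intensity: dict mapping node -> mean intensity in [0, 1].
    Distances above max_dist (including unreachable nodes) are clipped
    into the last distance bin.
    """
    dist = geodesic_distances(adjacency, source)
    nodes = sorted(adjacency)
    ints = np.array([intensity[n] for n in nodes])
    dists = np.array([min(dist[n], max_dist) for n in nodes])
    hist, _, _ = np.histogram2d(
        ints, dists,
        bins=(n_int_bins, n_dist_bins),
        range=((0.0, 1.0), (0.0, max_dist)),
    )
    return hist / hist.sum()
```

Each superpixel gets its own histogram by using it as `source`, so the descriptor summarises how intensities are distributed at increasing geodesic distance from that superpixel.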

arXiv.org e-Print Archive

Crossref