5 research outputs found

Action Recognition by Hierarchical Mid-level Action Elements

Author: Lan Tian
Savarese Silvio
Zamir Amir Roshan
Zhu Yuke
Publication date: 30/08/2015

Realistic videos of human actions exhibit rich spatiotemporal structures at multiple levels of granularity: an action can always be decomposed into multiple finer-grained elements in both space and time. To capture this intuition, we propose to represent videos by a hierarchy of mid-level action elements (MAEs), where each MAE corresponds to an action-related spatiotemporal segment in the video. We introduce an unsupervised method to generate this representation from videos. Our method is capable of distinguishing action-related segments from background segments and representing actions at multiple spatiotemporal resolutions. Given a set of spatiotemporal segments generated from the training data, we introduce a discriminative clustering algorithm that automatically discovers MAEs at multiple levels of granularity. We develop structured models that capture a rich set of spatial, temporal and hierarchical relations among the segments, where the action label and multiple levels of MAE labels are jointly inferred. The proposed model achieves state-of-the-art performance in multiple action recognition benchmarks. Moreover, we demonstrate the effectiveness of our model in real-world applications such as action recognition in large-scale untrimmed videos and action parsing.

arXiv.org e-Print Archive

Crossref

Learning action primitives for multi-level video event understanding

Author: Chen Lei
Publication date: 21/12/2015

Human action categories exhibit significant intra-class variation. Changes in viewpoint, human appearance, and the temporal evolution of an action confound recognition algorithms. In order to address this, we present an approach to discover action primitives, sub-categories of action classes, that allow us to model this intra-class variation. We learn action primitives and their interrelations in a multi-level spatio-temporal model for action recognition. Action primitives are discovered via a data-driven clustering approach that focuses on repeatable, discriminative sub-categories. Higher-level interactions between action primitives and the actions of a set of people present in a scene are learned. Empirical results demonstrate that these action primitives can be effectively localized, and using them to model action classes improves action recognition performance on challenging datasets.

Simon Fraser University Institutional Repository

Learning Action Primitives for Multi-level Video Event Understanding

Author: C Gu
G Médioni
K Crammer
M Everingham
MR Amer
PF Felzenszwalb
W Choi
Publication venue: Springer Science and Business Media LLC

Crossref