42 research outputs found

Co-occurrence Feature Learning for Skeleton based Action Recognition using Regularized Deep LSTM Networks

Authors: Lan Cuiling; Li Yanghao; Shen Li; Xie Xiaohui; Xing Junliang; Zeng Wenjun; Zhu Wentao
Publication date: 05/03/2016

Skeleton based action recognition distinguishes human actions using the trajectories of skeleton joints, which provide a very good representation for describing actions. Considering that recurrent neural networks (RNNs) with Long Short-Term Memory (LSTM) can learn feature representations and model long-term temporal dependencies automatically, we propose an end-to-end fully connected deep LSTM network for skeleton based action recognition. Inspired by the observation that the co-occurrences of the joints intrinsically characterize human actions, we take the skeleton as the input at each time slot and introduce a novel regularization scheme to learn the co-occurrence features of skeleton joints. To train the deep LSTM network effectively, we propose a new dropout algorithm which simultaneously operates on the gates, cells, and output responses of the LSTM neurons. Experimental results on three human action recognition datasets consistently demonstrate the effectiveness of the proposed model. Comment: AAAI 2016 conference

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Trespassing the Boundaries: Labeling Temporal Bounds for Object Interactions in Egocentric Video

Authors: Damen Dima; Mayol-Cuevas Walterio; Moltisanti Davide; Wray Michael
Publication date: 26/07/2017

Manual annotations of temporal bounds for object interactions (i.e., start and end times) are typical training input to recognition, localization and detection algorithms. For three publicly available egocentric datasets, we uncover inconsistencies in ground truth temporal bounds within and across annotators and datasets. We systematically assess the robustness of state-of-the-art approaches to changes in labeled temporal bounds, for object interaction recognition. As boundaries are trespassed, a drop of up to 10% is observed for both Improved Dense Trajectories and Two-Stream Convolutional Neural Network. We demonstrate that such disagreement stems from a limited understanding of the distinct phases of an action, and propose annotating based on the Rubicon Boundaries, inspired by a similarly named cognitive model, for consistent temporal bounds of object interactions. Evaluated on a public dataset, we report a 4% increase in overall accuracy, and an increase in accuracy for 55% of classes when Rubicon Boundaries are used for temporal annotations. Comment: ICCV 2017

arXiv.org e-Print Archive

Crossref

Explore Bristol Research

Skeleton based Human Action Recognition using a Structured-Tree Neural Network

Authors: Bahoo Nisar; Karim Misha; Khalid Muhammad Junaid; Khan Muhammad Sajid; Ware Andrew
Publication venue: European Open Access Publishing (Europa Publishing)
Publication date: 13/08/2020

University of South Wales Research Explorer

A discussion on the validation tests employed to compare human action recognition methods using the MSR Action3D dataset

Authors: Chaaraoui Alexandros André; Flórez-Revuelta Francisco; Padilla-López José Ramón
Publication date: 29/07/2014

This paper aims to determine which is the best human action recognition method based on features extracted from RGB-D devices, such as the Microsoft Kinect. A review of all the papers that make reference to MSR Action3D, the most used dataset that includes depth information acquired from an RGB-D device, has been performed. We found that the validation method used by each work differs from the others, so a direct comparison among works cannot be made. However, almost all the works present their results comparing them without taking this issue into account. Therefore, we present different rankings according to the methodology used for the validation in order to clarify the existing confusion. Comment: 16 pages and 7 tables
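
To illustrate why results obtained under different validation methods are not directly comparable, here is a minimal Python sketch of two common subject-based protocols. The abstract does not say which protocols the reviewed works use, so the two functions, the odd/even rule, and the ten subject IDs are assumptions made for illustration only.

```python
def cross_subject_split(subjects):
    """Single fixed split: odd-numbered subjects train, even-numbered subjects test."""
    train = [s for s in subjects if s % 2 == 1]
    test = [s for s in subjects if s % 2 == 0]
    return [(train, test)]


def leave_one_subject_out(subjects):
    """One fold per subject: that subject is the test set, all others train."""
    return [
        ([s for s in subjects if s != held_out], [held_out])
        for held_out in subjects
    ]


# Assumption: ten subject IDs numbered 1..10 (hypothetical, for illustration).
subjects = list(range(1, 11))
fixed = cross_subject_split(subjects)
loso = leave_one_subject_out(subjects)

print(len(fixed), "fold with", len(fixed[0][0]), "training subjects")
print(len(loso), "folds with", len(loso[0][0]), "training subjects each")
```

The two protocols train on different amounts of data and average over a different number of folds, so an accuracy reported under one is not interchangeable with an accuracy reported under the other.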

arXiv.org e-Print Archive

Repositorio Institucional de la Universidad de Alicante

Multimodal Multipart Learning for Action Recognition in Depth Videos

Authors: Ng Tian-Tsong; Shahroudy Amir; Wang Gang; Yang Qingxiong
Publication date: 31/07/2015

The articulated and complex nature of human actions makes the task of action recognition difficult. One approach to handling this complexity is to divide it into the kinetics of body parts and analyze the actions based on these partial descriptors. We propose a joint sparse regression based learning method which utilizes structured sparsity to model each action as a combination of multimodal features from a sparse set of body parts. To represent the dynamics and appearance of parts, we employ a heterogeneous set of depth and skeleton based features. The proper structure of multimodal multipart features is formulated into the learning framework via the proposed hierarchical mixed norm, to regularize the structured features of each part and to apply sparsity between them, in favor of a group feature selection. Our experimental results demonstrate the effectiveness of the proposed learning method: it outperforms other methods on all three tested datasets, saturating one of them by achieving perfect accuracy.
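
The idea of applying sparsity between body parts can be sketched in Python with a plain group (L2,1-style) norm. This is not the paper's hierarchical mixed norm, whose exact form the abstract does not give. The part names, weights, and threshold below are hypothetical values chosen for illustration.

```python
import math


def group_norm(weights_by_part):
    """Sum over parts of the L2 norm of each part's weight vector."""
    return sum(math.sqrt(sum(w * w for w in part)) for part in weights_by_part)


def group_soft_threshold(part, lam):
    """Proximal step for the group norm: shrink a whole part, or zero it out."""
    norm = math.sqrt(sum(w * w for w in part))
    if norm <= lam:
        return [0.0] * len(part)  # the entire part is dropped
    scale = 1.0 - lam / norm
    return [scale * w for w in part]


# Hypothetical per-part feature weights (not taken from the paper).
weights = {
    "left_arm": [3.0, 4.0],
    "right_leg": [0.1, 0.2],
    "torso": [0.0, 0.0],
}

penalty = group_norm(weights.values())
shrunk = {name: group_soft_threshold(w, 0.5) for name, w in weights.items()}

print("penalty:", penalty)
for name, w in shrunk.items():
    print(name, w)
```

Because the penalty acts on whole groups, a weakly weighted part such as `right_leg` is removed entirely while a strongly weighted part is only shrunk. This is the sense in which a group norm favors selecting a sparse set of parts.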

arXiv.org e-Print Archive

DR-NTU (Digital Repository of NTU)

Comparison of Activity Recognition Using 2D and 3D Skeletal Joint Data

Authors: Marshall Fiona; Scotney Bryan; Zhang Shuai
Publication venue: Technological University Dublin
Publication date: 01/01/2019

Arrow@TUDublin