
32 research outputs found

A decision forest based feature selection framework for action recognition from RGB-Depth cameras

Author: Akgül Ceyhun Burak
Erçil Aytül
Negin Farhood
Özdemir Fırat
Yüksel Kamer Ali
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/04/2013

In this paper, we present an action recognition framework leveraging the data mining capabilities of random decision forests trained on kinematic features. We describe human motion via a rich collection of kinematic feature time-series computed from the skeletal representation of the body in motion. We discriminatively optimize a random decision forest model over this collection to identify the most effective subset of features, localized both in time and space. We then train a support vector machine classifier on the selected features. This approach improves upon the baseline performance obtained using the whole feature set while using significantly fewer features (one tenth of the original). On the MSRC-12 dataset (12 classes), our method achieves 94% accuracy. On the WorkoutSU-10 dataset, collected by our group (10 physical exercise classes), the accuracy is 98%. The approach can also be used to provide insights into the spatiotemporal dynamics of human actions.

Crossref

Sabanci University Research Database

Linear-time Online Action Detection From 3D Skeletal Data Using Bags of Gesturelets

Author: Hussein Mohamed E.
Meshry Moustafa
Torki Marwan
Publication date: 28/12/2015

A sliding window is one direct way to extend a successful recognition system to handle the more challenging detection problem. While action recognition decides only whether or not an action is present in a pre-segmented video sequence, action detection identifies the time interval where the action occurred in an unsegmented video stream. Sliding window approaches for action detection can, however, be slow, as they maximize a classifier score over all possible sub-intervals. Even though newer schemes use dynamic programming to speed up the search for the optimal sub-interval, they require offline processing of the whole video sequence. In this paper, we propose a novel approach for online action detection based on 3D skeleton sequences extracted from depth data. It identifies the sub-interval with the maximum classifier score in linear time. Furthermore, it is invariant to temporal scale variations and is suitable for real-time applications with low latency.
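The linear-time search for a maximum-score sub-interval can be illustrated with a standard maximum-subarray scan (Kadane's algorithm). This is only a sketch under an assumption the abstract does not confirm: that the score of an interval is the sum of per-frame classifier scores. The function name `max_scoring_interval` and the example scores are hypothetical.

```python
from typing import Iterable, Tuple


def max_scoring_interval(frame_scores: Iterable[float]) -> Tuple[float, int, int]:
    """Return (best_sum, start, end) for the contiguous run of frames
    with the largest total score. Indices are inclusive.

    Frames are consumed one at a time, so the scan works on a stream:
    O(n) time and O(1) extra memory. Assumes interval score = sum of
    per-frame scores (an assumption, not taken from the abstract).
    """
    best_sum = float("-inf")
    best_start = best_end = -1
    run_sum = 0.0
    run_start = 0
    for t, score in enumerate(frame_scores):
        if run_sum <= 0.0:
            # A non-positive running total cannot improve any later
            # interval, so start a fresh candidate at this frame.
            run_sum = score
            run_start = t
        else:
            run_sum += score
        if run_sum > best_sum:
            best_sum, best_start, best_end = run_sum, run_start, t
    return best_sum, best_start, best_end


# Hypothetical per-frame scores: positive where the action is likely present.
scores = [-1.0, 2.0, 3.0, -4.0, 5.0, -2.0]
print(max_scoring_interval(scores))  # (6.0, 1, 4)
```

Each frame is visited once, in contrast to a naive sliding window that evaluates every (start, end) pair.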

arXiv.org e-Print Archive

Crossref

A discussion on the validation tests employed to compare human action recognition methods using the MSR Action3D dataset

Author: Chaaraoui Alexandros André
Flórez-Revuelta Francisco
Padilla-López José Ramón
Publication date: 29/07/2014

This paper aims to determine which is the best human action recognition method based on features extracted from RGB-D devices, such as the Microsoft Kinect. We reviewed all the papers that reference MSR Action3D, the most widely used dataset that includes depth information acquired from an RGB-D device. We found that the validation method used by each work differs from the others, so a direct comparison among works cannot be made. However, almost all the works compare their results without taking this issue into account. Therefore, we present different rankings according to the methodology used for the validation in order to clarify the existing confusion. Comment: 16 pages and 7 tables

arXiv.org e-Print Archive

Repositorio Institucional de la Universidad de Alicante

Simultaneous Feature and Body-Part Learning for Real-Time Robot Awareness of Human Behaviors

Author: Han Fei
Reardon Christopher
Yang Xue
Zhang Hao
Zhang Yu
Publication date: 24/02/2017

Robot awareness of human actions is an essential research problem in robotics with many important real-world applications, including human-robot collaboration and teaming. Over the past few years, depth sensors have become a standard device widely used by intelligent robots for 3D perception, and they can also provide human skeletal data in 3D space. Several methods based on skeletal data were designed to enable robot awareness of human actions with satisfactory accuracy. However, previous methods treated all body parts and features as equally important, without the capability to identify discriminative body parts and features. In this paper, we propose a novel simultaneous Feature And Body-part Learning (FABL) approach that simultaneously identifies discriminative body parts and features, and efficiently integrates all available information to enable real-time robot awareness of human behaviors. We formulate FABL as a regression-like optimization problem with structured sparsity-inducing norms to model interrelationships of body parts and features. We also develop an optimization algorithm to solve the formulated problem, which possesses a theoretical guarantee to find the optimal solution. To evaluate FABL, three experiments were performed using public benchmark datasets, including the MSR Action3D and CAD-60 datasets, as well as a Baxter robot in practical assistive living applications. Experimental results show that our FABL approach obtains a high recognition accuracy with a processing speed of the order-of-magnitude of 10e4 Hz, which makes FABL a promising method to enable real-time robot awareness of human behaviors in practical robotics applications. Comment: 8 pages, 6 figures, accepted by ICRA'1

arXiv.org e-Print Archive

Crossref

Two-Stream RNN/CNN for Action Recognition in 3D Videos

Author: Ali Haider
van der Smagt Patrick
Zhao Rui
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 02/10/2018

The recognition of actions from video sequences has many applications in health monitoring, assisted living, surveillance, and smart homes. Despite advances in sensing, in particular related to 3D video, the methodologies to process the data are still subject to research. We demonstrate superior results with a system that combines recurrent neural networks with convolutional neural networks in a voting approach. The gated-recurrent-unit-based neural networks are particularly well-suited to distinguishing actions based on long-term information from optical tracking data; the 3D-CNNs focus more on detailed, recent information from video data. The resulting features are merged in an SVM, which then classifies the movement. In this architecture, our method improves recognition rates of state-of-the-art methods by 14% on standard data sets. Comment: Published in 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

arXiv.org e-Print Archive

Crossref

Disturbance Grassmann Kernels for Subspace-Based Learning

Author: Chen Huanhuan
Chen Ning
Maaten Laurens
Wager Stefan
Wang Boyue
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 17/06/2018

In this paper, we focus on subspace-based learning problems, where data elements are linear subspaces instead of vectors. To handle this kind of data, Grassmann kernels were proposed to measure the space structure and used with classifiers, e.g., Support Vector Machines (SVMs). However, existing discriminative algorithms mostly ignore the instability of subspaces, which can cause classifiers to be misled by disturbed instances. We therefore propose considering all potential disturbances of subspaces in the learning process to obtain more robust classifiers. First, we derive the dual optimization of linear classifiers with disturbance subject to a known distribution, resulting in a new kernel, the Disturbance Grassmann (DG) kernel. Second, we investigate two kinds of disturbance, relevant to the subspace matrix and to the singular values of bases, with which we extend the Projection kernel on Grassmann manifolds to two new kernels. Experiments on action data indicate that the proposed kernels perform better than state-of-the-art subspace-based methods, even in a worse environment. Comment: This paper includes 3 figures, 10 pages, and has been accepted to SIGKDD'1
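For orientation, the baseline Projection kernel that the abstract says is being extended can be sketched as follows. This shows only the standard kernel k(X, Y) = ||XᵀY||²_F between two subspaces given by orthonormal bases, not the Disturbance Grassmann kernels themselves, whose form the abstract does not give. The helper names `orthonormalize` and `projection_kernel` are hypothetical, and subspaces are represented as lists of basis vectors.

```python
from math import sqrt
from typing import List

Vector = List[float]


def dot(u: Vector, v: Vector) -> float:
    return sum(a * b for a, b in zip(u, v))


def orthonormalize(vectors: List[Vector]) -> List[Vector]:
    """Modified Gram-Schmidt: orthonormal basis for the span of `vectors`.
    Vectors that are (numerically) dependent on earlier ones are dropped."""
    basis: List[Vector] = []
    for v in vectors:
        w = list(v)
        for b in basis:
            coeff = dot(w, b)
            w = [wi - coeff * bi for wi, bi in zip(w, b)]
        norm = sqrt(dot(w, w))
        if norm > 1e-12:
            basis.append([wi / norm for wi in w])
    return basis


def projection_kernel(X: List[Vector], Y: List[Vector]) -> float:
    """Squared Frobenius norm of X^T Y, where X and Y are orthonormal
    bases. Equals the sum of squared inner products between all pairs
    of basis vectors and does not depend on which basis is chosen."""
    return sum(dot(x, y) ** 2 for x in X for y in Y)


# Two planes in R^3 that share exactly one direction (the first axis).
plane_a = orthonormalize([[1.0, 1.0, 0.0], [1.0, -1.0, 0.0]])  # span{e1, e2}
plane_b = orthonormalize([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])   # span{e1, e3}
print(projection_kernel(plane_a, plane_a))  # close to 2.0 (identical subspaces)
print(projection_kernel(plane_a, plane_b))  # close to 1.0 (one shared direction)
```

For p-dimensional subspaces the value ranges from 0 (orthogonal subspaces) to p (identical subspaces), which is why it can serve as a similarity measure for an SVM.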

arXiv.org e-Print Archive

Crossref