154 research outputs found

    RGB-D-based Action Recognition Datasets: A Survey

    Human action recognition from RGB-D (Red, Green, Blue and Depth) data has attracted increasing attention since the first work was reported in 2010. Over this period, many benchmark datasets have been created to facilitate the development and evaluation of new algorithms. This raises the question of which dataset to select and how to use it to provide a fair and objective comparative evaluation against state-of-the-art methods. To address this issue, this paper provides a comprehensive review of the most commonly used action recognition related RGB-D video datasets, including 27 single-view datasets, 10 multi-view datasets, and 7 multi-person datasets. The detailed information and analysis of these datasets are a useful resource for guiding an informed selection of datasets for future research. In addition, the issues with current algorithm evaluation vis-à-vis the limitations of the available datasets and evaluation protocols are highlighted, resulting in a number of recommendations for the collection of new datasets and the use of evaluation protocols.

    A preliminary study of micro-gestures: dataset collection and analysis with multi-modal dynamic networks

    Micro-gestures (MG) are gestures that people perform spontaneously during communication. This thesis makes a preliminary exploration of micro-gestures. An MG dataset is built with Kinect V2 by recording sequences of body gestures performed spontaneously during games. The novel term ‘micro-gesture’ is proposed by analyzing the properties of the MG dataset. Two neural network architectures are implemented for the micro-gesture segmentation and recognition task: the DBN-HMM model for skeleton data and the 3DCNN-HMM model for RGB-D data. We also explore a method for extracting the neutral states used in the HMM structure by detecting the activity level of the gesture sequences. The method is simple to derive and implement, and it proves to be effective. The DBN-HMM and 3DCNN-HMM architectures are evaluated on the MG dataset and optimized for the properties of micro-gestures. Experimental results show that these two models achieve micro-gesture segmentation and recognition with satisfactory accuracy. The work on micro-gestures in this thesis also opens a new research path for gesture recognition. We therefore believe that our work could serve as a baseline for future research on micro-gestures.
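
    The abstract above does not say how the activity level is computed, so the following is only a minimal sketch under stated assumptions: activity is taken to be the summed displacement of skeleton joints between consecutive frames, and a frame counts as neutral when that value falls below a hand-picked threshold. The names `activity_levels`, `neutral_frames`, and the threshold value are hypothetical and are not taken from the thesis.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def activity_levels(frames):
    """Return one activity value per frame.

    Assumption (not from the thesis): activity is the sum of per-joint
    displacements relative to the previous frame. The first frame has no
    predecessor, so it reuses the value of the first transition.
    """
    transitions = [
        sum(dist(p, c) for p, c in zip(prev, cur))
        for prev, cur in zip(frames, frames[1:])
    ]
    if not transitions:
        return [0.0] * len(frames)
    return [transitions[0]] + transitions


def neutral_frames(frames, threshold):
    """Indices of frames whose activity is below the (hypothetical) threshold."""
    return [i for i, a in enumerate(activity_levels(frames)) if a < threshold]


# Toy sequence: two joints in 3D, still for three frames, moving for two,
# then still again. Real Kinect V2 skeletons have 25 joints per frame.
frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.5, 0.0), (1.0, 0.5, 0.0)],
    [(0.0, 1.0, 0.0), (1.0, 1.0, 0.0)],
    [(0.0, 1.0, 0.0), (1.0, 1.0, 0.0)],
    [(0.0, 1.0, 0.0), (1.0, 1.0, 0.0)],
]
neutral = neutral_frames(frames, threshold=0.1)
print(neutral)
```

    In this sketch the low-activity frames are the candidates for the neutral state, and the remaining frames are where a gesture would be segmented. A practical version would likely smooth the activity signal over a short window before thresholding.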

    Articulated motion and deformable objects

    This guest editorial introduces the twenty-two papers accepted for this Special Issue on Articulated Motion and Deformable Objects (AMDO). They are grouped into four main categories within the field of AMDO: human motion analysis (action/gesture), human pose estimation, deformable shape segmentation, and face analysis. For each of the four topics, a survey of the recent developments in the field is presented. The accepted papers are briefly introduced in the context of this survey. They contribute novel methods, algorithms with improved performance as measured on benchmarking datasets, as well as two new datasets for hand action detection and human posture analysis. The special issue should be of high relevance to readers interested in AMDO recognition and should promote future research directions in the field.

    Spatiotemporal analysis of human actions using RGB-D cameras

    Markerless human motion analysis has strong potential to provide a cost-efficient solution for action recognition and body pose estimation. Many applications, including human-computer interaction, video surveillance, content-based video indexing, and automatic annotation, among others, will benefit from a robust solution to these problems. In recent years, depth sensing technologies have improved the outlook for automated vision-based human action recognition, a problem deemed very difficult because of the ambiguities inherent in conventional video. In this work, a large set of invariant spatiotemporal features is first extracted from skeleton joints in motion (retrieved from a depth sensor) and evaluated as baseline performance. Next, we introduce a discriminative Random Decision Forest-based feature selection framework that reaches strong action recognition performance when combined with a linear SVM classifier. This approach improves upon the baseline performance obtained using the whole feature set while using significantly fewer features (one tenth of the original). The approach can also be used to provide insights into the spatiotemporal dynamics of human actions. A novel therapeutic action recognition dataset (WorkoutSU-10) is presented. We used this dataset as a benchmark in our tests to evaluate the reliability of our proposed methods. The dataset has recently been released publicly as a contribution to the action recognition community. In addition, an interactive action evaluation application is developed using the proposed methods to help with real-life problems such as fall detection in elderly people or automated therapy programs for patients with motor disabilities.