
    Human Action Recognition Based on Temporal Pyramid of Key Poses Using RGB-D Sensors

    Human action recognition is a hot research topic in computer vision, mainly due to the high number of related applications, such as surveillance, human-computer interaction, or assisted living. Low-cost RGB-D sensors have been extensively used in this field. They can provide skeleton joints, which represent a compact and effective representation of the human posture. This work proposes an algorithm for human action recognition where the features are computed from skeleton joints. A sequence of skeleton features is represented as a set of key poses, from which histograms are extracted. The temporal structure of the sequence is kept using a temporal pyramid of key poses. Finally, a multi-class SVM performs the classification task. Optimizing the algorithm through evolutionary computation allows it to reach results comparable to the state of the art on the MSR Action3D dataset. This work was supported by an STSM Grant from COST Action IC1303 AAPELE - Architectures, Algorithms and Platforms for Enhanced Living Environments.
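The pipeline described above (key poses from clustered skeleton features, per-segment histograms over a temporal pyramid, then a multi-class SVM) can be sketched as follows. This is a minimal illustration on random toy data, not the paper's exact implementation; the number of key poses, pyramid depth, and feature dimensions are all assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Toy skeleton-feature sequences: 20 sequences of 30 frames x 60-dim features
sequences = [rng.normal(size=(30, 60)) for _ in range(20)]
labels = rng.integers(0, 3, size=20)

# 1) Learn a dictionary of key poses by clustering all frames (k is assumed)
all_frames = np.vstack(sequences)
k = 8
kmeans = KMeans(n_clusters=k, n_init=10, random_state=0).fit(all_frames)

def pyramid_histogram(seq, levels=2):
    """Key-pose histograms at each temporal pyramid level, concatenated."""
    feats = []
    for level in range(levels):
        for segment in np.array_split(seq, 2 ** level):
            assign = kmeans.predict(segment)
            hist = np.bincount(assign, minlength=k).astype(float)
            feats.append(hist / max(hist.sum(), 1.0))
    return np.concatenate(feats)

# 2) Represent each sequence by its pyramid of key-pose histograms
X = np.array([pyramid_histogram(s) for s in sequences])

# 3) Multi-class SVM on the pyramid features
clf = SVC(kernel="linear").fit(X, labels)
print(X.shape)  # (20, 24): k bins x (1 + 2) pyramid segments
```

In the paper the free parameters (number of key poses, pyramid levels, SVM settings) are tuned by evolutionary computation rather than fixed by hand as here.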

    Discriminative joint non-negative matrix factorization for human action classification

    This paper describes a supervised classification approach based on non-negative matrix factorization (NMF). Our classification framework builds on recent extensions of non-negative matrix factorization to multiview learning, where the primary dataset benefits from auxiliary information for obtaining shared and meaningful spaces. For discrimination, we use data categories in a supervised manner as an auxiliary source of information in order to learn co-occurrences through a common set of basis vectors. We demonstrate the effectiveness of our algorithm in integrating various image modalities to enhance overall classification accuracy on different benchmark datasets. Our evaluation considers two challenging image datasets of human action recognition. We show that our algorithm achieves superior results over the state of the art in terms of efficiency and overall classification accuracy.
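One way to picture the core idea (labels as an auxiliary view factored jointly with the features, so both share per-sample coefficients) is the sketch below. It is a minimal illustration of joint NMF with toy data, not the paper's algorithm; the view weight `alpha`, rank, and the multiplicative-update projection for new samples are all assumptions.

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(1)

# Toy non-negative data: 60 samples, 40 features, 3 classes
X = rng.random((60, 40))
y = rng.integers(0, 3, size=60)
Y = np.eye(3)[y]                      # one-hot label "view"

# Joint factorization: stack the feature view and a weighted label view
# so both are explained by shared per-sample coefficients.
alpha = 0.5
Z = np.hstack([X, alpha * Y])         # samples x (features + classes)
model = NMF(n_components=10, init="nndsvda", max_iter=500, random_state=0)
W = model.fit_transform(Z)            # shared coefficients, one row per sample
B = model.components_                 # basis: [feature part | label part]
B_feat, B_lab = B[:, :40], B[:, 40:]

def predict(x_new):
    """Project a new sample onto the feature basis, read off the label part."""
    # Non-negative least squares via standard multiplicative updates
    h = np.full(model.n_components, 0.1)
    BfBfT = B_feat @ B_feat.T
    for _ in range(200):
        h *= (B_feat @ x_new) / (BfBfT @ h + 1e-9)
    return int(np.argmax(h @ B_lab))

pred = predict(X[0])  # class index in {0, 1, 2}
```

Because the label columns participate in the factorization, the learned basis vectors encode feature-class co-occurrences, which is what makes the factorization discriminative.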

    Learning spatial interest regions from videos to inform action recognition in still images

    Common approaches to human action recognition from images rely on local descriptors for classification. Typically, these descriptors are computed in the vicinity of key points which either result from running a key point detector or from dense or random sampling of pixel coordinates. Such key points are not a priori related to human activities and are thus of limited information with regard to action recognition. In this paper, we propose to identify action-specific key points in images using information available from videos. Our approach does not require manual segmentation or templates but applies non-negative matrix factorization to optical flow fields extracted from videos. The resulting basis flows are found to be indicative of action-specific image regions and therefore allow for an informed sampling of key points. We also present a generative model that allows for characterizing joint distributions of regions of interest and human actions. In practical experiments, we determine correspondences between regions of interest that were automatically learned from videos and manually annotated locations of human body parts available from independent benchmark image datasets. We observe high correlations between learned interest regions and the body parts most relevant for different actions.
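The central step (factorizing optical flow fields with NMF so that the basis flows highlight action-specific spatial regions) can be sketched as below. This is a toy illustration on synthetic flow magnitudes, not the paper's pipeline; the grid size, rank, and the percentile threshold for interest regions are assumptions.

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(2)

# Toy optical-flow magnitudes: 50 frames on a 16x16 grid, flattened per frame
H, W = 16, 16
flows = rng.random((50, H * W))
# Inject a consistently active region (e.g., a moving arm) in the upper-left
flows[:, :40] += 2.0

# Factorize the flow fields: rows of components_ are spatial "basis flows"
nmf = NMF(n_components=5, init="nndsvda", max_iter=400, random_state=0)
coeffs = nmf.fit_transform(flows)          # per-frame activations
basis_flows = nmf.components_              # 5 flattened spatial basis flows

# Interest regions: pixels where some basis flow concentrates its energy,
# which can then guide key point sampling in still images
energy = basis_flows.max(axis=0).reshape(H, W)
interest = energy > np.percentile(energy, 80)
```

The resulting `interest` mask plays the role of the learned spatial interest regions: key points for still-image descriptors would be sampled preferentially inside it rather than uniformly over the image.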

    Exploiting Context Information for Image Description

    Integrating ontological knowledge is a promising research direction for improving automatic image description. In particular, when probabilistic ontologies are available, the corresponding probabilities can be combined with the probabilities produced by a multi-class classifier applied to different parts of an image. This combination not only provides the relations existing between the different segments, but can also improve the classification accuracy. In fact, the context often gives cues suggesting the correct class of a segment. This paper discusses a possible implementation of this integration, and the first experimental results show its effectiveness when the classifier accuracy is relatively low. To assess the performance, we constructed a simulated classifier whose accuracy can be set a priori with sufficient precision.
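The combination described above can be illustrated with a minimal sketch: fuse the classifier's posterior over a segment's classes with a context prior derived from a probabilistic ontology. The class set, the fusion rule (a normalized product), and all probability values here are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

# Classifier posterior over classes for one image segment (assumed values)
p_cls = np.array([0.40, 0.35, 0.25])   # e.g. {person, horse, cow}

# Context prior from a probabilistic ontology, given that a neighboring
# segment was already labeled with a related concept (illustrative numbers)
p_ctx = np.array([0.30, 0.65, 0.05])

# Fuse by normalized product: context re-ranks the classifier's output
fused = p_cls * p_ctx
fused /= fused.sum()

print(np.argmax(p_cls), np.argmax(fused))  # 0 1 — context flips the decision
```

This shows the mechanism in the abstract: when the classifier alone is uncertain (here, 0.40 vs. 0.35), even a moderately informative context prior can change the predicted class of the segment.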