
    An Improvement of Deep-Learner-Based Human Activity Recognition with the Aid of Graph Convolution Features

    Many researchers are now focusing on Human Action Recognition (HAR) based on deep-learning features related to body joints and their trajectories in videos. Among many schemes, Joints and Trajectory-pooled 3D-Deep Geometric Positional Attention-based Hierarchical Bidirectional Recurrent convolutional Descriptors (JTDGPAHBRD) can provide a video descriptor by learning geometric features and trajectories of the body joints. However, the spatial-temporal dynamics of the different geometric features of the skeleton structure have not been explored in depth. To solve this problem, this article adds a Graph Convolutional Network (GCN) to the JTDGPAHBRD to create a video descriptor for HAR. The GCN can obtain complementary information, such as higher-level spatial-temporal features, between consecutive frames, enhancing end-to-end learning. In addition, to improve feature representation ability, a search space with several adaptive graph components is created, and a sampling- and computation-efficient evolution scheme is applied to explore this space. Moreover, the resulting GCN provides the temporal dynamics of the skeleton pattern, which are fused with the geometric features of the skeleton body joints and the trajectory coordinates from the JTDGPAHBRD to create a more effective video descriptor for HAR. Finally, extensive experiments show that the JTDGPAHBRD-GCN model outperforms existing HAR models on the Penn Action Dataset (PAD).
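    The abstract does not give the exact GCN formulation, but the core operation it relies on, a spatial graph convolution over skeleton joints, can be sketched as below. This is a minimal illustrative PyTorch layer in the spirit of standard skeleton GCNs such as ST-GCN; the class name, the (N, C, T, V) tensor layout, and the fixed normalized adjacency matrix A are assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn

class SkeletonGCNLayer(nn.Module):
    """One spatial graph convolution over skeleton joints (illustrative sketch).

    x has shape (N, C, T, V): batch, channels, frames, joints.
    A is a fixed (V, V) normalized adjacency matrix over the joints.
    This is a generic skeleton-GCN layer, not the paper's exact layer.
    """

    def __init__(self, in_channels: int, out_channels: int, A: torch.Tensor):
        super().__init__()
        self.register_buffer("A", A)                      # fixed joint adjacency
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=1)
        self.relu = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.conv(x)                                  # per-joint 1x1 feature transform
        x = torch.einsum("nctv,vw->nctw", x, self.A)      # aggregate features from neighboring joints
        return self.relu(x)

V = 13                                    # Penn Action annotates 13 body joints
A = torch.eye(V)                          # self-loops only here; a real model adds bone edges
layer = SkeletonGCNLayer(3, 64, A)        # lift raw per-joint channels to 64 features
out = layer(torch.randn(8, 3, 50, V))     # (8, 3, 50, 13) -> (8, 64, 50, 13)
```

    Stacking such layers, interleaved with temporal convolutions over the frame axis, is the usual way to obtain the higher-level spatial-temporal features the abstract refers to.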

    Modeling Temporal Dynamics and Spatial Configurations of Actions Using Two-Stream Recurrent Neural Networks

    Recently, skeleton-based action recognition has gained popularity due to cost-effective depth sensors coupled with real-time skeleton estimation algorithms. Traditional approaches based on handcrafted features are limited in representing the complexity of motion patterns. Recent methods that use Recurrent Neural Networks (RNN) to handle raw skeletons focus only on the contextual dependency in the temporal domain and neglect the spatial configurations of articulated skeletons. In this paper, we propose a novel two-stream RNN architecture to model both temporal dynamics and spatial configurations for skeleton-based action recognition. We explore two different structures for the temporal stream: a stacked RNN and a hierarchical RNN, the latter designed according to human body kinematics. We also propose two effective methods to model the spatial structure by converting the spatial graph into a sequence of joints. To improve the generalization of our model, we further exploit 3D-transformation-based data augmentation techniques, including rotation and scaling transformations, to transform the 3D coordinates of skeletons during training; a sketch of this step follows below. Experiments on 3D action recognition benchmark datasets show that our method brings a considerable improvement for a variety of actions, namely generic actions, interaction activities, and gestures.
    Comment: Accepted to IEEE International Conference on Computer Vision and Pattern Recognition (CVPR) 2017
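    As a rough illustration of the rotation-and-scaling augmentation described above, the sketch below randomly rotates a 3D skeleton sequence about the vertical axis and rescales it. The function name, the choice of rotation axis, and the parameter ranges are illustrative assumptions, not the paper's exact settings.

```python
import numpy as np

def augment_skeleton(joints, max_angle=np.pi / 36, scale_range=(0.9, 1.1)):
    """Randomly rotate and scale a 3D skeleton sequence (illustrative sketch).

    joints: array of shape (T, V, 3) -- frames, joints, xyz coordinates.
    """
    theta = np.random.uniform(-max_angle, max_angle)
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, 0.0, s],
                  [0.0, 1.0, 0.0],
                  [-s, 0.0, c]])                 # rotation about the y (vertical) axis
    scale = np.random.uniform(*scale_range)      # uniform global rescaling
    return scale * joints @ R.T                  # apply rotation, then scale
```

    Applying a fresh random transform to every training sequence exposes the RNN to viewpoint and body-size variation it would not otherwise see.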

    Richly Activated Graph Convolutional Network for Action Recognition with Incomplete Skeletons

    Full text link
    Current methods for skeleton-based human action recognition usually work with completely observed skeletons. However, in real scenarios, captured skeletons are often incomplete and noisy, which deteriorates the performance of traditional models. To enhance the robustness of action recognition models to incomplete skeletons, we propose a multi-stream graph convolutional network (GCN) for exploring sufficient discriminative features distributed over all skeleton joints. Here, each stream of the network is responsible only for learning features from currently unactivated joints, which are identified by the class activation maps (CAM) obtained from the preceding streams; as a result, the proposed method activates noticeably more joints than traditional methods. The proposed method is therefore termed the richly activated GCN (RA-GCN), where the richly discovered features improve the robustness of the model. Compared to state-of-the-art methods, the RA-GCN achieves comparable performance on the NTU RGB+D dataset. Moreover, on a synthetic occlusion dataset, the RA-GCN significantly alleviates the performance deterioration.
    Comment: Accepted by ICIP 2019, 5 pages, 3 figures, 3 tables
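    To make the stream-masking idea concrete, here is a hedged sketch of how a per-joint mask might be derived from a preceding stream's class activation map so that the next stream focuses on currently unactivated joints. The thresholding rule, the keep_ratio parameter, and the tensor shapes are assumptions for illustration, not the authors' exact procedure.

```python
import torch

def joint_mask_from_cam(cam: torch.Tensor, keep_ratio: float = 0.7) -> torch.Tensor:
    """Turn a class activation map into a per-joint mask (illustrative sketch).

    cam: tensor of shape (N, T, V) -- activation strength per frame and joint,
    taken from a preceding stream. The most strongly activated joints are
    zeroed out so the next stream must learn from the remaining joints.
    """
    per_joint = cam.mean(dim=1)                           # (N, V): average activation over time
    k = int(per_joint.shape[1] * (1.0 - keep_ratio))      # number of joints to suppress
    mask = torch.ones_like(per_joint)
    if k > 0:
        topk = per_joint.topk(k, dim=1).indices           # indices of the most activated joints
        mask.scatter_(1, topk, 0.0)                       # suppress them for the next stream
    return mask                                           # broadcast into the next stream's input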