1,376 research outputs found
Richly Activated Graph Convolutional Network for Action Recognition with Incomplete Skeletons
Current methods for skeleton-based human action recognition usually work with
completely observed skeletons. However, in real scenarios, it is prone to
capture incomplete and noisy skeletons, which will deteriorate the performance
of traditional models. To enhance the robustness of action recognition models
to incomplete skeletons, we propose a multi-stream graph convolutional network
(GCN) for exploring sufficient discriminative features distributed over all
skeleton joints. Here, each stream of the network is only responsible for
learning features from currently unactivated joints, which are distinguished by
the class activation maps (CAM) obtained by preceding streams, so that the
activated joints of the proposed method are obviously more than traditional
methods. Thus, the proposed method is termed richly activated GCN (RA-GCN),
where the richly discovered features will improve the robustness of the model.
Compared to the state-of-the-art methods, the RA-GCN achieves comparable
performance on the NTU RGB+D dataset. Moreover, on a synthetic occlusion
dataset, the performance deterioration can be alleviated by the RA-GCN
significantly.Comment: Accepted by ICIP 2019, 5 pages, 3 figures, 3 table
Modeling Temporal Dynamics and Spatial Configurations of Actions Using Two-Stream Recurrent Neural Networks
Recently, skeleton based action recognition gains more popularity due to
cost-effective depth sensors coupled with real-time skeleton estimation
algorithms. Traditional approaches based on handcrafted features are limited to
represent the complexity of motion patterns. Recent methods that use Recurrent
Neural Networks (RNN) to handle raw skeletons only focus on the contextual
dependency in the temporal domain and neglect the spatial configurations of
articulated skeletons. In this paper, we propose a novel two-stream RNN
architecture to model both temporal dynamics and spatial configurations for
skeleton based action recognition. We explore two different structures for the
temporal stream: stacked RNN and hierarchical RNN. Hierarchical RNN is designed
according to human body kinematics. We also propose two effective methods to
model the spatial structure by converting the spatial graph into a sequence of
joints. To improve generalization of our model, we further exploit 3D
transformation based data augmentation techniques including rotation and
scaling transformation to transform the 3D coordinates of skeletons during
training. Experiments on 3D action recognition benchmark datasets show that our
method brings a considerable improvement for a variety of actions, i.e.,
generic actions, interaction activities and gestures.Comment: Accepted to IEEE International Conference on Computer Vision and
Pattern Recognition (CVPR) 201
- …