Two-person Graph Convolutional Network for Skeleton-based Human Interaction Recognition
Graph convolutional networks (GCNs) have been the predominant methods in
skeleton-based human action recognition, including human-human interaction
recognition. However, when dealing with interaction sequences, current
GCN-based methods simply split the two-person skeleton into two discrete graphs
and perform graph convolution separately as done for single-person action
classification. Such operations ignore rich interactive information and hinder
effective spatial inter-body relationship modeling. To overcome the above
shortcoming, we introduce a novel unified two-person graph to represent
inter-body and intra-body correlations between joints. Experiments show
accuracy improvements in recognizing both interactions and individual actions
when utilizing the proposed two-person graph topology. In addition, we design
several graph labeling strategies to supervise the model to learn discriminant
spatial-temporal interactive features. Finally, we propose a two-person graph
convolutional network (2P-GCN). Our model achieves state-of-the-art results on
four benchmarks of three interaction datasets: SBU, interaction subsets of
NTU-RGB+D and NTU-RGB+D 120.
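To make the idea of a unified two-person graph concrete, the sketch below builds one adjacency matrix covering both skeletons, with intra-body edges in the two diagonal blocks and inter-body edges in the off-diagonal blocks. It is only an illustration under stated assumptions: the function name `two_person_adjacency`, the three-joint toy skeleton, and the hand-to-hand link are hypothetical and are not taken from the paper, whose actual topology and labeling strategies are not given in the abstract.

```python
def two_person_adjacency(num_joints, intra_edges, inter_edges):
    """Build a symmetric (2J x 2J) adjacency matrix for two skeletons.

    intra_edges: (i, j) joint pairs within one body, applied to both people.
    inter_edges: (i, j) pairs linking joint i of person A to joint j of person B.
    Hypothetical helper for illustration; not the authors' implementation.
    """
    n = 2 * num_joints
    adj = [[0] * n for _ in range(n)]
    for k in range(n):
        adj[k][k] = 1  # self-loops, a common choice in GCN formulations
    for i, j in intra_edges:
        for offset in (0, num_joints):  # person A block, then person B block
            adj[i + offset][j + offset] = 1
            adj[j + offset][i + offset] = 1
    for i, j in inter_edges:
        a, b = i, j + num_joints  # person B's joints are shifted by num_joints
        adj[a][b] = 1
        adj[b][a] = 1
    return adj


# Toy skeleton (assumed): 3 joints per person, 0 = head, 1 = torso, 2 = hand.
INTRA = [(0, 1), (1, 2)]
INTER = [(2, 2)]  # assumed hand-to-hand interaction edge
ADJ = two_person_adjacency(3, INTRA, INTER)
```

Splitting the pair into two separate graphs corresponds to leaving `inter_edges` empty, in which case the off-diagonal blocks are all zero and no information flows between the two bodies during graph convolution.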
Multi-Dimensional Refinement Graph Convolutional Network with Robust Decouple Loss for Fine-Grained Skeleton-Based Action Recognition
Graph convolutional networks have been widely used in skeleton-based action
recognition. However, existing approaches are limited in fine-grained action
recognition due to the similarity of inter-class data. Moreover, the noisy data
from pose extraction increases the challenge of fine-grained recognition. In
this work, we propose a flexible attention block called Channel-Variable
Spatial-Temporal Attention (CVSTA) to enhance the discriminative power of
spatial-temporal joints and obtain a more compact intra-class feature
distribution. Based on CVSTA, we construct a Multi-Dimensional Refinement Graph
Convolutional Network (MDR-GCN), which can improve the discrimination among
channel-, joint- and frame-level features for fine-grained actions.
Furthermore, we propose a Robust Decouple Loss (RDL), which significantly
boosts the effect of the CVSTA and reduces the impact of noise. The proposed
method combining MDR-GCN with RDL outperforms the known state-of-the-art
skeleton-based approaches on fine-grained datasets, FineGym99 and FSD-10, and
also on the coarse dataset NTU-RGB+D X-view version.
Attention module-based spatial-temporal graph convolutional networks for skeleton-based action recognition
Skeleton-based action recognition is a significant direction of human action recognition, because the skeleton contains important information for recognizing actions. Spatial-temporal graph convolutional networks (ST-GCN) automatically learn both temporal and spatial features from skeleton data and achieve remarkable performance in skeleton-based action recognition. However, ST-GCN learns only local information within a given neighborhood and does not capture the correlations between all joints (i.e., global information). Therefore, global information needs to be introduced into ST-GCN. We propose a model of dynamic skeletons called attention module-based-ST-GCN, which addresses this problem by adding an attention module. The attention module can capture some global information, which brings stronger expressive power and generalization capability. Experimental results on two large-scale datasets, Kinetics and NTU-RGB+D, demonstrate that our model achieves significant improvements over previous representative methods. © 2019 SPIE and IS&T
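One way to picture how an attention module can relate every joint to every other joint is plain scaled dot-product self-attention over per-joint feature vectors, sketched below. This is an assumption made for illustration: the abstract does not specify the module's design, the function name `joint_attention` is hypothetical, and the sketch omits the learned query/key/value projections a trained model would normally include.

```python
import math


def joint_attention(features):
    """Scaled dot-product self-attention across joints (illustrative only).

    features: list of J joint feature vectors, each a list of floats.
    Returns (weights, attended), where weights[i][j] is how strongly
    joint i attends to joint j, and attended[i] is the weighted mix.
    """
    dim = len(features[0])
    scale = math.sqrt(dim)
    weights = []
    for query in features:
        scores = [sum(a * b for a, b in zip(query, key)) / scale
                  for key in features]
        top = max(scores)  # subtract the max for a numerically stable softmax
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights.append([e / total for e in exps])
    attended = [
        [sum(w[j] * features[j][d] for j in range(len(features)))
         for d in range(dim)]
        for w in weights
    ]
    return weights, attended


# Three joints with 2-D features (toy values).
FEATURES = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
WEIGHTS, ATTENDED = joint_attention(FEATURES)
```

Because the softmax assigns a nonzero weight to every pair of joints, each output vector mixes information from the whole skeleton, in contrast to a graph convolution that aggregates only over a joint's fixed neighborhood.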