Regional Attention with Architecture-Rebuilt 3D Network for RGB-D Gesture Recognition
Human gesture recognition has drawn much attention in the area of computer
vision. However, the performance of gesture recognition is always influenced by
some gesture-irrelevant factors like the background and the clothes of
performers. Therefore, focusing on the hand/arm regions is important for
gesture recognition. Meanwhile, an adaptive, architecture-searched network
structure can also perform better than block-fixed ones such as ResNet, since
it better increases the diversity of features at different stages of the
network. In this paper, we propose a regional attention with
architecture-rebuilt 3D network (RAAR3DNet) for gesture recognition. We replace
the fixed Inception modules with structures rebuilt automatically throughout
the network via Neural Architecture Search (NAS), owing to the different shapes
and representation abilities of features in the early, middle, and late stages
of the network. This enables the network to capture different levels of feature
representations at different layers more adaptively. Meanwhile, we also design
a stackable regional attention module called Dynamic-Static Attention (DSA),
which derives a Gaussian guidance heatmap and a dynamic motion map to highlight
the hand/arm regions and the motion information in the spatial and temporal
domains, respectively. Extensive experiments on two recent large-scale RGB-D
gesture datasets validate the effectiveness of the proposed method and show it
outperforms state-of-the-art methods. The code for our method is available at:
https://github.com/zhoubenjia/RAAR3DNet
Comment: Accepted by AAAI 202
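The abstract does not say how DSA computes its two maps. As a rough illustration only, the sketch below assumes that hand/arm keypoints are already available as (x, y) pixel coordinates, builds a Gaussian heatmap around them, and approximates a motion map with absolute frame differences. The function names, the `sigma` value, and the residual weighting in `apply_attention` are hypothetical and are not taken from the RAAR3DNet code.

```python
import numpy as np


def gaussian_heatmap(height, width, centers, sigma=8.0):
    """Static guidance map: pixelwise maximum of 2D Gaussians at (x, y) centers."""
    ys, xs = np.mgrid[0:height, 0:width]
    heat = np.zeros((height, width), dtype=np.float64)
    for cx, cy in centers:
        g = np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2.0 * sigma ** 2))
        heat = np.maximum(heat, g)
    return heat


def motion_map(frames):
    """Dynamic map: absolute differences of consecutive frames, scaled to [0, 1].

    `frames` has shape (T, H, W); the result has shape (T - 1, H, W).
    """
    diff = np.abs(np.diff(frames.astype(np.float64), axis=0))
    peak = diff.max()
    return diff / peak if peak > 0 else diff


def apply_attention(features, attention):
    """Residual weighting (an assumption): highlighted regions are amplified,
    the remaining regions pass through unchanged."""
    return features * (1.0 + attention)
```

With this weighting, a feature located exactly at a keypoint is doubled while distant background features are left almost untouched. A learned attention module would replace these fixed rules.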
Spatio-temporal reconstruction for 3D motion recovery
This paper addresses the challenge of 3D motion
recovery by exploiting the spatio-temporal correlations of
corrupted 3D skeleton sequences. We propose a new 3D motion
recovery method using spatio-temporal reconstruction, which uses
joint low-rank and sparse priors to exploit temporal correlation
and an isometric constraint for spatial correlation. The proposed
model is formulated as a constrained optimization problem,
which is efficiently solved by the augmented Lagrangian method
with a Gauss-Newton solver for the subproblem of isometric
optimization. Experimental results on the CMU motion capture
dataset, Edinburgh dataset and two Kinect datasets demonstrate
that the proposed approach achieves better motion recovery
than state-of-the-art methods. The proposed method is applicable
to Kinect-like skeleton tracking devices and pose estimation
methods that cannot provide accurate estimation of complex
motions, especially in the presence of occlusion.
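To make the low-rank and sparse priors concrete, the sketch below splits a matrix into a low-rank part plus a sparse part using the standard robust PCA formulation, solved with an inexact augmented Lagrangian scheme. It is not the paper's model: the isometric (spatial) constraint and the Gauss-Newton subproblem are omitted, and the function names and default parameters are illustrative assumptions. For skeleton data, one could assume each row holds the stacked joint coordinates of one frame.

```python
import numpy as np


def soft_threshold(x, tau):
    """Elementwise shrinkage, the proximal operator of the L1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def singular_value_threshold(x, tau):
    """Shrink singular values, the proximal operator of the nuclear norm."""
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    return (u * soft_threshold(s, tau)) @ vt


def rpca_alm(m, lam=None, rho=1.5, tol=1e-7, max_iter=500):
    """Split m into low + sparse by minimizing ||low||_* + lam * ||sparse||_1
    subject to low + sparse = m, with an inexact augmented Lagrangian loop."""
    m = np.asarray(m, dtype=np.float64)
    norm_m = np.linalg.norm(m)
    if norm_m == 0.0:
        return np.zeros_like(m), np.zeros_like(m)
    if lam is None:
        lam = 1.0 / np.sqrt(max(m.shape))   # common default weight
    mu = 1.25 / np.linalg.norm(m, 2)        # initial penalty parameter
    low = np.zeros_like(m)
    sparse = np.zeros_like(m)
    dual = np.zeros_like(m)
    for _ in range(max_iter):
        low = singular_value_threshold(m - sparse + dual / mu, 1.0 / mu)
        sparse = soft_threshold(m - low + dual / mu, lam / mu)
        residual = m - low - sparse
        dual = dual + mu * residual         # multiplier (dual) update
        mu *= rho                           # tighten the constraint each pass
        if np.linalg.norm(residual) <= tol * norm_m:
            break
    return low, sparse
```

Under these assumptions, `low` would play the role of the temporally correlated motion and `sparse` would absorb isolated corruptions such as occluded joints.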