4,690 research outputs found
Two-Stream RNN/CNN for Action Recognition in 3D Videos
The recognition of actions from video sequences has many applications in
health monitoring, assisted living, surveillance, and smart homes. Despite
advances in sensing, in particular related to 3D video, the methodologies to
process the data are still subject to research. We demonstrate superior results
by a system which combines recurrent neural networks with convolutional neural
networks in a voting approach. The gated-recurrent-unit-based neural networks
are particularly well-suited to distinguish actions based on long-term
information from optical tracking data; the 3D-CNNs focus more on detailed,
recent information from video data. The resulting features are merged in an SVM
which then classifies the movement. In this architecture, our method improves
recognition rates of state-of-the-art methods by 14% on standard data sets.Comment: Published in 2017 IEEE/RSJ International Conference on Intelligent
Robots and Systems (IROS
Recurrent Attention Models for Depth-Based Person Identification
We present an attention-based model that reasons on human body shape and
motion dynamics to identify individuals in the absence of RGB information,
hence in the dark. Our approach leverages unique 4D spatio-temporal signatures
to address the identification problem across days. Formulated as a
reinforcement learning task, our model is based on a combination of
convolutional and recurrent neural networks with the goal of identifying small,
discriminative regions indicative of human identity. We demonstrate that our
model produces state-of-the-art results on several published datasets given
only depth images. We further study the robustness of our model towards
viewpoint, appearance, and volumetric changes. Finally, we share insights
gleaned from interpretable 2D, 3D, and 4D visualizations of our model's
spatio-temporal attention.Comment: Computer Vision and Pattern Recognition (CVPR) 201
Skeleton-Based Human Action Recognition with Global Context-Aware Attention LSTM Networks
Human action recognition in 3D skeleton sequences has attracted a lot of
research attention. Recently, Long Short-Term Memory (LSTM) networks have shown
promising performance in this task due to their strengths in modeling the
dependencies and dynamics in sequential data. As not all skeletal joints are
informative for action recognition, and the irrelevant joints often bring noise
which can degrade the performance, we need to pay more attention to the
informative ones. However, the original LSTM network does not have explicit
attention ability. In this paper, we propose a new class of LSTM network,
Global Context-Aware Attention LSTM (GCA-LSTM), for skeleton based action
recognition. This network is capable of selectively focusing on the informative
joints in each frame of each skeleton sequence by using a global context memory
cell. To further improve the attention capability of our network, we also
introduce a recurrent attention mechanism, with which the attention performance
of the network can be enhanced progressively. Moreover, we propose a stepwise
training scheme in order to train our network effectively. Our approach
achieves state-of-the-art performance on five challenging benchmark datasets
for skeleton based action recognition
Co-occurrence Feature Learning for Skeleton based Action Recognition using Regularized Deep LSTM Networks
Skeleton based action recognition distinguishes human actions using the
trajectories of skeleton joints, which provide a very good representation for
describing actions. Considering that recurrent neural networks (RNNs) with Long
Short-Term Memory (LSTM) can learn feature representations and model long-term
temporal dependencies automatically, we propose an end-to-end fully connected
deep LSTM network for skeleton based action recognition. Inspired by the
observation that the co-occurrences of the joints intrinsically characterize
human actions, we take the skeleton as the input at each time slot and
introduce a novel regularization scheme to learn the co-occurrence features of
skeleton joints. To train the deep LSTM network effectively, we propose a new
dropout algorithm which simultaneously operates on the gates, cells, and output
responses of the LSTM neurons. Experimental results on three human action
recognition datasets consistently demonstrate the effectiveness of the proposed
model.Comment: AAAI 2016 conferenc
- …