1,428 research outputs found
Investigation of Different Skeleton Features for CNN-based 3D Action Recognition
Deep learning techniques are being used in skeleton based action recognition
tasks and outstanding performance has been reported. Compared with RNN based
methods which tend to overemphasize temporal information, CNN-based approaches
can jointly capture spatio-temporal information from texture color images
encoded from skeleton sequences. There are several skeleton-based features that
have proven effective in RNN-based and handcrafted-feature-based methods.
However, it remains unknown whether they are suitable for CNN-based approaches.
This paper proposes to encode five spatial skeleton features into images with
different encoding methods. In addition, the performance implication of
different joints used for feature extraction is studied. The proposed method
achieved state-of-the-art performance on NTU RGB+D dataset for 3D human action
analysis. An accuracy of 75.32\% was achieved in Large Scale 3D Human Activity
Analysis Challenge in Depth Videos
Deep Affordance-grounded Sensorimotor Object Recognition
It is well-established by cognitive neuroscience that human perception of
objects constitutes a complex process, where object appearance information is
combined with evidence about the so-called object "affordances", namely the
types of actions that humans typically perform when interacting with them. This
fact has recently motivated the "sensorimotor" approach to the challenging task
of automatic object recognition, where both information sources are fused to
improve robustness. In this work, the aforementioned paradigm is adopted,
surpassing current limitations of sensorimotor object recognition research.
Specifically, the deep learning paradigm is introduced to the problem for the
first time, developing a number of novel neuro-biologically and
neuro-physiologically inspired architectures that utilize state-of-the-art
neural networks for fusing the available information sources in multiple ways.
The proposed methods are evaluated using a large RGB-D corpus, which is
specifically collected for the task of sensorimotor object recognition and is
made publicly available. Experimental results demonstrate the utility of
affordance information to object recognition, achieving an up to 29% relative
error reduction by its inclusion.Comment: 9 pages, 7 figures, dataset link included, accepted to CVPR 201
- …