When Kernel Methods meet Feature Learning: Log-Covariance Network for Action Recognition from Skeletal Data
Human action recognition from skeletal data is a hot research topic and
important in many open domain applications of computer vision, thanks to
recently introduced 3D sensors. In the literature, naive methods simply
transfer off-the-shelf techniques from video to the skeletal representation.
However, the current state-of-the-art is contended between two different
paradigms: kernel-based methods and feature learning with (recurrent) neural
networks. Both approaches show strong performances, yet they exhibit heavy, but
complementary, drawbacks. Motivated by this fact, our work aims at combining
together the best of the two paradigms, by proposing an approach where a
shallow network is fed with a covariance representation. Our intuition is that,
as long as the dynamics is effectively modeled, there is no need for the
classification network to be deep nor recurrent in order to score favorably. We
validate this hypothesis in a broad experimental analysis over 6 publicly
available datasets.
Comment: 2017 IEEE Computer Vision and Pattern Recognition (CVPR) Workshop
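The covariance representation mentioned in this abstract can be illustrated with a minimal sketch. It assumes that "log-covariance" means the matrix logarithm of the covariance of per-frame joint coordinates; the function name, the `eps` regularizer, and the toy data are hypothetical and are not taken from the paper, which may compute its descriptor differently.

```python
import numpy as np

def log_covariance_descriptor(sequence, eps=1e-6):
    """Map a (T, D) skeleton sequence to a fixed-length vector.

    T is the number of frames, D the number of flattened joint coordinates.
    Assumption: the descriptor is the matrix logarithm of the regularized
    covariance over time, vectorized by its upper triangle.
    """
    x = np.asarray(sequence, dtype=float)
    cov = np.cov(x, rowvar=False)              # (D, D) covariance over frames
    cov = cov + eps * np.eye(cov.shape[0])     # keep the matrix strictly positive definite
    w, v = np.linalg.eigh(cov)                 # eigendecomposition of a symmetric matrix
    log_cov = (v * np.log(w)) @ v.T            # matrix log via the eigenvalues
    iu = np.triu_indices(log_cov.shape[0])
    return log_cov[iu]                         # D * (D + 1) / 2 entries

# Toy data: 40 frames, 5 joints with 3 coordinates each (purely illustrative).
rng = np.random.default_rng(0)
toy_sequence = rng.normal(size=(40, 5 * 3))
feature = log_covariance_descriptor(toy_sequence)
```

The resulting vector has the same length regardless of the number of frames, which is what would allow a shallow, non-recurrent classifier to consume sequences of varying duration.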
Riemannian Self-Attention Mechanism for SPD Networks
Symmetric positive definite (SPD) matrix has been demonstrated to be an
effective feature descriptor in many scientific areas, as it can encode
spatiotemporal statistics of the data adequately on a curved Riemannian
manifold, i.e., SPD manifold. Although there are many different ways to design
network architectures for SPD matrix nonlinear learning, very few solutions
explicitly mine the geometrical dependencies of features at different layers.
Motivated by the great success of self-attention mechanism in capturing
long-range relationships, an SPD manifold self-attention mechanism (SMSA) is
proposed in this paper using some manifold-valued geometric operations, mainly
the Riemannian metric, Riemannian mean, and Riemannian optimization. Then, an
SMSA-based geometric learning module (SMSA-GLM) is designed for the sake of
improving the discrimination of the generated deep structured representations.
Extensive experimental results achieved on three benchmarking datasets show
that our modification against the baseline network further alleviates the
information degradation problem and leads to improved accuracy.
Comment: 14 pages, 10 figures, 5 tables
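To make the terms "Riemannian metric" and "Riemannian mean" on the SPD manifold concrete, here is a minimal sketch using the log-Euclidean metric. This metric is chosen only because it has a closed-form mean; the abstract does not say which metric SMSA uses, and the helper names below are hypothetical.

```python
import numpy as np

def spd_log(m):
    """Matrix logarithm of an SPD matrix via eigendecomposition."""
    w, v = np.linalg.eigh(m)
    return (v * np.log(w)) @ v.T

def spd_exp(s):
    """Matrix exponential of a symmetric matrix; the result is SPD."""
    w, v = np.linalg.eigh(s)
    return (v * np.exp(w)) @ v.T

def log_euclidean_distance(a, b):
    """Log-Euclidean distance: Frobenius norm of the difference of logs."""
    return np.linalg.norm(spd_log(a) - spd_log(b), ord="fro")

def log_euclidean_mean(mats):
    """Log-Euclidean mean: exponential of the arithmetic mean of the logs."""
    return spd_exp(np.mean([spd_log(m) for m in mats], axis=0))

def random_spd(rng, n):
    """Illustrative helper: a well-conditioned random SPD matrix."""
    x = rng.normal(size=(n, n))
    return x @ x.T + n * np.eye(n)

rng = np.random.default_rng(0)
mats = [random_spd(rng, 4) for _ in range(3)]
mean = log_euclidean_mean(mats)
```

Unlike the plain arithmetic average of the matrices, this mean is computed in the logarithmic domain and mapped back, so it respects the chosen geometry and always stays on the SPD manifold.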
Building Neural Networks on Matrix Manifolds: A Gyrovector Space Approach
Matrix manifolds, such as manifolds of Symmetric Positive Definite (SPD)
matrices and Grassmann manifolds, appear in many applications. Recently, by
applying the theory of gyrogroups and gyrovector spaces that is a powerful
framework for studying hyperbolic geometry, some works have attempted to build
principled generalizations of Euclidean neural networks on matrix manifolds.
However, due to the lack of many concepts in gyrovector spaces for the
considered manifolds, e.g., the inner product and gyroangles, techniques and
mathematical tools provided by these works are still limited compared to those
developed for studying hyperbolic geometry. In this paper, we generalize some
notions in gyrovector spaces for SPD and Grassmann manifolds, and propose new
models and layers for building neural networks on these manifolds. We show the
effectiveness of our approach in two applications, i.e., human action
recognition and knowledge graph completion.
Local Spherical Harmonics Improve Skeleton-Based Hand Action Recognition
Hand action recognition is essential. Communication, human-robot
interactions, and gesture control are dependent on it. Skeleton-based action
recognition traditionally includes hands, which belong to the classes which
remain challenging to correctly recognize to date. We propose a method
specifically designed for hand action recognition which uses relative angular
embeddings and local Spherical Harmonics to create novel hand representations.
The use of Spherical Harmonics creates rotation-invariant representations which
make hand action recognition even more robust against inter-subject differences
and viewpoint changes. We conduct extensive experiments on the hand joints in
the First-Person Hand Action Benchmark with RGB-D Videos and 3D Hand Pose
Annotations, and on the NTU RGB+D 120 dataset, demonstrating the benefit of
using Local Spherical Harmonics Representations. Our code is available at
https://github.com/KathPra/LSHR_LSHT