300 research outputs found
Accurate 3D Action Recognition using Learning on the Grassmann Manifold
In this paper we address the problem of modelling and analyzing human motion by focusing on 3D body skeletons. In particular, our intent is to represent skeletal motion in a geometric and efficient way, leading to an accurate action-recognition system. Here an action is represented by a dynamical system whose observability matrix is characterized as an element of a Grassmann manifold. To formulate our learning algorithm, we propose two distinct ideas: (1) In the first, we perform classification using a Truncated Wrapped Gaussian model, one for each class in its own tangent space. (2) In the second, we propose a novel learning algorithm that uses a vector representation formed by concatenating local coordinates in tangent spaces associated with different classes and training a linear SVM. We evaluate our approaches on three public 3D action datasets: MSR-action 3D, UT-kinect and UCF-kinect; these datasets represent different kinds of challenges and together help provide an exhaustive evaluation. The results show that our approaches either match or exceed state-of-the-art performance, reaching 91.21% on MSR-action 3D, 97.91% on UCF-kinect, and 88.5% on UT-kinect. Finally, we evaluate the latency, i.e. the ability to recognize an action before its termination, of our approach and demonstrate improvements relative to other published approaches.
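As a reading aid, the step "observability matrix as an element of a Grassmann manifold" can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a linear dynamical system with a transition matrix A and an observation matrix C, stacks C, CA, ..., CA^(m-1) into a truncated observability matrix, and orthonormalizes its columns so that their span can be treated as a point on the Grassmannian. The function names, the toy matrices and the truncation length m = 3 are all hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def truncated_observability(c, a, m):
    """Stack C, CA, ..., CA^(m-1) vertically (truncated observability matrix)."""
    blocks, current = [], c
    for _ in range(m):
        blocks.extend(current)
        current = matmul(current, a)
    return blocks


def orthonormalize_columns(mat):
    """Modified Gram-Schmidt on the columns; assumes they are linearly independent."""
    basis = []
    for col in zip(*mat):
        v = list(col)
        for q in basis:
            dot = sum(vi * qi for vi, qi in zip(v, q))
            v = [vi - dot * qi for vi, qi in zip(v, q)]
        norm = sum(vi * vi for vi in v) ** 0.5
        basis.append([vi / norm for vi in v])
    # Return in row-major form; the columns are orthonormal.
    return [list(row) for row in zip(*basis)]


# Toy system (hypothetical values): 2 hidden states, 3 observed coordinates.
A = [[0.9, 0.1], [0.0, 0.8]]
C = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
U = orthonormalize_columns(truncated_observability(C, A, 3))
```

Here U is a 9 x 2 matrix whose columns form an orthonormal basis of the subspace; any other orthonormal basis of the same span represents the same point on the manifold.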
Disturbance Grassmann Kernels for Subspace-Based Learning
In this paper, we focus on subspace-based learning problems, where data
elements are linear subspaces instead of vectors. To handle this kind of data,
Grassmann kernels were proposed to measure the space structure and used with
classifiers, e.g., Support Vector Machines (SVMs). However, the existing
discriminative algorithms mostly ignore the instability of subspaces, which
can cause classifiers to be misled by disturbed instances. Thus we propose
considering all potential disturbance of subspaces in learning processes to
obtain more robust classifiers. Firstly, we derive the dual optimization of
linear classifiers with disturbance subject to a known distribution, resulting
in a new kernel, the Disturbance Grassmann (DG) kernel. Secondly, we investigate
two kinds of disturbance, relevant to the subspace matrix and singular values
of bases, with which we extend the Projection kernel on Grassmann manifolds to
two new kernels. Experiments on action data indicate that the proposed kernels
perform better compared to state-of-the-art subspace-based methods, even in a
worse environment.
Comment: This paper includes 3 figures, 10 pages, and has been accepted to
SIGKDD'1
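For orientation, the Projection kernel that the abstract above extends is commonly written as k(X, Y) = ||X^T Y||_F^2, where X and Y are matrices with orthonormal columns spanning the two subspaces. The sketch below implements only this base kernel, not the proposed DG kernels, whose form is not given in the abstract. The function name and the toy subspaces of R^3 are illustrative assumptions.

```python
def projection_kernel(x, y):
    """k(X, Y) = ||X^T Y||_F^2 for matrices with orthonormal columns (lists of rows)."""
    total = 0.0
    for x_col in zip(*x):
        for y_col in zip(*y):
            # Each entry of X^T Y is an inner product of one column of X with one of Y.
            total += sum(a * b for a, b in zip(x_col, y_col)) ** 2
    return total


s = 0.5 ** 0.5
X = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]  # span{e1, e2}
Y = [[1.0, 0.0], [0.0, 0.0], [0.0, 1.0]]  # span{e1, e3}
Z = [[s, s], [s, -s], [0.0, 0.0]]         # span{e1, e2} again, different basis

k_xx = projection_kernel(X, X)
k_xy = projection_kernel(X, Y)
k_xz = projection_kernel(X, Z)
```

The value depends only on the subspaces, not on the chosen bases: X and Z span the same plane, so k_xz equals k_xx (the subspace dimension, 2), while the planes spanned by X and Y share one direction and give k_xy = 1.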
Deep Learning on Lie Groups for Skeleton-based Action Recognition
In recent years, skeleton-based action recognition has become a popular 3D
classification problem. State-of-the-art methods typically first represent each
motion sequence as a high-dimensional trajectory on a Lie group with an
additional dynamic time warping, and then shallowly learn favorable Lie group
features. In this paper we incorporate the Lie group structure into a deep
network architecture to learn more appropriate Lie group features for 3D action
recognition. Within the network structure, we design rotation mapping layers to
transform the input Lie group features into desirable ones, which are aligned
better in the temporal domain. To reduce the high feature dimensionality, the
architecture is equipped with rotation pooling layers for the elements on the
Lie group. Furthermore, we propose a logarithm mapping layer to map the
resulting manifold data into a tangent space that facilitates the application
of regular output layers for the final classification. Evaluations of the
proposed network for standard 3D human action recognition datasets clearly
demonstrate its superiority over existing shallow Lie group feature learning
methods as well as most conventional deep learning methods.
Comment: Accepted to CVPR 201
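To make the idea of a logarithm mapping concrete, the sketch below computes the standard matrix logarithm of a single 3D rotation and returns it as an axis-angle vector in the tangent space at the identity. It is an assumption that this matches the spirit of the layer described above; the abstract does not give the layer's exact definition, and the function name and the test rotation are hypothetical. The sketch also assumes the rotation angle stays away from pi, where this formula is numerically unstable.

```python
import math


def so3_log(r):
    """Map a 3x3 rotation matrix (list of rows) to its axis-angle 3-vector."""
    trace = r[0][0] + r[1][1] + r[2][2]
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))  # clamp against round-off
    theta = math.acos(cos_theta)
    # Vector part of the skew-symmetric matrix (R - R^T) / 2, equal to sin(theta) * axis.
    w = [
        (r[2][1] - r[1][2]) / 2.0,
        (r[0][2] - r[2][0]) / 2.0,
        (r[1][0] - r[0][1]) / 2.0,
    ]
    if theta < 1e-8:
        return w  # near the identity, log(R) is approximately (R - R^T) / 2
    if math.pi - theta < 1e-6:
        raise ValueError("rotation angle too close to pi for this simple formula")
    scale = theta / math.sin(theta)
    return [scale * wi for wi in w]


# Rotation by 0.5 rad about the z-axis (toy input).
c, s = math.cos(0.5), math.sin(0.5)
R = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
v = so3_log(R)
```

The result v is an ordinary vector (here the z-axis scaled by the angle 0.5), which is the property that lets regular fully connected or softmax layers operate on the mapped features.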
Grassmann Learning for Recognition and Classification
Computational performance associated with high-dimensional data is a common challenge for real-world classification and recognition systems. Subspace learning has received considerable attention as a means of finding an efficient low-dimensional representation that leads to better classification and efficient processing. A Grassmann manifold is a space that promotes smooth surfaces, where points represent subspaces and the relationship between points is defined by a mapping of an orthogonal matrix. Grassmann learning involves embedding high dimensional subspaces and kernelizing the embedding onto a projection space where distance computations can be effectively performed. In this dissertation, Grassmann learning and its benefits towards action classification and face recognition in terms of accuracy and performance are investigated and evaluated. Grassmannian Sparse Representation (GSR) and Grassmannian Spectral Regression (GRASP) are proposed as Grassmann inspired subspace learning algorithms. GSR is a novel subspace learning algorithm that combines the benefits of Grassmann manifolds with sparse representations using least squares loss ℓ1-norm minimization for improved classification. GRASP is a novel subspace learning algorithm that leverages the benefits of Grassmann manifolds and Spectral Regression in a framework that supports high discrimination between classes and achieves computational benefits by using manifold modeling and avoiding eigen-decomposition. The effectiveness of GSR and GRASP is demonstrated for computationally intensive classification problems: (a) multi-view action classification using the IXMAS Multi-View dataset, the i3DPost Multi-View dataset, and the WVU Multi-View dataset, (b) 3D action classification using the MSRAction3D dataset and MSRGesture3D dataset, and (c) face recognition using the ATT Face Database, Labeled Faces in the Wild (LFW), and the Extended Yale Face Database B (YALE).
Additional contributions include the definition of Motion History Surfaces (MHS) and Motion Depth Surfaces (MDS) as descriptors suitable for activity representations in video sequences and 3D depth sequences. An in-depth analysis of Grassmann metrics is applied to high-dimensional data with different levels of noise and data distributions, which reveals that standardized Grassmann kernels are favorable over geodesic metrics on a Grassmann manifold. Finally, an extensive performance analysis is made that supports Grassmann subspace learning as an effective approach for classification and recognition.
Activity Representation from Video Using Statistical Models on Shape Manifolds
Activity recognition from video data is a key computer vision problem with applications in surveillance, elderly care, etc. This problem is associated with modeling a representative shape which contains significant information about the underlying activity. In this dissertation, we present several approaches for view-invariant activity recognition via modeling shapes on various shape spaces and Riemannian manifolds.
The first two parts of this dissertation deal with activity modeling and recognition using tracks of landmark feature points. The motion trajectories of points extracted from objects involved in the activity are used to build deformation shape models for each activity, and these models are used for classification and detection of unusual activities. In the first part of the dissertation, these models are represented by the recovered 3D deformation basis shapes corresponding to the activity using a non-rigid structure from motion formulation. We use a theory for estimating the amount of deformation for these models from the visual data. We study the special case of ground plane activities in detail because of its importance in video surveillance applications.
In the second part of the dissertation, we propose to model the activity by learning an affine invariant deformation subspace representation that captures the space of possible body poses associated with the activity. These subspaces can be viewed as points on a Grassmann manifold. We propose several statistical classification models on Grassmann manifold that capture the statistical variations of the shape data while following the intrinsic Riemannian geometry of these manifolds.
The last part of this dissertation addresses the problem of recognizing human gestures from silhouette images. We represent a human gesture as a temporal sequence of human poses, each characterized by a contour of the associated human silhouette. The shape of a contour is viewed as a point on the shape space of closed curves and, hence, each gesture is characterized and modeled as a trajectory on this shape space. We utilize the Riemannian geometry of this space to propose a template-based and a graphical-based approach for modeling these trajectories. The two models are designed to account for the different invariance requirements in gesture recognition, and also to capture the statistical variations associated with the contour data.