Expanding the Family of Grassmannian Kernels: An Embedding Perspective
Modeling videos and image-sets as linear subspaces has proven beneficial for
many visual recognition tasks. However, it also incurs challenges arising from
the fact that linear subspaces do not obey Euclidean geometry, but lie on a
special type of Riemannian manifold known as the Grassmannian. To leverage the
techniques developed for Euclidean spaces (e.g., support vector machines) with
subspaces, several recent studies have proposed to embed the Grassmannian into
a Hilbert space by making use of a positive definite kernel. Unfortunately,
only two Grassmannian kernels are known, neither of which, as we will show, is
universal, which limits their ability to approximate a target function
arbitrarily well. Here, we introduce several positive definite Grassmannian
kernels, including universal ones, and demonstrate their superiority over
previously-known kernels in various tasks, such as classification, clustering,
sparse coding and hashing.
Log-Euclidean Bag of Words for Human Action Recognition
Representing videos by densely extracted local space-time features has
recently become a popular approach for analysing actions. In this paper, we
tackle the problem of categorising human actions by devising Bag of Words (BoW)
models based on covariance matrices of spatio-temporal features, with the
features formed from histograms of optical flow. Since covariance matrices form
a special type of Riemannian manifold, the space of Symmetric Positive Definite
(SPD) matrices, non-Euclidean geometry should be taken into account while
discriminating between covariance matrices. To this end, we propose to embed
SPD manifolds to Euclidean spaces via a diffeomorphism and extend the BoW
approach to its Riemannian version. The proposed BoW approach takes into
account the manifold geometry of SPD matrices during the generation of the
codebook and histograms. Experiments on challenging human action datasets show
that the proposed method obtains notable improvements in discrimination
accuracy, in comparison to several state-of-the-art methods.
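The embedding step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the diffeomorphism is the matrix logarithm (as the title's "Log-Euclidean" suggests), restricts itself to 2x2 matrices so that only the standard library is needed, and uses hypothetical function names (`spd_log_2x2`, `log_euclidean_vector`, `log_euclidean_distance`).

```python
import math

def spd_log_2x2(S):
    """Matrix logarithm of a 2x2 symmetric positive definite matrix [[a, b], [b, c]]."""
    (a, b), (_, c) = S
    mean = 0.5 * (a + c)
    radius = math.hypot(0.5 * (a - c), b)
    lam1, lam2 = mean + radius, mean - radius  # eigenvalues, lam1 >= lam2 > 0
    if radius < 1e-12:
        # S is (numerically) a multiple of the identity.
        l = math.log(mean)
        return [[l, 0.0], [0.0, l]]
    # For distinct eigenvalues: log(S) = log(lam2) * I + slope * (S - lam2 * I).
    slope = (math.log(lam1) - math.log(lam2)) / (lam1 - lam2)
    offset = math.log(lam2) - slope * lam2
    return [[slope * a + offset, slope * b], [slope * b, slope * c + offset]]

def log_euclidean_vector(S):
    """Flatten log(S) so that the Euclidean norm equals the Frobenius norm of log(S)."""
    L = spd_log_2x2(S)
    return [L[0][0], L[1][1], math.sqrt(2.0) * L[0][1]]

def log_euclidean_distance(A, B):
    """Euclidean distance between the embedded (log-mapped) matrices."""
    va, vb = log_euclidean_vector(A), log_euclidean_vector(B)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(va, vb)))

identity = [[1.0, 0.0], [0.0, 1.0]]
scaled = [[math.e, 0.0], [0.0, math.e ** 2]]
distance = log_euclidean_distance(identity, scaled)
```

Once covariance matrices are mapped to such vectors, an ordinary Euclidean codebook (for example k-means) can be built on them; that step is omitted here.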
The space of essential matrices as a Riemannian quotient manifold
The essential matrix, which encodes the epipolar constraint between points in two projective views,
is a cornerstone of modern computer vision. Previous works have proposed different characterizations
of the space of essential matrices as a Riemannian manifold. However, they either do not consider the
symmetric role played by the two views, or do not fully take into account the geometric peculiarities
of the epipolar constraint. We address these limitations with a characterization as a quotient manifold
which can be easily interpreted in terms of camera poses. While our main focus is on theoretical
aspects, we include applications to optimization problems in computer vision.
This work was supported by grants NSF-IIP-0742304, NSF-OIA-1028009, ARL MAST-CTA W911NF-08-2-0004, ARL RCTA W911NF-10-2-0016, NSF-DGE-0966142, and NSF-IIS-1317788.
Extrinsic Methods for Coding and Dictionary Learning on Grassmann Manifolds
Sparsity-based representations have recently led to notable results in
various visual recognition tasks. In a separate line of research, Riemannian
manifolds have been shown useful for dealing with features and models that do
not lie in Euclidean spaces. With the aim of building a bridge between the two
realms, we address the problem of sparse coding and dictionary learning over
the space of linear subspaces, which form Riemannian structures known as
Grassmann manifolds. To this end, we propose to embed Grassmann manifolds into
the space of symmetric matrices by an isometric mapping. This in turn enables
us to extend two sparse coding schemes to Grassmann manifolds. Furthermore, we
propose closed-form solutions for learning a Grassmann dictionary, atom by
atom. Lastly, to handle non-linearity in data, we extend the proposed Grassmann
sparse coding and dictionary learning algorithms through embedding into Hilbert
spaces.
Experiments on several classification tasks (gender recognition, gesture
classification, scene analysis, face recognition, action recognition and
dynamic texture classification) show that the proposed approaches achieve
considerable improvements in discrimination accuracy, in comparison to
state-of-the-art methods such as kernelized Affine Hull Method and
graph-embedding Grassmann discriminant analysis.
Comment: Appearing in International Journal of Computer Vision
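The embedding of Grassmann manifolds into the space of symmetric matrices mentioned in this abstract can be illustrated with a short sketch. It assumes the mapping is the projection embedding, which sends an orthonormal basis X to X Xᵀ; the abstract does not spell out the formula, so treat that as an assumption. The function names are hypothetical and matrices are plain lists of rows.

```python
import math

def projection_embedding(X):
    """Map an n x p orthonormal basis X (list of rows) to the n x n symmetric matrix X X^T."""
    n, p = len(X), len(X[0])
    return [[sum(X[i][k] * X[j][k] for k in range(p)) for j in range(n)]
            for i in range(n)]

def frobenius_distance(A, B):
    """Frobenius norm of A - B for two matrices of the same shape."""
    return math.sqrt(sum((a - b) ** 2
                         for row_a, row_b in zip(A, B)
                         for a, b in zip(row_a, row_b)))

# Three bases for one-dimensional subspaces (lines) of R^2.
x_axis = [[1.0], [0.0]]
x_axis_flipped = [[-1.0], [0.0]]  # same line, opposite basis vector
y_axis = [[0.0], [1.0]]

same_subspace = frobenius_distance(projection_embedding(x_axis),
                                   projection_embedding(x_axis_flipped))
orthogonal_subspaces = frobenius_distance(projection_embedding(x_axis),
                                          projection_embedding(y_axis))
```

The embedded matrix depends only on the subspace and not on the basis chosen for it, which is why the two bases of the x-axis land on the same point.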
Quantization and clustering on Riemannian manifolds with an application to air traffic analysis
The goal of quantization is to find the best approximation of a probability distribution by a discrete measure with finite support. When dealing with empirical distributions, this boils down to finding the best summary of the data by a smaller number of points, and automatically yields a k-means-type clustering. In this paper, we introduce Competitive Learning Riemannian Quantization (CLRQ), an online quantization algorithm that applies when the data does not belong to a vector space, but rather to a Riemannian manifold. It can be seen as a density approximation procedure as well as a clustering method. Compared to many clustering algorithms, it requires few distance computations, which is particularly advantageous computationally in the manifold setting. We prove its convergence and show simulated examples on the sphere and the hyperbolic plane. We also provide an application to real data by using CLRQ to create summaries of images of covariance matrices estimated from air traffic images. These summaries are representative of the air traffic complexity and yield clusterings of the airspaces into zones that are homogeneous with respect to that criterion. They can then be compared using discrete optimal transport and be further used as inputs of a machine learning algorithm or as indexes in a traffic database.
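The online update described in this abstract can be sketched on the unit sphere as follows. This is an illustrative competitive-learning loop, not the published CLRQ implementation: the step-size schedule (one over the per-center update count), the initial centers, and the synthetic data are all assumptions, and the function names are hypothetical.

```python
import math
import random

def normalize(v):
    """Scale a 3-vector to unit length so it lies on the sphere."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def geodesic_distance(a, b):
    """Great-circle distance between two unit vectors."""
    dot = max(-1.0, min(1.0, sum(x * y for x, y in zip(a, b))))
    return math.acos(dot)

def move_along_geodesic(c, x, step):
    """Move c a fraction `step` of the way to x along the connecting great circle."""
    theta = geodesic_distance(c, x)
    if theta < 1e-12:
        return list(c)
    wc = math.sin((1.0 - step) * theta) / math.sin(theta)
    wx = math.sin(step * theta) / math.sin(theta)
    return normalize([wc * ci + wx * xi for ci, xi in zip(c, x)])

def competitive_quantization(samples, initial_centers):
    """For each sample, pull only the nearest center toward it (assumed 1/count steps)."""
    centers = [list(c) for c in initial_centers]
    counts = [1] * len(centers)
    for x in samples:
        i = min(range(len(centers)), key=lambda k: geodesic_distance(centers[k], x))
        counts[i] += 1
        centers[i] = move_along_geodesic(centers[i], x, 1.0 / counts[i])
    return centers

# Synthetic data: points scattered near the north and south poles.
rng = random.Random(0)
samples = []
for _ in range(200):
    pole = rng.choice([1.0, -1.0])
    samples.append(normalize([rng.uniform(-0.1, 0.1), rng.uniform(-0.1, 0.1), pole]))

centers = competitive_quantization(
    samples, [normalize([1.0, 0.0, 1.0]), normalize([1.0, 0.0, -1.0])])
```

Each sample costs one distance computation per center and a single geodesic step, which is consistent with the abstract's remark that few distance computations are needed.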