Optimal Riemannian quantization with an application to air traffic analysis
The goal of optimal quantization is to find the best approximation of a
probability distribution by a discrete measure with finite support. When
dealing with empirical distributions, this boils down to finding the best
summary of the data by a smaller number of points, and automatically yields a
K-means-type clustering. In this paper, we introduce Competitive Learning
Riemannian Quantization (CLRQ), an online algorithm that computes the optimal
summary when the data does not belong to a vector space, but rather a
Riemannian manifold. We prove its convergence and show simulated examples on
the sphere and the hyperbolic plane. We also provide an application to real
data by using CLRQ to create summaries of images of covariance matrices
estimated from air traffic images. These summaries are representative of the
air traffic complexity and yield clusterings of the airspaces into zones that
are homogeneous with respect to that criterion. They can then be compared using
discrete optimal transport and be further used as inputs of a machine learning
algorithm or as indexes in a traffic database.
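As a rough illustration of the competitive-learning idea described above, the following minimal sketch runs one online pass on the unit sphere: each sample pulls its nearest center a small step along the connecting geodesic. It is not the authors' CLRQ implementation; the function names, the 1/(n+1) step schedule, and the random test data are all assumptions made for this example.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(u):
    norm = math.sqrt(dot(u, u))
    return [a / norm for a in u]


def geodesic_distance(u, v):
    # Great-circle distance between unit vectors; the clamp guards against rounding.
    return math.acos(max(-1.0, min(1.0, dot(u, v))))


def move_toward(a, x, t):
    """Point at fraction t along the minimizing geodesic from a to x."""
    theta = geodesic_distance(a, x)
    s = math.sin(theta)
    if s < 1e-12:  # a and x coincide (or are antipodal): leave a unchanged
        return list(a)
    wa = math.sin((1.0 - t) * theta) / s
    wx = math.sin(t * theta) / s
    return normalize([wa * ai + wx * xi for ai, xi in zip(a, x)])


def competitive_learning_quantization(samples, centers):
    """One online pass: each sample pulls its nearest center toward itself."""
    centers = [list(c) for c in centers]
    for n, x in enumerate(samples, start=1):
        winner = min(range(len(centers)),
                     key=lambda i: geodesic_distance(centers[i], x))
        step = 1.0 / (n + 1)  # assumed decreasing step-size schedule
        centers[winner] = move_toward(centers[winner], x, step)
    return centers


def random_point_on_sphere(rng):
    # Normalized Gaussian vectors are uniformly distributed on the sphere.
    return normalize([rng.gauss(0.0, 1.0) for _ in range(3)])


# Hypothetical data: 200 random points summarized by 4 centers.
rng = random.Random(0)
samples = [random_point_on_sphere(rng) for _ in range(200)]
centers = competitive_learning_quantization(samples, samples[:4])
midpoint = move_toward([1.0, 0.0, 0.0], [0.0, 1.0, 0.0], 0.5)
```

Assigning every sample to its nearest final center would then give the K-means-type clustering the abstract mentions.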
Clustering in Hilbert simplex geometry
Clustering categorical distributions in the probability simplex is a
fundamental task met in many applications dealing with normalized histograms.
Traditionally, the differential-geometric structures of the probability simplex
have been used either by (i) setting the Riemannian metric tensor to the Fisher
information matrix of the categorical distributions, or (ii) defining the
dualistic information-geometric structure induced by a smooth dissimilarity
measure, the Kullback-Leibler divergence. In this work, we introduce for this
clustering task a novel computationally-friendly framework for modeling the
probability simplex termed Hilbert simplex geometry. In the Hilbert
simplex geometry, the distance function is described by a polytope. We discuss
the pros and cons of those different statistical modelings, and experimentally
benchmark these geometries for center-based k-means and k-center
clusterings. We show that the Hilbert metric in the probability simplex satisfies
the property of information monotonicity. Furthermore, since a canonical
Hilbert metric distance can be defined on any bounded convex subset of the
Euclidean space, we also consider Hilbert's projective geometry of the
elliptope of correlation matrices and study its clustering performance.
Comment: 42 pages
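For orientation, the Hilbert metric on the open probability simplex has a simple closed form, the logarithm of the largest coordinate ratio divided by the smallest. The sketch below uses that standard formula; the function name and the example histograms are assumptions for illustration and are not taken from the paper.

```python
import math


def hilbert_simplex_distance(p, q):
    """Hilbert metric between two strictly positive histograms p and q."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of bins")
    if any(x <= 0 for x in list(p) + list(q)):
        raise ValueError("the closed form requires strictly positive entries")
    ratios = [pi / qi for pi, qi in zip(p, q)]
    # log( max_i p_i/q_i  /  min_j p_j/q_j )
    return math.log(max(ratios) / min(ratios))


# Hypothetical two-bin histograms.
p = [0.5, 0.5]
q = [0.25, 0.75]
d_pq = hilbert_simplex_distance(p, q)
d_qp = hilbert_simplex_distance(q, p)
d_pp = hilbert_simplex_distance(p, p)
```

Because only coordinate ratios enter, the value is unchanged when either argument is rescaled by a positive constant, which is why no normalization check is performed here.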
Riemannian Dictionary Learning and Sparse Coding for Positive Definite Matrices
Data encoded as symmetric positive definite (SPD) matrices frequently arise
in many areas of computer vision and machine learning. While these matrices
form an open subset of the Euclidean space of symmetric matrices, viewing them
through the lens of non-Euclidean Riemannian geometry often turns out to be
better suited in capturing several desirable data properties. However,
formulating classical machine learning algorithms within such a geometry is
often non-trivial and computationally expensive. Inspired by the great success
of dictionary learning and sparse coding for vector-valued data, our goal in
this paper is to represent data in the form of SPD matrices as sparse conic
combinations of SPD atoms from a learned dictionary via a Riemannian geometric
approach. To that end, we formulate a novel Riemannian optimization objective
for dictionary learning and sparse coding in which the representation loss is
characterized via the affine invariant Riemannian metric. We also present a
computationally simple algorithm for optimizing our model. Experiments on
several computer vision datasets demonstrate superior classification and
retrieval performance using our approach when compared to sparse coding via
alternative non-Riemannian formulations.
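The affine invariant Riemannian metric mentioned above induces the distance d(A, B) = ||log(A^{-1/2} B A^{-1/2})||_F, which depends only on the eigenvalues of A^{-1} B. The following sketch evaluates it for 2x2 SPD matrices, where those eigenvalues follow from a trace and a determinant. Restricting to 2x2 nested lists, and the helper names, are assumptions made to keep the example dependency-free; it does not reproduce the paper's dictionary learning objective.

```python
import math


def airm_distance_2x2(A, B):
    """Affine-invariant distance between 2x2 SPD matrices [[a, b], [b, c]]."""
    (a11, a12), (_, a22) = A
    (b11, b12), (_, b22) = B
    det_a = a11 * a22 - a12 * a12
    det_b = b11 * b22 - b12 * b12
    if min(a11, b11, det_a, det_b) <= 0:
        raise ValueError("both matrices must be symmetric positive definite")
    # Trace and determinant of A^{-1} B give its two (positive) eigenvalues.
    trace = (a22 * b11 - 2.0 * a12 * b12 + a11 * b22) / det_a
    det = det_b / det_a
    disc = math.sqrt(max(0.0, trace * trace - 4.0 * det))
    lam_big = 0.5 * (trace + disc)
    lam_small = det / lam_big  # numerically safer than (trace - disc) / 2
    return math.sqrt(math.log(lam_big) ** 2 + math.log(lam_small) ** 2)


def congruence(G, S):
    """Return G S G^T for 2x2 nested lists."""
    GS = [[sum(G[i][k] * S[k][j] for k in range(2)) for j in range(2)]
          for i in range(2)]
    return [[sum(GS[i][k] * G[j][k] for k in range(2)) for j in range(2)]
            for i in range(2)]


# Hypothetical matrices; G is any invertible 2x2 matrix.
A = [[2.0, 1.0], [1.0, 2.0]]
B = [[3.0, 0.0], [0.0, 1.0]]
G = [[2.0, 1.0], [0.0, 1.0]]
d_original = airm_distance_2x2(A, B)
d_transformed = airm_distance_2x2(congruence(G, A), congruence(G, B))
```

The two distances agree up to rounding, which is the affine invariance that gives the metric its name.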
Dimensionality Reduction on SPD Manifolds: The Emergence of Geometry-Aware Methods
Representing images and videos with Symmetric Positive Definite (SPD)
matrices, and considering the Riemannian geometry of the resulting space, has
been shown to yield high discriminative power in many visual recognition tasks.
Unfortunately, computation on the Riemannian manifold of SPD matrices,
especially of high-dimensional ones, comes at a high cost that limits the
applicability of existing techniques. In this paper, we introduce algorithms
able to handle high-dimensional SPD matrices by constructing a
lower-dimensional SPD manifold. To this end, we propose to model the mapping
from the high-dimensional SPD manifold to the low-dimensional one with an
orthonormal projection. This lets us formulate dimensionality reduction as the
problem of finding a projection that yields a low-dimensional manifold either
with maximum discriminative power in the supervised scenario, or with maximum
variance of the data in the unsupervised one. We show that learning can be
expressed as an optimization problem on a Grassmann manifold and discuss fast
solutions for special cases. Our evaluation on several classification tasks
shows that our approach leads to a significant accuracy gain over
state-of-the-art methods.
Comment: arXiv admin note: text overlap with arXiv:1407.112
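To make the projection step concrete: given a matrix W with orthonormal columns, an n x n SPD matrix X is mapped to the smaller SPD matrix W^T X W. The sketch below only applies a fixed, hand-picked W; learning W on a Grassmann manifold, as the abstract describes, is not shown. The function names and the example matrices are assumptions.

```python
def project_spd(X, W):
    """Return W^T X W for X (n x n) and W (n x m), both as nested lists."""
    n, m = len(W), len(W[0])
    XW = [[sum(X[i][k] * W[k][j] for k in range(n)) for j in range(m)]
          for i in range(n)]
    return [[sum(W[k][i] * XW[k][j] for k in range(n)) for j in range(m)]
            for i in range(m)]


def has_orthonormal_columns(W, tol=1e-9):
    """Check that W^T W is the identity up to a tolerance."""
    n, m = len(W), len(W[0])
    for i in range(m):
        for j in range(m):
            gram = sum(W[k][i] * W[k][j] for k in range(n))
            if abs(gram - (1.0 if i == j else 0.0)) > tol:
                return False
    return True


# Hypothetical 3x3 SPD matrix reduced to 2x2.
r = 0.5 ** 0.5
X = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
W = [[r, 0.0],
     [r, 0.0],
     [0.0, 1.0]]
Y = project_spd(X, W)
```

Since W has full column rank, W^T X W stays positive definite whenever X is, so the reduced matrices remain on an SPD manifold.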
Learning Mid-level Words on Riemannian Manifold for Action Recognition
Human action recognition remains a challenging task due to the various
sources of video data and large intra-class variations. It thus becomes one of
the key issues in recent research to explore effective and robust
representation to handle such challenges. In this paper, we propose a novel
representation approach by constructing mid-level words in videos and encoding
them on a Riemannian manifold. Specifically, we first conduct a global alignment
on the densely extracted low-level features to build a bank of corresponding
feature groups, each of which can be statistically modeled as a mid-level word
lying on some specific Riemannian manifold. Based on these mid-level words, we
construct intrinsic Riemannian codebooks by employing K-Karcher-means
clustering and a Riemannian Gaussian Mixture Model, and consequently extend
three well-studied Euclidean encoding methods, i.e. Bag of Visual Words (BoVW),
Vector of Locally Aggregated Descriptors (VLAD), and Fisher Vector (FV), to
Riemannian manifolds to obtain the final action video
representations. Our method is evaluated in two tasks on four popular realistic
datasets: action recognition on the YouTube, UCF50, and HMDB51 databases, and action
similarity labeling on the ASLAN database. In all cases, the reported results
are very competitive with the most recent state-of-the-art
works.
Comment: 10 pages
Dictionary Learning and Sparse Coding on Statistical Manifolds
In this paper, we propose a novel information theoretic framework for
dictionary learning (DL) and sparse coding (SC) on a statistical manifold (the
manifold of probability distributions). Unlike the traditional DL and SC
framework, our new formulation does not explicitly incorporate any sparsity
inducing norm in the cost function being optimized but yet yields sparse codes.
Our algorithm approximates the data points on the statistical manifold (which
are probability distributions) by the weighted Kullback-Leibler center/mean
(KL-center) of the dictionary atoms. The KL-center is defined as the minimizer
of the maximum KL-divergence between itself and members of the set whose center
is being sought. Further, we prove that the weighted KL-center is a sparse
combination of the dictionary atoms. This result also holds for the case when
the KL-divergence is replaced by the well known Hellinger distance. From an
applications perspective, we present an extension of the aforementioned
framework to the manifold of symmetric positive definite matrices (which can be
identified with the manifold of zero-mean Gaussian distributions).
We present experiments involving a variety of dictionary-based
reconstruction and classification problems in Computer Vision. Performance of
the proposed algorithm is demonstrated by comparing it to several
state-of-the-art methods in terms of reconstruction and classification accuracy
as well as sparsity of the chosen representation.
Comment: arXiv admin note: substantial text overlap with arXiv:1604.0693
Image segmentation with superpixel-based covariance descriptors in low-rank representation
This paper investigates the problem of image segmentation using superpixels.
We propose two approaches to enhance the discriminative ability of the
superpixel's covariance descriptors. In the first one, we employ the
Log-Euclidean distance as the metric on the covariance manifolds, and then use
the RBF kernel to measure the similarities between covariance descriptors. The
second method is focused on extracting the subspace structure of the set of
covariance descriptors by extending a low-rank representation algorithm to
the covariance manifolds. Experiments are carried out on the Berkeley
Segmentation Dataset; compared with state-of-the-art segmentation
algorithms, both methods are competitive.
Comment: 7 pages, 2 figures, 1 table
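The first approach above combines the Log-Euclidean distance, d(X, Y) = ||log X - log Y||_F, with an RBF kernel. A minimal sketch for 2x2 covariance matrices follows; it uses the closed-form matrix logarithm of a symmetric 2x2 matrix so that no linear algebra library is needed. The 2x2 restriction, the function names, and the `gamma` value are assumptions for illustration only.

```python
import math


def spd_log_2x2(S):
    """Matrix logarithm of a 2x2 SPD matrix [[a, b], [b, c]]."""
    (a, b), (_, c) = S
    mean = 0.5 * (a + c)
    radius = math.hypot(0.5 * (a - c), b)
    lam1, lam2 = mean + radius, mean - radius  # eigenvalues, lam1 >= lam2
    if lam2 <= 0:
        raise ValueError("matrix must be symmetric positive definite")
    if radius < 1e-12:  # (numerically) a multiple of the identity
        value = math.log(mean)
        return [[value, 0.0], [0.0, value]]
    # log(S) = log(lam2) I + slope * (S - lam2 I), exact for 2x2 symmetric S.
    slope = (math.log(lam1) - math.log(lam2)) / (lam1 - lam2)
    base = math.log(lam2)
    return [[base + slope * (a - lam2), slope * b],
            [slope * b, base + slope * (c - lam2)]]


def log_euclidean_distance(X, Y):
    LX, LY = spd_log_2x2(X), spd_log_2x2(Y)
    return math.sqrt(sum((LX[i][j] - LY[i][j]) ** 2
                         for i in range(2) for j in range(2)))


def log_euclidean_rbf(X, Y, gamma=1.0):
    """RBF similarity exp(-gamma * d^2) with the Log-Euclidean distance d."""
    return math.exp(-gamma * log_euclidean_distance(X, Y) ** 2)


# Hypothetical descriptors: the identity and a matrix stretched along one axis.
identity = [[1.0, 0.0], [0.0, 1.0]]
stretched = [[math.e, 0.0], [0.0, 1.0]]
distance = log_euclidean_distance(identity, stretched)
similarity = log_euclidean_rbf(identity, stretched, gamma=0.5)
```

A kernel matrix built this way over all superpixel descriptors could then feed any kernel-based segmentation or clustering step.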
A Novel Space-Time Representation on the Positive Semidefinite Cone for Facial Expression Recognition
In this paper, we study the problem of facial expression recognition using a
novel space-time geometric representation. We describe the temporal evolution
of facial landmarks as parametrized trajectories on the Riemannian manifold of
positive semidefinite matrices of fixed rank. Our representation has the
advantage of naturally bringing in a second desirable quantity when comparing
shapes, the spatial covariance, in addition to the conventional affine-shape
representation. We then derive geometric and computational tools for
rate-invariant analysis and adaptive re-sampling of trajectories, grounded in
the Riemannian geometry of the manifold. Specifically, our approach involves
three steps: 1) facial landmarks are first mapped into the Riemannian manifold
of positive semidefinite matrices of rank 2, to build time-parameterized
trajectories; 2) a temporal alignment is performed on the trajectories,
providing a geometry-aware (dis-)similarity measure between them; 3) finally,
a pairwise proximity function SVM (ppfSVM) is used to classify them,
incorporating the latter (dis-)similarity measure into the kernel function. We
show the effectiveness of the proposed approach on four publicly available
benchmarks (CK+, MMI, Oulu-CASIA, and AFEW). The results of the proposed
approach are comparable to or better than the state-of-the-art methods when
involving only facial landmarks.
Comment: To appear at ICCV 201
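One common way to obtain a rank-2 positive semidefinite matrix from n planar landmarks is the Gram matrix Z Z^T of the centered n x 2 configuration Z. The sketch below assumes this is the mapping meant in step 1; the abstract does not spell it out, and the function names and landmark coordinates are hypothetical.

```python
import math


def centered_gram(landmarks):
    """Gram matrix Z Z^T of a centered list of (x, y) landmarks."""
    n = len(landmarks)
    cx = sum(x for x, _ in landmarks) / n
    cy = sum(y for _, y in landmarks) / n
    Z = [(x - cx, y - cy) for x, y in landmarks]
    return [[zi[0] * zj[0] + zi[1] * zj[1] for zj in Z] for zi in Z]


def rotate(landmarks, angle):
    """Rotate every landmark about the origin by the given angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y) for x, y in landmarks]


# Hypothetical landmark configuration for a single video frame.
landmarks = [(0.0, 0.0), (2.0, 0.0), (1.0, 3.0), (1.0, 1.0)]
gram = centered_gram(landmarks)
gram_rotated = centered_gram(rotate(landmarks, 0.7))
```

The Gram matrix is unchanged by rotating or translating the face, and its rank is at most 2 because Z has two columns; a video then becomes a time-indexed sequence of such matrices.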
Graph Quantization
Vector quantization (VQ) is a lossy data compression technique from signal
processing, which is restricted to feature vectors and therefore inapplicable
to combinatorial structures. This contribution presents a theoretical
foundation of graph quantization (GQ) that extends VQ to the domain of
attributed graphs. We present the necessary Lloyd-Max conditions for optimality
of a graph quantizer and consistency results for optimal GQ design based on
empirical distortion measures and stochastic optimization. These results
statistically justify existing clustering algorithms in the domain of graphs.
The proposed approach provides a template of how to link structural pattern
recognition methods other than GQ to statistical pattern recognition.
Comment: 24 pages; submitted to CVI
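For readers unfamiliar with the Lloyd-Max conditions, the sketch below shows them in the classical scalar setting that graph quantization generalizes: each point is assigned to its nearest codeword (nearest-neighbor condition), and each codeword is replaced by the mean of its cell (centroid condition). It does not handle graphs; the function name and the data are assumptions for illustration.

```python
def lloyd(points, codebook, iterations=20):
    """Alternate the two Lloyd-Max conditions on scalar data."""
    codebook = list(codebook)
    for _ in range(iterations):
        # Nearest-neighbor condition: assign each point to its closest codeword.
        cells = [[] for _ in codebook]
        for x in points:
            nearest = min(range(len(codebook)),
                          key=lambda i: (x - codebook[i]) ** 2)
            cells[nearest].append(x)
        # Centroid condition: move each codeword to the mean of its cell;
        # an empty cell keeps its previous codeword.
        codebook = [sum(cell) / len(cell) if cell else codebook[i]
                    for i, cell in enumerate(cells)]
    return codebook


# Hypothetical data with two obvious groups and a poor initial codebook.
points = [0.0, 1.0, 10.0, 11.0]
codebook = lloyd(points, [0.0, 1.0])
```

In the graph setting described in the abstract, the squared difference would be replaced by a graph distortion measure and the mean by a suitable notion of graph centroid.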
Deep manifold-to-manifold transforming network for action recognition
Symmetric positive definite (SPD) matrices (e.g., covariances, graph
Laplacians, etc.) are widely used to model relationships in the spatial or
temporal domain. Nevertheless, SPD matrices are theoretically embedded on
Riemannian manifolds. In this paper, we propose an end-to-end deep
manifold-to-manifold transforming network (DMT-Net) which can make SPD matrices
flow from one Riemannian manifold to another more discriminative one. To learn
discriminative SPD features characterizing both spatial and temporal
dependencies, we specifically develop three novel layers on manifolds: (i) the
local SPD convolutional layer, (ii) the non-linear SPD activation layer, and
(iii) the Riemannian-preserved recursive layer. The SPD property is preserved
through all layers without any requirement of singular value decomposition
(SVD), which is often used in the existing methods with expensive computation
cost. Furthermore, a diagonalizing SPD layer is designed to efficiently
calculate the final metric for the classification task. To evaluate our
proposed method, we conduct extensive experiments on the task of action
recognition, where input signals are popularly modeled as SPD matrices. The
experimental results demonstrate that our DMT-Net is much more competitive than
state-of-the-art
- …