Covariance integral invariants of embedded Riemannian manifolds for manifold learning
This thesis develops an effective theoretical foundation for the integral invariant approach to studying submanifold geometry via the statistics of the underlying point set, i.e., manifold learning from covariance analysis. We perform principal component analysis over a domain determined by the intersection of an embedded Riemannian manifold with spheres or cylinders of varying scale in the ambient space, in order to generalize to arbitrary dimension the relationship between curvature and the eigenvalue decomposition of covariance matrices. In the case of regular curves in general dimension, the covariance eigenvectors converge to the Frenet-Serret frame, and the ratios of the corresponding eigenvalues asymptotically determine the generalized curvatures completely, up to a constant that we determine by proving a recursion relation for a certain sequence of Hankel determinants. For hypersurfaces, the eigenvalue decomposition has a series expansion given in terms of the dimension and the principal curvatures, and the eigenvectors converge to the Darboux frame of principal and normal directions. In the most general case of embedded Riemannian manifolds, the eigenvalues and limit eigenvectors of the covariance matrices have asymptotic behavior given in terms of the curvature information encoded by the third fundamental form of the manifold, a classical tensor that we generalize to arbitrary dimension and that is related to the Weingarten map and the Ricci operator. These results provide descriptors at scale for the principal curvatures and, in turn, for the second fundamental form and the Riemann curvature tensor of a submanifold, which can serve to perform multi-scale geometry processing and manifold learning, taking advantage of the integral invariant viewpoint when only a discrete sample of points is available.
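The basic recipe can be illustrated with a short, self-contained sketch. The snippet below is not the thesis's estimator: it simply performs PCA over the intersection of a densely sampled plane curve with balls of shrinking radius and prints the covariance eigenvalue ratios, whose scale dependence is what the curvature descriptors above are built from. The function name `local_covariance_spectrum` and the circle test case are illustrative choices.

```python
# Minimal sketch: local PCA of a curve's point sample inside balls of
# varying radius r (not the thesis's exact estimator).
import numpy as np

def local_covariance_spectrum(points, center, r):
    """Eigen-decomposition of the covariance of the points within
    distance r of `center` (PCA over the ball intersection)."""
    mask = np.linalg.norm(points - center, axis=1) < r
    local = points[mask]
    local = local - local.mean(axis=0)          # center the patch
    cov = local.T @ local / len(local)          # covariance matrix
    evals, evecs = np.linalg.eigh(cov)          # ascending eigenvalues
    return evals[::-1], evecs[:, ::-1]          # descending order

# Sample a circle of radius R, so the true curvature is 1/R.
R = 2.0
t = np.linspace(0.0, 2.0 * np.pi, 20000, endpoint=False)
curve = np.stack([R * np.cos(t), R * np.sin(t)], axis=1)
p = curve[0]

for r in [0.5, 0.25, 0.125]:
    evals, _ = local_covariance_spectrum(curve, p, r)
    # The tangent eigenvalue dominates; the ratio of the normal to the
    # tangent eigenvalue shrinks with scale at a rate governed by the
    # curvature, which makes these ratios multi-scale curvature
    # descriptors of the kind the thesis analyzes in general dimension.
    print(f"r={r:5.3f}  eigenvalues={evals}  ratio={evals[1]/evals[0]:.2e}")
```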
An Infinitesimal Probabilistic Model for Principal Component Analysis of Manifold Valued Data
We provide a probabilistic and infinitesimal view of how the principal component analysis (PCA) procedure can be generalized to the analysis of nonlinear manifold-valued data. Starting with the probabilistic PCA interpretation of the Euclidean PCA procedure, we show how PCA can be generalized to manifolds in an intrinsic way that does not resort to linearization of the data space. The underlying probability model is constructed by mapping a Euclidean stochastic process to the manifold using stochastic development of Euclidean semimartingales. The construction uses a connection and bundles of covariant tensors to allow global transport of principal eigenvectors, and the model is thereby an example of how principal fiber bundles can be used to handle the lack of a global coordinate system and orientation that characterizes manifold-valued statistics. We show how curvature implies non-integrability of the equivalent of Euclidean principal subspaces, and how the stochastic flows provide an alternative to explicit construction of such subspaces. We describe estimation procedures for inference of parameters and prediction of principal components, and we give examples of properties of the model on embedded surfaces.
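As a point of reference for the flat-space starting point mentioned above, here is a minimal sketch of Euclidean probabilistic PCA using the Tipping-Bishop closed-form maximum-likelihood solution; the manifold generalization via stochastic development is the paper's contribution and is not reproduced here. The helper `ppca_mle` and the toy data are illustrative.

```python
# Hedged sketch of Euclidean probabilistic PCA: the latent-variable
# model x = W z + mu + eps, z ~ N(0, I_q), eps ~ N(0, s2 * I_d),
# fitted with the closed-form maximum-likelihood solution.
import numpy as np

def ppca_mle(X, q):
    """Return ML estimates of mu, W, and the noise variance s2."""
    n, d = X.shape
    mu = X.mean(axis=0)
    S = np.cov(X - mu, rowvar=False)            # sample covariance
    evals, evecs = np.linalg.eigh(S)            # ascending order
    evals, evecs = evals[::-1], evecs[:, ::-1]  # descending order
    s2 = evals[q:].mean()                       # discarded variance -> noise
    W = evecs[:, :q] * np.sqrt(np.maximum(evals[:q] - s2, 0.0))
    return mu, W, s2

# Toy data: a noisy 2-D subspace embedded in 5 dimensions.
rng = np.random.default_rng(0)
Z = rng.normal(size=(500, 2))
A = rng.normal(size=(2, 5))
X = Z @ A + 0.1 * rng.normal(size=(500, 5))
mu, W, s2 = ppca_mle(X, q=2)
print("estimated noise variance:", s2)   # should be near 0.1**2 = 0.01
```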
Non-Asymptotic Analysis of Tangent Space Perturbation
Constructing an efficient parameterization of a large, noisy data set of
points lying close to a smooth manifold in high dimension remains a fundamental
problem. One approach consists in recovering a local parameterization using the
local tangent plane. Principal component analysis (PCA) is often the tool of
choice, as it returns an optimal basis in the case of noise-free samples from a
linear subspace. To process noisy data samples from a nonlinear manifold, PCA
must be applied locally, at a scale small enough such that the manifold is
approximately linear, but at a scale large enough such that structure may be
discerned from noise. Using eigenspace perturbation theory and non-asymptotic
random matrix theory, we study the stability of the subspace estimated by PCA
as a function of scale, and bound (with high probability) the angle it forms
with the true tangent space. By adaptively selecting the scale that minimizes
this bound, our analysis reveals an appropriate scale for local tangent plane
recovery. We also introduce a geometric uncertainty principle quantifying the
limits of noise-curvature perturbation for stable recovery. With the purpose of
providing perturbation bounds that can be used in practice, we propose plug-in
estimates that make it possible to directly apply the theoretical results to
real data sets.
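The noise-versus-curvature tradeoff that drives the adaptive scale selection can be visualized on synthetic data. The sketch below sweeps the PCA neighborhood radius on noisy samples of a paraboloid and reports the largest principal angle between the estimated and true tangent planes; the paper's explicit probabilistic bound and plug-in estimates are not reproduced, and the paraboloid, noise level, and radii are illustrative assumptions.

```python
# Illustration of the U-shaped tangent-estimation error: too small a
# scale is noise-dominated, too large a scale is curvature-dominated.
import numpy as np

rng = np.random.default_rng(1)

# Noisy samples of z = (x^2 + y^2) / 2 near the origin; the true
# tangent plane at the origin is the xy-plane, spanned by e1 and e2.
xy = rng.uniform(-1.0, 1.0, size=(50000, 2))
z = 0.5 * (xy[:, 0] ** 2 + xy[:, 1] ** 2)
pts = np.column_stack([xy, z]) + 0.01 * rng.normal(size=(50000, 3))
T_true = np.eye(3)[:, :2]                      # true tangent basis

def tangent_angle(r):
    """Largest principal angle (radians) between the local PCA plane
    at scale r and the true tangent plane."""
    local = pts[np.linalg.norm(pts, axis=1) < r]
    local = local - local.mean(axis=0)
    _, evecs = np.linalg.eigh(local.T @ local)
    T_hat = evecs[:, -2:]                      # top-2 eigenvectors
    s = np.linalg.svd(T_true.T @ T_hat, compute_uv=False)
    return np.arccos(np.clip(s.min(), -1.0, 1.0))

for r in [0.02, 0.05, 0.1, 0.2, 0.4, 0.8]:
    print(f"r={r:4.2f}  angle={np.degrees(tangent_angle(r)):6.3f} deg")
```

Sweeping radii this way and picking the minimizer is the empirical analogue of minimizing the paper's bound; the paper's contribution is to make that choice from the bound itself rather than from a known ground-truth tangent.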
Simplicial Nonlinear Principal Component Analysis
We present a new manifold learning algorithm that takes as input a set of data points lying on or near a lower-dimensional manifold, possibly with noise, and outputs a simplicial complex that fits the data and the manifold. We have
implemented the algorithm in the case where the input data can be triangulated.
We provide triangulations of data sets that fall on the surface of a torus, a sphere, a Swiss roll, and a creased sheet embedded in a fifty-dimensional space. We also discuss the theoretical justification of our algorithm.
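As rough intuition for what "a simplicial complex that fits the data" looks like, the following baseline (not the authors' algorithm) triangulates points sampled on a sphere via the convex hull, whose facets form exactly such a complex for convex surfaces; non-convex cases like the Swiss roll require the kind of machinery the paper develops.

```python
# Baseline sketch: a convex hull of points on a sphere is already a
# simplicial complex (triangle mesh) fitting the sampled surface.
import numpy as np
from scipy.spatial import ConvexHull

rng = np.random.default_rng(2)

# Uniform sample on the unit sphere via normalized Gaussians.
pts = rng.normal(size=(1000, 3))
pts /= np.linalg.norm(pts, axis=1, keepdims=True)

hull = ConvexHull(pts)
triangles = hull.simplices          # (m, 3) vertex indices per triangle
print(f"{len(pts)} vertices, {len(triangles)} triangles")

# Sanity check: a triangulated sphere satisfies V - E + F = 2.
edges = {tuple(sorted(e)) for t in triangles
         for e in [(t[0], t[1]), (t[1], t[2]), (t[0], t[2])]}
print("Euler characteristic:", len(pts) - len(edges) + len(triangles))
```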