
    Covariance integral invariants of embedded Riemannian manifolds for manifold learning

    This thesis develops an effective theoretical foundation for the integral invariant approach to study submanifold geometry via the statistics of the underlying point-set, i.e., Manifold Learning from covariance analysis. We perform Principal Component Analysis over a domain determined by the intersection of an embedded Riemannian manifold with spheres or cylinders of varying scale in ambient space, in order to generalize to arbitrary dimension the relationship between curvature and the eigenvalue decomposition of covariance matrices. In the case of regular curves in general dimension, the covariance eigenvectors converge to the Frenet-Serret frame and the corresponding eigenvalues have ratios that asymptotically determine the generalized curvatures completely, up to a constant that we determine by proving a recursion relation for a certain sequence of Hankel determinants. For hypersurfaces, the eigenvalue decomposition has series expansion given in terms of the dimension and the principal curvatures, where the eigenvectors converge to the Darboux frame of principal and normal directions. In the most general case of embedded Riemannian manifolds, the eigenvalues and limit eigenvectors of the covariance matrices are found to have asymptotic behavior given in terms of the curvature information encoded by the third fundamental form of the manifold, a classical tensor that we generalize to arbitrary dimension, and which is related to the Weingarten map and Ricci operator. These results provide descriptors at scale for the principal curvatures and, in turn, for the second fundamental form and the Riemann curvature tensor of a submanifold, which can serve to perform multi-scale Geometry Processing and Manifold Learning, making use of the advantages of the integral invariant viewpoint when only a discrete sample of points is available.
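    As a minimal numerical illustration of the sphere-intersection construction described above (a sketch, not code from the thesis), one can run PCA on the part of a planar circle of curvature kappa = 1/R that lies inside an ambient ball of radius r; all names and constants below are illustrative.

```python
import numpy as np

# Local PCA over the intersection of a curve with an ambient ball:
# a planar circle of radius R, so curvature kappa = 1/R. Parameters
# are illustrative only; the exact constant in the eigenvalue ratio
# is the thesis's subject and is not asserted here.
R, r = 2.0, 0.3
t = np.linspace(-np.pi, np.pi, 200_000)
pts = np.column_stack([R * np.cos(t), R * np.sin(t)])
base = np.array([R, 0.0])

# keep only the curve points inside the ball of radius r about base
local = pts[np.linalg.norm(pts - base, axis=1) < r]

cov = np.cov(local, rowvar=False)      # covariance of the local point set
evals, evecs = np.linalg.eigh(cov)     # eigenvalues in ascending order
print("eigenvalues (descending):", evals[::-1])
print("ratio small/large:", evals[0] / evals[1])
# As r -> 0 this ratio scales like (kappa * r)^2 up to a constant,
# so it acts as a curvature descriptor at scale r.
```

    Repeating the computation over a range of r yields the kind of multi-scale descriptors the abstract refers to.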

    An Infinitesimal Probabilistic Model for Principal Component Analysis of Manifold Valued Data

    We provide a probabilistic and infinitesimal view of how the principal component analysis procedure (PCA) can be generalized to the analysis of nonlinear manifold-valued data. Starting with the probabilistic PCA interpretation of the Euclidean PCA procedure, we show how PCA can be generalized to manifolds in an intrinsic way that does not resort to linearization of the data space. The underlying probability model is constructed by mapping a Euclidean stochastic process to the manifold using stochastic development of Euclidean semimartingales. The construction uses a connection and bundles of covariant tensors to allow global transport of principal eigenvectors, and the model is thereby an example of how principal fiber bundles can be used to handle the lack of a global coordinate system and orientation that characterizes manifold-valued statistics. We show how curvature implies non-integrability of the equivalent of Euclidean principal subspaces, and how the stochastic flows provide an alternative to explicit construction of such subspaces. We describe estimation procedures for inference of parameters and prediction of principal components, and we give examples of properties of the model on embedded surfaces.
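    For orientation only, the Euclidean probabilistic PCA model that this construction starts from has a closed-form maximum-likelihood fit (Tipping and Bishop); the sketch below shows that Euclidean baseline with illustrative dimensions, since the manifold version built from stochastic development does not reduce to a few lines.

```python
import numpy as np

# Minimal Euclidean probabilistic PCA (the starting point the abstract
# generalizes to manifolds). Dimensions and data are illustrative.
rng = np.random.default_rng(0)
n, d, q = 500, 5, 2                   # samples, ambient dim, latent dim
W_true = rng.normal(size=(d, q))
X = rng.normal(size=(n, q)) @ W_true.T + 0.1 * rng.normal(size=(n, d))

S = np.cov(X, rowvar=False)           # sample covariance
evals, evecs = np.linalg.eigh(S)
evals, evecs = evals[::-1], evecs[:, ::-1]     # descending order

sigma2 = evals[q:].mean()             # ML noise variance: mean of the
                                      # discarded eigenvalues
W_ml = evecs[:, :q] @ np.sqrt(np.diag(evals[:q] - sigma2))
print("ML noise variance:", sigma2)
print("principal subspace basis:\n", evecs[:, :q])
```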

    Non-Asymptotic Analysis of Tangent Space Perturbation

    Constructing an efficient parameterization of a large, noisy data set of points lying close to a smooth manifold in high dimension remains a fundamental problem. One approach is to recover a local parameterization using the local tangent plane. Principal component analysis (PCA) is often the tool of choice, as it returns an optimal basis in the case of noise-free samples from a linear subspace. To process noisy data samples from a nonlinear manifold, PCA must be applied locally, at a scale small enough that the manifold is approximately linear, but large enough that structure may be discerned from noise. Using eigenspace perturbation theory and non-asymptotic random matrix theory, we study the stability of the subspace estimated by PCA as a function of scale, and bound (with high probability) the angle it forms with the true tangent space. By adaptively selecting the scale that minimizes this bound, our analysis reveals an appropriate scale for local tangent plane recovery. We also introduce a geometric uncertainty principle quantifying the limits of noise-curvature perturbation for stable recovery. With the purpose of providing perturbation bounds that can be used in practice, we propose plug-in estimates that make it possible to directly apply the theoretical results to real data sets.
    Comment: 53 pages. Revised manuscript with new content addressing application of the results to real data sets.
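    A hedged sketch of the scale trade-off this analysis quantifies: on a noisy circle, sweep the PCA neighborhood radius and record the angle between the estimated and the true tangent line. The radii, noise level, and sample size below are illustrative, and the loop brute-forces the scale selection that the paper's bound performs analytically.

```python
import numpy as np

# Scale-dependent tangent estimation on a noisy circle: too small a
# radius and noise dominates; too large and curvature bias dominates.
rng = np.random.default_rng(1)
t = rng.uniform(-np.pi, np.pi, 50_000)
noise = 0.01 * rng.normal(size=(t.size, 2))
pts = np.column_stack([np.cos(t), np.sin(t)]) + noise
base, true_tangent = np.array([1.0, 0.0]), np.array([0.0, 1.0])

for r in [0.02, 0.05, 0.1, 0.2, 0.5]:
    local = pts[np.linalg.norm(pts - base, axis=1) < r]
    _, evecs = np.linalg.eigh(np.cov(local, rowvar=False))
    est = evecs[:, -1]                 # top eigenvector = tangent estimate
    angle = np.arccos(min(1.0, abs(est @ true_tangent)))
    print(f"r={r:4.2f}  n={local.shape[0]:6d}  angle={angle:.4f} rad")
# The intermediate radius minimizing the angle plays the role of the
# scale the paper selects by minimizing its perturbation bound.
```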

    Simplicial Nonlinear Principal Component Analysis

    We present a new manifold learning algorithm that takes a set of data points lying on or near a lower-dimensional manifold as input, possibly with noise, and outputs a simplicial complex that fits the data and the manifold. We have implemented the algorithm in the case where the input data can be triangulated. We provide triangulations of data sets that fall on the surface of a torus, sphere, swiss roll, and creased sheet embedded in a fifty-dimensional space. We also discuss the theoretical justification of our algorithm.
    Comment: 21 pages, 6 figures.
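    As a rough stand-in (not the paper's algorithm), the sketch below triangulates points sampled near a flat sheet embedded in a fifty-dimensional space by projecting onto the top two principal components and Delaunay-triangulating the resulting chart; the paper's method must handle curved data such as the torus and sphere, where a single global PCA chart fails.

```python
import numpy as np
from scipy.spatial import Delaunay

# Illustrative stand-in: fit a 2-D simplicial complex to noisy points
# near a flat sheet in R^50 via one global PCA chart plus Delaunay.
rng = np.random.default_rng(2)
uv = rng.uniform(0, 1, size=(400, 2))        # intrinsic coordinates
embed = rng.normal(size=(2, 50))             # random linear embedding
pts = uv @ embed + 0.01 * rng.normal(size=(400, 50))

centered = pts - pts.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
coords2d = centered @ vt[:2].T               # 2-D PCA chart of the data

tri = Delaunay(coords2d)                     # simplicial complex on chart
print("vertices:", pts.shape[0], "triangles:", tri.simplices.shape[0])
```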