
    Shape Dimension and Intrinsic Metric from Samples of Manifolds

    We introduce the adaptive neighborhood graph as a data structure for modeling a smooth manifold M embedded in some Euclidean space R^d. We assume that M is known to us only through a finite sample P \subset M, as is often the case in applications. The adaptive neighborhood graph is a geometric graph on P. Its complexity is at most \min\{2^{O(k)} n, n^2\}, where n = |P| and k = dim M, as opposed to the n^{\lceil d/2 \rceil} complexity of the Delaunay triangulation, which is often used to model manifolds. We prove that we can correctly infer the connected components and the dimension of M from the adaptive neighborhood graph provided a certain standard sampling condition is fulfilled. The running time of the dimension detection algorithm is d 2^{O(k^7 \log k)} for each connected component of M. If the dimension is considered constant, this is a constant-time operation, and the adaptive neighborhood graph is of linear size. Moreover, the exponential dependence of the constants is only on the intrinsic dimension k, not on the ambient dimension d. This is of particular interest if the co-dimension is high, i.e., if k is much smaller than d, as is the case in many applications. The adaptive neighborhood graph also allows us to approximate the geodesic distances between the points in P.
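
    The construction above is a geometric graph on the sample P whose shortest paths approximate geodesic distances on M. As a rough illustration of that last point only, here is a minimal sketch using a plain k-nearest-neighbor graph rather than the paper's adaptive construction; the parameter n_neighbors and the circle example are assumptions made purely for demonstration.

```python
# Minimal sketch (not the paper's adaptive neighborhood graph): approximate
# geodesic distances on a sampled manifold with a k-nearest-neighbor graph.
import numpy as np
from scipy.sparse.csgraph import shortest_path
from sklearn.neighbors import kneighbors_graph

def approx_geodesic_distances(P, n_neighbors=8):
    """P: (n, d) array of points sampled from the manifold M."""
    # Geometric graph on P: edges to the n_neighbors nearest samples,
    # weighted by Euclidean distance.
    G = kneighbors_graph(P, n_neighbors, mode="distance")
    G = G.maximum(G.T)  # symmetrize so the graph is undirected
    # Shortest-path lengths in the graph approximate geodesic distances on M.
    return shortest_path(G, method="D", directed=False)

# Illustrative example: points sampled from a circle embedded in R^2.
theta = np.linspace(0, 2 * np.pi, 200, endpoint=False)
P = np.c_[np.cos(theta), np.sin(theta)]
D = approx_geodesic_distances(P)
print(D[0, 100])  # roughly pi, the intrinsic distance between antipodal samples
```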

    Empirical geodesic graphs and CAT(k) metrics for data analysis

    A methodology is developed for data analysis based on empirically constructed geodesic metric spaces. For a probability distribution, the length along a path between two points can be defined as the amount of probability mass accumulated along the path. The geodesic, then, is the shortest such path and defines a geodesic metric. Such metrics are transformed in a number of ways to produce parametrised families of geodesic metric spaces, empirical versions of which allow computation of intrinsic means and associated measures of dispersion. These reveal properties of the data, based on geometry, such as those that are difficult to see from the raw Euclidean distances. Examples of application include clustering and classification. For certain parameter ranges, the spaces become CAT(0) spaces and the intrinsic means are unique. In one case, a minimal spanning tree of a graph based on the data becomes CAT(0). In another, a so-called "metric cone" construction allows extension to CAT(k) spaces. It is shown how to empirically tune the parameters of the metrics, making it possible to apply them to a number of real cases. Comment: Statistics and Computing, 201
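
    A bare-bones way to see the "probability mass along a path" idea is to reweight edge lengths in a neighborhood graph by an estimated density before taking shortest paths. The sketch below does exactly that; the kNN graph, kernel bandwidth, and exponent alpha are illustrative assumptions, and the construction is a stand-in for the general idea, not the paper's family of metrics or its CAT(k) transformations.

```python
# Rough sketch of an empirical density-weighted geodesic metric on a data set.
# The kNN graph, bandwidth, and alpha below are assumptions for illustration.
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import shortest_path
from sklearn.neighbors import KernelDensity, kneighbors_graph

def empirical_geodesic_metric(X, n_neighbors=10, alpha=1.0, bandwidth=0.5):
    # Kernel density estimate at each data point.
    dens = np.exp(KernelDensity(bandwidth=bandwidth).fit(X).score_samples(X))
    G = kneighbors_graph(X, n_neighbors, mode="distance").tocoo()
    # Reweight each edge: Euclidean length scaled by the average endpoint
    # density, so a path "accumulates mass" rather than raw length.
    w = G.data * ((dens[G.row] + dens[G.col]) / 2.0) ** alpha
    W = csr_matrix((w, (G.row, G.col)), shape=G.shape)
    W = W.maximum(W.T)
    return shortest_path(W, directed=False)

# Illustrative example on synthetic 2-d data.
X = np.random.default_rng(0).standard_normal((300, 2))
D = empirical_geodesic_metric(X)
```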

    Principal Component Analysis for Functional Data on Riemannian Manifolds and Spheres

    Functional data analysis on nonlinear manifolds has drawn recent interest. Sphere-valued functional data, which are encountered for example as movement trajectories on the surface of the earth, are an important special case. We consider an intrinsic principal component analysis for smooth Riemannian manifold-valued functional data and study its asymptotic properties. Riemannian functional principal component analysis (RFPCA) is carried out by first mapping the manifold-valued data through Riemannian logarithm maps to tangent spaces around the time-varying Fr\'echet mean function, and then performing a classical multivariate functional principal component analysis on the linear tangent spaces. Representations of the Riemannian manifold-valued functions and the eigenfunctions on the original manifold are then obtained with exponential maps. The tangent-space approximation through functional principal component analysis is shown to be well-behaved in terms of controlling the residual variation if the Riemannian manifold has nonnegative curvature. Specifically, we derive a central limit theorem for the mean function, as well as root-n uniform convergence rates for other model components, including the covariance function, eigenfunctions, and functional principal component scores. Our applications include a novel framework for the analysis of longitudinal compositional data, achieved by mapping longitudinal compositional data to trajectories on the sphere, illustrated with longitudinal fruit fly behavior patterns. In applications and simulations, RFPCA is shown to be superior in terms of trajectory recovery compared to an unrestricted functional principal component analysis, and is also found to produce principal component scores that are better predictors for classification than traditional functional principal component scores.
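
    The pipeline described above (Riemannian log maps into tangent spaces at the Fréchet mean, classical PCA there, exponential maps back) can be illustrated in its simplest non-functional form: individual points on the unit sphere rather than time-varying trajectories. The sketch below is that simplification; the Fréchet-mean iteration and its iteration count are assumptions, and none of the functional or asymptotic machinery of RFPCA is reproduced.

```python
# Minimal sketch of tangent-space PCA on the unit sphere S^2 (points, not
# trajectories); a drastic simplification of the RFPCA idea, not the method.
import numpy as np

def log_map(p, q):
    """Riemannian log map on the sphere: tangent vector at p pointing to q."""
    cos_t = np.clip(q @ p, -1.0, 1.0)
    theta = np.arccos(cos_t)
    u = q - cos_t * p
    n = np.linalg.norm(u)
    return np.zeros_like(p) if n < 1e-12 else theta * u / n

def exp_map(p, v):
    """Riemannian exponential map on the sphere."""
    t = np.linalg.norm(v)
    return p if t < 1e-12 else np.cos(t) * p + np.sin(t) * v / t

def frechet_mean(X, n_iter=50):
    """Iterative (gradient-style) Frechet mean; the iteration count is assumed."""
    mu = X.mean(axis=0)
    mu /= np.linalg.norm(mu)
    for _ in range(n_iter):
        mu = exp_map(mu, np.mean([log_map(mu, x) for x in X], axis=0))
    return mu

def sphere_pca(X, n_components=2):
    """Log-map the data to the tangent space at the Frechet mean, then PCA."""
    mu = frechet_mean(X)
    V = np.array([log_map(mu, x) for x in X])            # tangent-space coordinates
    _, _, Vt = np.linalg.svd(V - V.mean(axis=0), full_matrices=False)
    return mu, Vt[:n_components], V @ Vt[:n_components].T  # mean, directions, scores

# Illustrative example: noisy points near a great-circle arc on S^2.
rng = np.random.default_rng(1)
t = rng.uniform(-0.8, 0.8, size=100)
X = np.c_[np.cos(t), np.sin(t), 0.05 * rng.standard_normal(100)]
X /= np.linalg.norm(X, axis=1, keepdims=True)
mu, directions, scores = sphere_pca(X)
```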

    Principal arc analysis on direct product manifolds

    We propose a new approach to analyze data that naturally lie on manifolds. We focus on a special class of manifolds, called direct product manifolds, whose intrinsic dimension could be very high. Our method finds a low-dimensional representation of the manifold that can be used to find and visualize the principal modes of variation of the data, as Principal Component Analysis (PCA) does in linear spaces. The proposed method improves upon earlier manifold extensions of PCA by more concisely capturing important nonlinear modes. For the special case of data on a sphere, variation following nongeodesic arcs is captured in a single mode, compared to the two modes needed by previous methods. Several computational and statistical challenges are resolved. The development on spheres forms the basis of principal arc analysis on more complicated manifolds. The benefits of the method are illustrated by a data example using medial representations in image analysis. Comment: Published at http://dx.doi.org/10.1214/10-AOAS370 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)
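
    The key geometric point, that a single non-geodesic arc (a small circle on the sphere) can capture variation that would otherwise require two geodesic modes, can be illustrated with a least-squares fit of a small circle's axis and radius. The sketch below is such a fit, with hand-picked optimizer starting points; it is not the paper's principal arc procedure on general direct product manifolds.

```python
# Illustrative least-squares fit of a small circle on S^2; a stand-in for the
# idea behind principal arcs, not the paper's algorithm.
import numpy as np
from scipy.optimize import minimize

def fit_small_circle(X):
    """X: (n, 3) unit vectors. Returns the circle axis c and radius r (radians)."""
    def axis(phi, psi):
        return np.array([np.sin(phi) * np.cos(psi),
                         np.sin(phi) * np.sin(psi),
                         np.cos(phi)])
    def loss(params):
        ang = np.arccos(np.clip(X @ axis(*params), -1.0, 1.0))
        return np.sum((ang - ang.mean()) ** 2)   # best radius is the mean angle
    # A few assumed starting points to avoid poor local minima.
    best = min((minimize(loss, x0) for x0 in [(0.5, 0.5), (1.5, 2.0), (2.5, 4.0)]),
               key=lambda res: res.fun)
    c = axis(*best.x)
    ang = np.arccos(np.clip(X @ c, -1.0, 1.0))
    return c, ang.mean()

# Illustrative example: noisy points near a small circle of radius ~1 radian.
rng = np.random.default_rng(2)
t = rng.uniform(0, 2 * np.pi, 200)
ring = np.c_[np.sin(1.0) * np.cos(t), np.sin(1.0) * np.sin(t), np.full_like(t, np.cos(1.0))]
ring += 0.02 * rng.standard_normal(ring.shape)
ring /= np.linalg.norm(ring, axis=1, keepdims=True)
c, r = fit_small_circle(ring)    # c close to (0, 0, 1), r close to 1.0
```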

    Intrinsic Inference on the Mean Geodesic of Planar Shapes and Tree Discrimination by Leaf Growth

    For planar landmark based shapes, taking into account the non-Euclidean geometry of the shape space, a statistical test for a common mean first geodesic principal component (GPC) is devised. It rests on one of two asymptotic scenarios, both of which are identical in a Euclidean geometry. For both scenarios, strong consistency and central limit theorems are established, along with an algorithm for the computation of a Ziezold mean geodesic. In application, this allows us to verify the geodesic hypothesis for leaf growth of Canadian black poplars and to discriminate genetically different trees by observations of leaf shape growth over brief time intervals. With a test based on Procrustes tangent space coordinates, not involving the shape space's curvature, neither can be achieved. Comment: 28 pages, 4 figures
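
    For context on the computational ingredient mentioned above, here is a sketch of a Procrustes-style iteration for a Ziezold-type mean of planar landmark shapes, represented as centered, unit-norm complex landmark vectors. The conventions (pre-shape normalization, rotation-only alignment, fixed iteration count) are assumptions; the paper's algorithm computes a mean geodesic and the associated test statistics, which this sketch does not attempt.

```python
# Sketch of a Ziezold-type mean for planar landmark shapes under assumed
# conventions; not the paper's mean-geodesic algorithm.
import numpy as np

def preshape(z):
    """Center and scale a complex landmark vector onto the pre-shape sphere."""
    z = z - z.mean()
    return z / np.linalg.norm(z)

def ziezold_mean(shapes, n_iter=20):
    """shapes: list of complex (k,) landmark vectors; returns a mean pre-shape."""
    Z = np.array([preshape(z) for z in shapes])
    mu = Z[0]
    for _ in range(n_iter):
        # Optimal rotation of each shape onto mu: multiply by conj(<mu, z>)/|<mu, z>|.
        c = np.array([np.vdot(mu, z) for z in Z])
        aligned = Z * (np.conj(c) / np.abs(c))[:, None]
        mu = preshape(aligned.mean(axis=0))
    return mu

# Hypothetical example: 30 noisy triangles, each given by 3 complex landmarks.
rng = np.random.default_rng(3)
base = np.array([0 + 0j, 1 + 0j, 0.5 + 0.9j])
shapes = [base + 0.05 * (rng.standard_normal(3) + 1j * rng.standard_normal(3))
          for _ in range(30)]
print(ziezold_mean(shapes))
```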