Shape Dimension and Intrinsic Metric from Samples of Manifolds
We introduce the adaptive neighborhood graph as a data structure for modeling a smooth manifold M embedded in some Euclidean space R^d. We assume that M is known to us only through a finite sample P \subset M, as is often the case in applications. The adaptive neighborhood graph is a geometric graph on P. Its complexity is at most \min\{2^{O(k)} n, n^2\}, where n = |P| and k = \dim M, as opposed to the n^{\lceil d/2 \rceil} complexity of the Delaunay triangulation, which is often used to model manifolds. We prove that we can correctly infer the connected components and the dimension of M from the adaptive neighborhood graph provided a certain standard sampling condition is fulfilled. The running time of the dimension detection algorithm is d 2^{O(k^7 \log k)} for each connected component of M. If the dimension is considered constant, this is a constant-time operation, and the adaptive neighborhood graph is of linear size. Moreover, the exponential dependence of the constants is only on the intrinsic dimension k, not on the ambient dimension d. This is of particular interest if the co-dimension is high, i.e., if k is much smaller than d, as is the case in many applications. The adaptive neighborhood graph also allows us to approximate the geodesic distances between the points in
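The core idea, that the intrinsic dimension k is detectable from a finite sample without ever touching the ambient dimension's combinatorics, can be illustrated with a much cruder stand-in than the paper's construction: a local-PCA dimension estimate over nearest-neighbor patches. This is a sketch of the general principle, not the adaptive-neighborhood-graph algorithm itself; the neighbor count and variance threshold are arbitrary illustration parameters.

```python
import numpy as np

def estimate_intrinsic_dim(points, n_neighbors=10, var_threshold=0.95):
    """Crude local-PCA intrinsic-dimension estimate (illustrative stand-in,
    not the paper's adaptive neighborhood graph algorithm)."""
    n = len(points)
    dims = []
    for i in range(n):
        dist = np.linalg.norm(points - points[i], axis=1)
        nbrs = points[np.argsort(dist)[1:n_neighbors + 1]]
        centered = nbrs - nbrs.mean(axis=0)
        # Squared singular values give the local variance profile.
        s = np.linalg.svd(centered, compute_uv=False) ** 2
        ratios = np.cumsum(s) / s.sum()
        # Smallest number of directions explaining var_threshold of variance.
        dims.append(int(np.searchsorted(ratios, var_threshold)) + 1)
    return int(np.median(dims))

# Sample from a circle (k = 1) embedded in R^3 (d = 3).
t = np.linspace(0, 2 * np.pi, 200, endpoint=False)
P = np.stack([np.cos(t), np.sin(t), np.zeros_like(t)], axis=1)
print(estimate_intrinsic_dim(P))  # expect 1
```

Note how the estimate depends only on local patches, echoing the abstract's point that the cost of dimension detection should scale with k, not with d.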
Empirical geodesic graphs and CAT(k) metrics for data analysis
A methodology is developed for data analysis based on empirically constructed
geodesic metric spaces. For a probability distribution, the length along a path
between two points can be defined as the amount of probability mass accumulated
along the path. The geodesic, then, is the shortest such path and defines a
geodesic metric. Such metrics are transformed in a number of ways to produce
parametrised families of geodesic metric spaces, empirical versions of which
allow computation of intrinsic means and associated measures of dispersion.
These reveal properties of the data, based on geometry, such as those that are
difficult to see from the raw Euclidean distances. Examples of application
include clustering and classification. For certain parameter ranges, the spaces
become CAT(0) spaces and the intrinsic means are unique. In one case, a minimal
spanning tree of a graph based on the data becomes CAT(0). In another, a
so-called "metric cone" construction allows extension to CAT(k) spaces. It is
shown how to empirically tune the parameters of the metrics, making it possible
to apply them to a number of real cases. Comment: Statistics and Computing, 201
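The empirical side of this construction can be sketched with a graph-based geodesic: connect each sample point to its nearest neighbors and take shortest paths through the data, so that distance accumulates along the sampled support rather than cutting across empty space. The sketch below uses plain Euclidean edge weights instead of the paper's probability-mass path lengths, so it is only a simplified illustration of the empirical-geodesic-graph idea.

```python
import heapq
import numpy as np

def empirical_geodesic_distances(points, n_neighbors=5):
    """Shortest-path distances through a k-nearest-neighbor graph on the
    sample (illustrative sketch; the paper's metric weights paths by
    accumulated probability mass, not raw Euclidean length)."""
    n = len(points)
    pairwise = np.linalg.norm(points[:, None] - points[None, :], axis=2)
    graph = {i: [] for i in range(n)}
    for i in range(n):
        for j in np.argsort(pairwise[i])[1:n_neighbors + 1]:
            j = int(j)
            graph[i].append((j, pairwise[i, j]))
            graph[j].append((i, pairwise[i, j]))  # keep the graph symmetric
    # Dijkstra from every source point.
    D = np.full((n, n), np.inf)
    for src in range(n):
        D[src, src] = 0.0
        pq = [(0.0, src)]
        while pq:
            d, u = heapq.heappop(pq)
            if d > D[src, u]:
                continue
            for v, w in graph[u]:
                if d + w < D[src, v]:
                    D[src, v] = d + w
                    heapq.heappush(pq, (d + w, v))
    return D
```

On points sampled from a circle, for example, the graph distance between near-antipodal points approaches the arc length (about pi) rather than the Euclidean chord length (2), which is exactly the kind of geometric structure the abstract says raw Euclidean distances hide.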
Principal Component Analysis for Functional Data on Riemannian Manifolds and Spheres
Functional data analysis on nonlinear manifolds has drawn recent interest.
Sphere-valued functional data, which are encountered for example as movement
trajectories on the surface of the earth, are an important special case. We
consider an intrinsic principal component analysis for smooth Riemannian
manifold-valued functional data and study its asymptotic properties. Riemannian
functional principal component analysis (RFPCA) is carried out by first mapping
the manifold-valued data through Riemannian logarithm maps to tangent spaces
around the time-varying Fr\'echet mean function, and then performing a
classical multivariate functional principal component analysis on the linear
tangent spaces. Representations of the Riemannian manifold-valued functions and
the eigenfunctions on the original manifold are then obtained with exponential
maps. The tangent-space approximation through functional principal component
analysis is shown to be well-behaved in terms of controlling the residual
variation if the Riemannian manifold has nonnegative curvature. Specifically,
we derive a central limit theorem for the mean function, as well as root-n
uniform convergence rates for other model components, including the covariance
function, eigenfunctions, and functional principal component scores. Our
applications include a novel framework for the analysis of longitudinal
compositional data, achieved by mapping longitudinal compositional data to
trajectories on the sphere, illustrated with longitudinal fruit fly behavior
patterns. RFPCA is shown to be superior in terms of trajectory recovery in
comparison to an unrestricted functional principal component analysis in
applications and simulations and is also found to produce principal component
scores that are better predictors for classification compared to traditional
functional principal component scores.
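The log/exp pipeline that RFPCA builds on can be sketched for the sphere, the special case the abstract highlights: log-map the data to a tangent space, run classical PCA there, and exp-map results back to the manifold. This is a minimal sketch of the sphere maps only, assuming unit vectors; the paper's method handles general Riemannian manifolds, functional data, and a time-varying Fréchet mean.

```python
import numpy as np

def sphere_log(p, q):
    """Riemannian log map on the unit sphere: the tangent vector at p
    pointing toward q, with length equal to the geodesic distance."""
    cos_t = np.clip(np.dot(p, q), -1.0, 1.0)
    v = q - cos_t * p                      # component of q orthogonal to p
    nv = np.linalg.norm(v)
    return np.zeros_like(p) if nv < 1e-12 else np.arccos(cos_t) * v / nv

def sphere_exp(p, v):
    """Riemannian exp map: follow the tangent vector v from p along a
    geodesic on the sphere."""
    nv = np.linalg.norm(v)
    return p if nv < 1e-12 else np.cos(nv) * p + np.sin(nv) * v / nv

def tangent_pca(base, X):
    """RFPCA in miniature: log-map the data to the tangent space at `base`,
    then take classical principal directions there via SVD."""
    V = np.array([sphere_log(base, x) for x in X])
    _, _, Vt = np.linalg.svd(V - V.mean(axis=0))
    return Vt  # rows are principal directions in the tangent space
```

Principal directions returned by `tangent_pca` can then be pushed back onto the sphere with `sphere_exp`, mirroring the abstract's use of exponential maps to represent eigenfunctions on the original manifold.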
Principal arc analysis on direct product manifolds
We propose a new approach to analyze data that naturally lie on manifolds. We
focus on a special class of manifolds, called direct product manifolds, whose
intrinsic dimension could be very high. Our method finds a low-dimensional
representation of the manifold that can be used to find and visualize the
principal modes of variation of the data, as Principal Component Analysis (PCA)
does in linear spaces. The proposed method improves upon earlier manifold
extensions of PCA by more concisely capturing important nonlinear modes. For
the special case of data on a sphere, variation following nongeodesic arcs is
captured in a single mode, compared to the two modes needed by previous
methods. Several computational and statistical challenges are resolved. The
development on spheres forms the basis of principal arc analysis on more
complicated manifolds. The benefits of the method are illustrated by a data
example using medial representations in image analysis. Comment: Published at http://dx.doi.org/10.1214/10-AOAS370 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
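The nongeodesic arcs on a sphere that the abstract refers to are small circles. A simple least-squares sketch of fitting one, not the paper's principal arc procedure, exploits the fact that points on a small circle lie in a plane: the plane normal is the smallest-variance direction of the centered data, and the angular radius is the mean angle to that axis.

```python
import numpy as np

def fit_small_circle(X):
    """Least-squares small-circle fit on the unit sphere (illustrative
    sketch, not the paper's principal arc analysis). Returns the circle's
    axis (unit vector) and its angular radius."""
    centered = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(centered)
    axis = Vt[-1]                          # normal of the best-fitting plane
    if np.dot(axis, X.mean(axis=0)) < 0:   # orient axis toward the data
        axis = -axis
    r = np.mean(np.arccos(np.clip(X @ axis, -1.0, 1.0)))  # angular radius
    return axis, r
```

A geodesic fit would force r = pi/2 (a great circle), needing a second mode to absorb the offset; allowing r < pi/2 captures such variation in a single mode, which is the point the abstract makes about spheres.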
Intrinsic Inference on the Mean Geodesic of Planar Shapes and Tree Discrimination by Leaf Growth
For planar landmark based shapes, taking into account the non-Euclidean
geometry of the shape space, a statistical test for a common mean first
geodesic principal component (GPC) is devised. It rests on one of two
asymptotic scenarios, both of which are identical in a Euclidean geometry. For
both scenarios, strong consistency and central limit theorems are established,
along with an algorithm for the computation of a Ziezold mean geodesic. In
application, this allows one to verify the geodesic hypothesis for leaf growth of
Canadian black poplars and to discriminate genetically different trees by
observations of leaf shape growth over brief time intervals. With a test based
on Procrustes tangent space coordinates, not involving the shape space's
curvature, neither can be achieved. Comment: 28 pages, 4 figures
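The Procrustes tangent-space coordinates that this abstract contrasts with its intrinsic test can be sketched for planar landmark shapes represented as complex vectors: translation, scale, and rotation are quotiented out before any Euclidean analysis. This is a standard full-Procrustes alignment sketch, not the authors' test or their Ziezold mean algorithm.

```python
import numpy as np

def procrustes_align(shapes):
    """Align planar landmark shapes (complex k-vectors) by removing
    translation, scale, and rotation -- the tangent-coordinate setup that
    ignores the shape space's curvature (illustrative sketch)."""
    Z = np.asarray(shapes, dtype=complex)
    Z = Z - Z.mean(axis=1, keepdims=True)             # remove translation
    Z = Z / np.linalg.norm(Z, axis=1, keepdims=True)  # remove scale
    ref = Z[0]
    aligned = []
    for z in Z:
        w = np.vdot(z, ref)             # optimal rotation: e^{i arg(w)}
        aligned.append(z * w / abs(w))  # rotate z onto the reference
    return np.array(aligned)
```

After alignment, shapes differing only by similarity transforms coincide, so classical Euclidean statistics apply; the abstract's point is that tests built on these flat coordinates miss effects that the intrinsic, curvature-aware geodesic analysis detects.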