
    Non-Asymptotic Analysis of Tangent Space Perturbation

    Constructing an efficient parameterization of a large, noisy data set of points lying close to a smooth manifold in high dimension remains a fundamental problem. One approach consists in recovering a local parameterization using the local tangent plane. Principal component analysis (PCA) is often the tool of choice, as it returns an optimal basis in the case of noise-free samples from a linear subspace. To process noisy data samples from a nonlinear manifold, PCA must be applied locally, at a scale small enough that the manifold is approximately linear, but large enough that structure may be discerned from noise. Using eigenspace perturbation theory and non-asymptotic random matrix theory, we study the stability of the subspace estimated by PCA as a function of scale, and bound (with high probability) the angle it forms with the true tangent space. By adaptively selecting the scale that minimizes this bound, our analysis reveals an appropriate scale for local tangent plane recovery. We also introduce a geometric uncertainty principle quantifying the limits of noise-curvature perturbation for stable recovery. With the purpose of providing perturbation bounds that can be used in practice, we propose plug-in estimates that make it possible to directly apply the theoretical results to real data sets. Comment: 53 pages. Revised manuscript with new content addressing application of results to real data sets
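The local-PCA step described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: `local_tangent_pca` is a hypothetical helper that gathers the neighbors of a point within a chosen scale (radius) and takes the top singular vectors of the centered neighborhood as the estimated tangent basis; the paper's contribution is the bound used to choose that scale, which is not reproduced here.

```python
import numpy as np

def local_tangent_pca(points, center, radius, d):
    """Estimate a d-dimensional tangent plane at `center` via local PCA.

    points : (n, D) array of noisy samples near a manifold.
    radius : locality scale at which PCA is applied.
    Returns an orthonormal (D, d) basis of the estimated tangent plane.
    """
    # Restrict to the neighborhood at the given scale.
    nbrs = points[np.linalg.norm(points - center, axis=1) <= radius]
    centered = nbrs - nbrs.mean(axis=0)
    # Local PCA: the top-d right singular vectors span the tangent estimate.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:d].T
```

In practice one would evaluate a perturbation bound (or a plug-in estimate of it) over a range of radii and keep the basis computed at the minimizing scale.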

    Manifold Learning in Medical Imaging

    Manifold learning theory has seen a surge of interest for the modeling of large and extensive datasets in medical imaging, since it captures the essential structure of the data in a way that fundamentally outperforms linear methodologies, whose purpose is essentially to describe flat geometry. This problem is particularly relevant for medical imaging data, where linear techniques are frequently unsuitable for capturing variations in anatomical structures. In many cases there is enough structure in the data (CT, MRI, ultrasound) that a lower-dimensional object, such as a manifold, can describe the degrees of freedom. Still, complex multivariate distributions tend to exhibit highly variable structural topologies that are impossible to capture with a single manifold learning algorithm. This chapter presents recent techniques developed in manifold theory for medical imaging analysis, enabling statistical organ shape modeling, image segmentation and registration based on the concept of navigating manifolds, classification, and disease prediction models based on discriminant manifolds. We present the theoretical basis of these works, with illustrative results of their application to various organs and pathologies, including neurodegenerative diseases and spinal deformities.
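As a generic illustration of the kind of nonlinear dimensionality reduction discussed above (not a method from the chapter itself), one can recover a flat 2-D parameterization of a curved 2-D surface embedded in 3-D using scikit-learn's Isomap; the swiss-roll dataset stands in here for anatomical shape variation.

```python
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import Isomap

# A 2-D manifold (swiss roll) embedded in 3-D, with noise.
X, t = make_swiss_roll(n_samples=1000, noise=0.05, random_state=0)

# Isomap "unrolls" the surface into a flat 2-D coordinate system,
# something PCA cannot do because the embedding is nonlinear.
embedding = Isomap(n_neighbors=10, n_components=2).fit_transform(X)
```

The same pattern, with images or shape descriptors in place of 3-D points, underlies many manifold-based medical imaging pipelines.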

    Novel methods for Intrinsic dimension estimation and manifold learning

    One of the most challenging problems in modern science is how to deal with the huge amount of data that today's technologies provide. Several difficulties may arise. For instance, the number of samples may be too big, and the stream of incoming data may be faster than the algorithm needed to process it. Another common problem is that as the data dimension grows, so does the volume of the space, leading to a sparsification of the available data. This causes problems in statistical analysis, since the amount of data needed to support a conclusion often grows exponentially with the dimension. This problem is commonly referred to as the curse of dimensionality, and it is one of the reasons why high-dimensional data cannot be analyzed efficiently with traditional methods. Classical methods for dimensionality reduction, like principal component analysis and factor analysis, may fail due to a nonlinear structure in the data. In recent years several methods for nonlinear dimensionality reduction have been proposed. A general way to model a high-dimensional data set is to represent the observations as noisy samples drawn from a probability distribution mu in the real coordinate space of D dimensions. It has been observed that the essential support of mu can often be well approximated by low-dimensional sets. These sets can be assumed to be low-dimensional manifolds embedded in the ambient dimension D. A manifold is a topological space which globally may not be Euclidean, but which in a small neighborhood of each point behaves like a Euclidean space. In this setting we call the dimension of the manifold the intrinsic dimension, which is usually much lower than the ambient dimension D. Roughly speaking, the intrinsic dimension of a data set can be described as the minimum number of variables needed to represent the data without significant loss of information. In this work we propose different methods aimed at estimating the intrinsic dimension.
The first method we present models the neighbors of each point as stochastic processes, in such a way that a closed-form likelihood function can be written. This leads to a closed-form maximum likelihood estimator (MLE) for the intrinsic dimension, which has all the good features that an MLE can have. The second method is based on a multiscale singular value decomposition (MSVD) of the data. This method performs singular value decomposition (SVD) on neighborhoods of increasing size and finds an estimate of the intrinsic dimension by studying the behavior of the singular values as the radius of the neighborhood increases. We also introduce an algorithm to estimate the model parameters when the data are assumed to be sampled around an unknown number of planes with different intrinsic dimensions, embedded in a high-dimensional space. This kind of model has many applications in computer vision and pattern recognition, where the data can be described by multiple linear structures or need to be clustered into groups that can be represented by low-dimensional hyperplanes. The algorithm relies on both MSVD and spectral clustering, and it is able to estimate the number of planes and their dimensions, as well as their arrangement in the ambient space. Finally, we propose a novel method for manifold reconstruction based on a multiscale approach, which approximates the manifold from coarse to fine scales with increasing precision. The basic idea is to produce, at a generic scale j, a piecewise linear approximation of the manifold using a collection of low-dimensional planes, and to use those planes to create clusters for the data. At scale j + 1, each cluster is independently approximated by another collection of low-dimensional planes. The process is iterated until the desired precision is achieved. This algorithm is fast because it is highly parallelizable and its computational time is independent of the sample size. Moreover, this method automatically constructs a tree structure for the data.
This feature can be particularly useful in applications which require an a priori tree data structure. The aim of the collection of methods proposed in this work is to provide algorithms to learn and estimate the underlying structure of high-dimensional data sets.
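The multiscale SVD idea described above can be sketched as follows. This is a simplified illustration under assumed details, not the thesis's algorithm: `msvd_singular_values` and `estimate_dim` are hypothetical helpers, and the dimension here is read off crudely from the largest multiplicative gap in the singular-value spectrum at the largest scale, rather than from the full multiscale analysis of how singular values grow with the radius.

```python
import numpy as np

def msvd_singular_values(points, center, radii):
    """Singular values of neighborhoods of `center` at increasing radii.

    In the multiscale picture, singular values along tangent directions
    grow with the radius, while those along noise directions plateau.
    """
    dists = np.linalg.norm(points - center, axis=1)
    spectra = []
    for r in radii:
        nbrs = points[dists <= r]
        centered = nbrs - nbrs.mean(axis=0)
        spectra.append(np.linalg.svd(centered, compute_uv=False))
    return spectra

def estimate_dim(points, center, radii):
    """Crude intrinsic-dimension estimate: at the largest radius, count
    the singular values before the largest multiplicative gap."""
    s = msvd_singular_values(points, center, radii)[-1]
    ratios = s[:-1] / np.maximum(s[1:], 1e-12)
    return int(np.argmax(ratios)) + 1
```

A faithful implementation would instead track each singular value across all radii and classify it as tangent, curvature, or noise from its growth rate.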

    Robust Surface Reconstruction from Point Clouds

    The problem of generating a surface triangulation from a set of points with normal information arises in several mesh processing tasks such as surface reconstruction or surface resampling. In this paper we present a surface triangulation approach based on local 2D Delaunay triangulations in tangent space. Our contribution is the extension of this method to surfaces with sharp corners and creases. We demonstrate the robustness of the method on difficult meshing problems, including nearby sheets, self-intersecting non-manifold surfaces, and noisy point samples.
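The core step, a 2D Delaunay triangulation in a point's tangent plane, can be sketched as below. This is a minimal, assumed version of the basic technique only (no handling of sharp corners or creases, which is the paper's actual contribution); `tangent_delaunay` is a hypothetical helper that projects a point's nearest neighbors onto the plane orthogonal to its normal and triangulates the resulting 2D coordinates.

```python
import numpy as np
from scipy.spatial import Delaunay

def tangent_delaunay(points, normals, i, k=12):
    """Local 2D Delaunay triangulation in the tangent plane of point i.

    points, normals : (n, 3) arrays of positions and unit normals.
    Returns the indices of the k nearest neighbors of point i and the
    triangles (index triples into that neighbor set).
    """
    p = points[i]
    n = normals[i] / np.linalg.norm(normals[i])
    idx = np.argsort(np.linalg.norm(points - p, axis=1))[:k]
    # Build an orthonormal basis (u, v) of the tangent plane at p.
    a = np.array([1.0, 0.0, 0.0]) if abs(n[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
    u = np.cross(n, a)
    u /= np.linalg.norm(u)
    v = np.cross(n, u)
    # Project neighbors into tangent-plane coordinates and triangulate.
    local = points[idx] - p
    coords = np.c_[local @ u, local @ v]
    return idx, Delaunay(coords).simplices
```

Stitching these per-point local triangulations into one consistent mesh, and keeping the projection valid near creases, is where the real difficulty lies.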