Non-Asymptotic Analysis of Tangent Space Perturbation
Constructing an efficient parameterization of a large, noisy data set of
points lying close to a smooth manifold in high dimension remains a fundamental
problem. One approach consists in recovering a local parameterization using the
local tangent plane. Principal component analysis (PCA) is often the tool of
choice, as it returns an optimal basis in the case of noise-free samples from a
linear subspace. To process noisy data samples from a nonlinear manifold, PCA
must be applied locally, at a scale small enough such that the manifold is
approximately linear, but at a scale large enough such that structure may be
discerned from noise. Using eigenspace perturbation theory and non-asymptotic
random matrix theory, we study the stability of the subspace estimated by PCA
as a function of scale, and bound (with high probability) the angle it forms
with the true tangent space. By adaptively selecting the scale that minimizes
this bound, our analysis reveals an appropriate scale for local tangent plane
recovery. We also introduce a geometric uncertainty principle quantifying the
limits of noise-curvature perturbation for stable recovery. With the purpose of
providing perturbation bounds that can be used in practice, we propose plug-in
estimates that make it possible to directly apply the theoretical results to
real data sets.
Comment: 53 pages. Revised manuscript with new content addressing the application of the results to real data sets.
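The scale-selection idea in this abstract can be sketched with a toy experiment: estimate the tangent line of a noisy circle by local PCA at several radii and measure the angle to the true tangent. This is an illustrative assumption-laden sketch, not the authors' code; the data set, noise level, and helper name `local_pca_tangent` are all made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Noisy samples near a 1-D manifold (an arc of the unit circle) in R^3.
t = rng.uniform(-0.3, 0.3, 500)
points = np.column_stack([np.cos(t), np.sin(t), np.zeros_like(t)])
points += 0.005 * rng.standard_normal(points.shape)  # ambient noise

def local_pca_tangent(points, center, radius, dim):
    """Estimate the tangent space at `center` by PCA on points within `radius`."""
    nbrs = points[np.linalg.norm(points - center, axis=1) < radius]
    centered = nbrs - nbrs.mean(axis=0)
    # The top `dim` right singular vectors span the estimated tangent plane.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:dim]

true_tangent = np.array([0.0, 1.0, 0.0])  # tangent of the circle at (1, 0, 0)
for radius in (0.05, 0.15, 0.3):
    est = local_pca_tangent(points, np.array([1.0, 0.0, 0.0]), radius, dim=1)
    # Principal angle between the estimated and true tangent directions.
    cos_angle = abs(float(est[0] @ true_tangent))
    angle = np.degrees(np.arccos(np.clip(cos_angle, 0.0, 1.0)))
    print(f"radius={radius:.2f}  tangent angle={angle:.2f} deg")
```

Too small a radius leaves the estimate at the mercy of noise, too large a radius lets curvature tilt it; the paper's bound formalizes this trade-off and picks the radius minimizing it.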
Manifold Learning in Medical Imaging
Manifold learning theory has seen a surge of interest in the modeling of large and extensive datasets in medical imaging, since manifold methods capture the essence of the data in a way that fundamentally outperforms linear methodologies, which are designed to describe structures that are essentially flat. This problem is particularly relevant for medical imaging data, where linear techniques are frequently unsuitable for capturing variations in anatomical structures. In many cases, there is enough structure in the data (CT, MRI, ultrasound) that a lower dimensional object, such as a manifold, can describe the degrees of freedom. Still, complex, multivariate distributions tend to exhibit highly variable structural topologies that are impossible to capture with a single manifold learning algorithm. This chapter presents recent techniques developed in manifold theory for medical imaging analysis, allowing for statistical organ shape modeling, image segmentation and registration based on the concept of navigating manifolds, classification, as well as disease prediction models based on discriminant manifolds. We present the theoretical basis of these works, with illustrative results on their applications to various organs and pathologies, including neurodegenerative diseases and spinal deformities.
Novel methods for Intrinsic dimension estimation and manifold learning
One of the most challenging problems in modern science is how to deal with
the huge amount of data that today's technologies provide. Several difficulties may
arise. For instance, the number of samples may be too big, and the stream of
incoming data may be faster than the algorithm needed to process them. Another
common problem is that as the data dimension grows, so does the volume of the
space, leading to a sparsification of the available data. This may cause problems
in the statistical analysis, since the amount of data needed to support our conclusions
often grows exponentially with the dimension. This problem is commonly referred to
as the Curse of Dimensionality, and it is one of the reasons why high dimensional
data cannot be analyzed efficiently with traditional methods. Classical methods
for dimensionality reduction, like principal component analysis and factor analysis,
may fail due to a nonlinear structure of the data. In recent years several methods
for nonlinear dimensionality reduction have been proposed. A general way to model
high dimensional data set is to represent the observations as noisy samples drawn
from a probability distribution mu in the real coordinate space of D dimensions. It has been observed that the essential
support of mu can often be well approximated by low dimensional sets. These sets
can be assumed to be low dimensional manifolds embedded in the ambient dimension
D. A manifold is a topological space which globally may not be Euclidean, but which
in a small neighborhood of each point behaves like a Euclidean space. In this setting we
call the dimension of the manifold the intrinsic dimension; it is usually much lower
than the ambient dimension D. Roughly speaking, the intrinsic dimension of a data set can be described as the
minimum number of variables needed to represent the data without significant loss
of information. In this work we propose different methods aimed at estimating the
intrinsic dimension. The first method we present models the neighbors of each point
as stochastic processes, in such a way that a closed form likelihood function can
be written. This leads to a closed form maximum likelihood estimator (MLE) for
the intrinsic dimension, which has all the good features that an MLE can have. The
second method is based on a multiscale singular value decomposition (MSVD) of the
data. This method performs singular value decomposition (SVD) on neighborhoods of
increasing size and finds an estimate for the intrinsic dimension by studying the behavior of the singular values as the radius of the neighborhood increases. We also introduce
an algorithm to estimate the model parameters when the data are assumed to be
sampled around an unknown number of planes with different intrinsic dimensions,
embedded in a high dimensional space. Such models have many applications
in computer vision and pattern recognition, where the data can be described by multiple linear structures or need to be clustered into groups that can be represented
by low dimensional hyperplanes. The algorithm relies on both MSVD and spectral
clustering, and it is able to estimate the number of planes, their dimension as well
as their arrangement in the ambient space. Finally, we propose a novel method for
manifold reconstruction based on a multiscale approach, which approximates the
manifold from coarse to fine scales with increasing precision. The basic idea is to
produce, at a generic scale j, a piecewise linear approximation of the manifold using
a collection of low dimensional planes and use those planes to create clusters for
the data. At scale j + 1, each cluster is independently approximated by another
collection of low dimensional planes. The process is iterated until the desired precision
is achieved. This algorithm is fast because it is highly parallelizable and its
computational time is independent of the sample size. Moreover, this method automatically
constructs a tree structure for the data. This feature can be particularly
useful in applications which require an a priori tree data structure. The aim of the
collection of methods proposed in this work is to provide algorithms to learn and
estimate the underlying structure of high dimensional datasets.
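As a rough illustration of the multiscale SVD idea (not the thesis's actual algorithm), one can count, at each neighborhood radius, how many singular values stand out above the noise floor, and aggregate the counts across scales. The data set, the `gap_ratio` threshold, and all parameter choices below are assumptions made for this sketch.

```python
import numpy as np

rng = np.random.default_rng(1)

# Data on a 2-D plane embedded in R^10, with small ambient noise.
n, D, d = 2000, 10, 2
basis = np.linalg.qr(rng.standard_normal((D, d)))[0]   # orthonormal 2-D basis
data = rng.standard_normal((n, d)) @ basis.T + 0.01 * rng.standard_normal((n, D))

def msvd_dimension(data, center, radii, gap_ratio=5.0):
    """Estimate intrinsic dimension from singular-value gaps across scales.

    At scales where the manifold looks linear but still dominates the noise,
    the top d singular values grow with the radius while the remaining ones
    stay at the noise floor.
    """
    estimates = []
    for r in radii:
        nbrs = data[np.linalg.norm(data - center, axis=1) < r]
        if len(nbrs) < data.shape[1] + 1:
            continue  # too few neighbors for a full-rank SVD at this scale
        s = np.linalg.svd(nbrs - nbrs.mean(axis=0), compute_uv=False)
        # Count singular values well above the smallest (noise-level) one.
        estimates.append(int(np.sum(s > gap_ratio * s[-1])))
    if not estimates:
        raise ValueError("no scale had enough neighbors")
    # Take the most frequent estimate across scales.
    values, counts = np.unique(estimates, return_counts=True)
    return int(values[np.argmax(counts)])

print(msvd_dimension(data, data[0], radii=np.linspace(0.5, 3.0, 8)))
```

Voting across scales is one simple way to stabilize the estimate; the thesis instead studies the growth of the singular values as a function of the radius, which is more robust when curvature and noise interact.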
Robust Surface Reconstruction from Point Clouds
The problem of generating a surface triangulation from a set of points with normal information arises in several mesh processing tasks such as surface reconstruction or surface resampling. In this paper we present a surface triangulation approach based on local 2D Delaunay triangulations in tangent space. Our contribution is the extension of this method to surfaces with sharp corners and creases. We demonstrate the robustness of the method on difficult meshing problems that include nearby sheets, self-intersecting non-manifold surfaces, and noisy point samples.
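A minimal sketch of the local tangent-plane triangulation step for the smooth case (the paper's handling of sharp corners and creases is not reproduced here): project a point's neighbors onto its PCA-estimated tangent plane and run a 2-D Delaunay triangulation there. The surface, the radius, and the helper name are illustrative assumptions.

```python
import numpy as np
from scipy.spatial import Delaunay

rng = np.random.default_rng(2)

# Point cloud sampled from a gently curved surface z = 0.2 * (x^2 + y^2).
xy = rng.uniform(-1, 1, (300, 2))
cloud = np.column_stack([xy, 0.2 * (xy ** 2).sum(axis=1)])

def tangent_space_triangulation(cloud, center_idx, radius):
    """Triangulate the neighborhood of one point in its local tangent plane."""
    center = cloud[center_idx]
    nbr_idx = np.flatnonzero(np.linalg.norm(cloud - center, axis=1) < radius)
    nbrs = cloud[nbr_idx]
    # Tangent plane via PCA: the top two right singular vectors.
    _, _, vt = np.linalg.svd(nbrs - nbrs.mean(axis=0), full_matrices=False)
    coords_2d = (nbrs - center) @ vt[:2].T    # project into the tangent plane
    tri = Delaunay(coords_2d)                 # 2-D Delaunay triangulation
    return nbr_idx[tri.simplices]             # triangles as indices into cloud

triangles = tangent_space_triangulation(cloud, center_idx=0, radius=0.5)
print(triangles.shape)  # (num_triangles, 3)
```

Stitching these local triangulations into one consistent mesh, and keeping them consistent across sharp features, is where the actual method does its work.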