Minimax Estimation of Distances on a Surface and Minimax Manifold Learning in the Isometric-to-Convex Setting
We start by considering the problem of estimating intrinsic distances on a
smooth surface. We show that sharper estimates can be obtained via a
reconstruction of the surface, and discuss the use of the tangential Delaunay
complex for that purpose. We further show that the resulting approximation rate
is in fact optimal in an information-theoretic (minimax) sense. We then turn to
manifold learning and argue that a variant of Isomap where the distances are
instead computed on a reconstructed surface is minimax optimal for the problem
of isometric manifold embedding.
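As a point of comparison for the abstract above, the following is a minimal sketch of the standard Isomap-style distance estimate: shortest paths in a neighborhood graph built on the sample. It is not the tangential Delaunay complex reconstruction the abstract discusses. The function name `graph_geodesic_distances`, the `radius` parameter, and the unit-circle test data are assumptions chosen for illustration.

```python
import numpy as np

def graph_geodesic_distances(points, radius):
    """Approximate intrinsic distances by shortest paths in a neighborhood graph."""
    diff = points[:, None, :] - points[None, :, :]
    euclid = np.sqrt((diff ** 2).sum(axis=-1))
    # Keep only edges no longer than `radius`; all other pairs start at infinity.
    dist = np.where(euclid <= radius, euclid, np.inf)
    # Floyd-Warshall: relax every pair through each intermediate vertex k.
    for k in range(len(points)):
        dist = np.minimum(dist, dist[:, [k]] + dist[[k], :])
    return dist

# Evenly spaced points on the unit circle, where the true intrinsic
# distance between two points is the arc length between them.
angles = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
circle = np.column_stack([np.cos(angles), np.sin(angles)])
dist = graph_geodesic_distances(circle, radius=0.1)

# Graph distance between antipodal points, to compare with the arc length pi.
antipodal = dist[0, 100]
```

On this example the graph distance slightly underestimates the arc length, because each edge is a chord of the circle. Reducing this kind of approximation error is what motivates computing distances on a reconstructed surface instead.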
On boundary detection
Given a sample of a random variable supported by a smooth compact manifold
M, we propose a test to decide whether the boundary of M is empty or not,
with no preliminary support estimation. The test statistic is based on the
maximal distance between a sample point and the average of its k-nearest
neighbors. We prove that the level of the test can be estimated, that, with
probability one, its power is one for a large enough sample size, and that
there exists a consistent decision rule. Heuristics for choosing a convenient
value for the parameter k and identifying observations close to the
boundary are also given. We provide a simulation study of the test.
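The statistic described above can be sketched as follows. This is only an illustration of "the maximal distance between a sample point and the average of its nearest neighbors"; any normalization, the calibration of the level, and the decision threshold used in the paper are not reproduced. The function name `max_knn_mean_shift` and the two test curves are assumptions, and evenly spaced points stand in for a random sample so that the example is deterministic.

```python
import numpy as np

def max_knn_mean_shift(sample, k):
    """Largest distance between a sample point and the mean of its k nearest neighbors."""
    diff = sample[:, None, :] - sample[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbor
    nearest = np.argsort(dist, axis=1)[:, :k]
    neighbor_means = sample[nearest].mean(axis=1)
    shifts = np.linalg.norm(sample - neighbor_means, axis=1)
    return shifts.max(), shifts

n, k = 500, 20

# Closed curve (empty boundary): the full unit circle.
theta = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
circle = np.column_stack([np.cos(theta), np.sin(theta)])

# Curve with boundary: a half circle with two end points.
phi = np.linspace(0.0, np.pi, n)
arc = np.column_stack([np.cos(phi), np.sin(phi)])

stat_circle, _ = max_knn_mean_shift(circle, k)
stat_arc, shifts_arc = max_knn_mean_shift(arc, k)
most_shifted = int(np.argmax(shifts_arc))
```

Near an end point of the arc all neighbors lie on one side, so the neighbor average is pulled away from the point and the statistic is much larger than on the closed circle. The per-point values in `shifts_arc` also hint at which observations sit close to the boundary.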
Topological Data Analysis
It has long been observed that data often carry interesting topological and geometric structures. Characterizing such structures and providing efficient tools to infer and exploit them is a challenging problem that calls for new mathematics and is motivated by a real need from applications. This paper is an introduction to Topological Data Analysis (TDA), a new field that emerged during the last two decades with the objective of understanding and exploiting the topological structure of modern and complex data. The paper surveys some important mathematical and algorithmic developments in TDA, as well as software solutions that are currently used to address various applied and industrial problems.
The bottleneck degree of algebraic varieties
A bottleneck of a smooth algebraic variety is a pair of distinct points of the
variety such that the Euclidean normal spaces at both points contain the line
spanned by the two points. The narrowness of bottlenecks
is a fundamental complexity measure in the algebraic geometry of data. In this
paper we study the number of bottlenecks of affine and projective varieties,
which we call the bottleneck degree. The bottleneck degree is a measure of the
complexity of computing all bottlenecks of an algebraic variety, using for
example numerical homotopy methods. We show that the bottleneck degree is a
function of classical invariants such as Chern classes and polar classes. We
give the formula explicitly in low dimension and provide an algorithm to
compute it in the general case.
Comment: Major revision. New introduction. Added some new illustrative lemmas
and figures. Added pseudocode for the algorithm to compute bottleneck degree.
Fixed some typos.
Estimation minimax adaptative en inférence géométrique (Minimax adaptive estimation in geometric inference)
We focus on the problem of manifold estimation: given a set of observations sampled close to some unknown submanifold M, one wants to recover information about the geometry of M. Minimax estimators which have been proposed so far all depend crucially on the a priori knowledge of parameters quantifying the underlying distribution generating the sample (such as bounds on its density), whereas those quantities will be unknown in practice. Our contribution to the matter is twofold. First, we introduce a one-parameter family of manifold estimators (M_t)_{t ≥ 0} based on a localized version of convex hulls, and show that for some choice of t, the corresponding estimator is minimax on the class of models of C^2 manifolds introduced in [Genovese et al., Manifold estimation and singular deconvolution under Hausdorff loss]. Second, we propose a completely data-driven selection procedure for the parameter t, leading to a minimax adaptive manifold estimator on this class of models. This selection procedure actually allows us to recover the Hausdorff distance between the set of observations and M, and can therefore be used as a scale parameter in other settings, such as tangent space estimation.
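The Hausdorff distance that serves as the loss in the abstract above can be computed directly when both sets are finite. The sketch below does this by brute force; it is not the estimator or the selection procedure of the paper. The name `hausdorff_distance` is an assumption, and a finely discretized unit circle stands in for the submanifold M, which in the paper is a continuous set.

```python
import numpy as np

def hausdorff_distance(a, b):
    """Hausdorff distance between two finite point sets given as (n, d) arrays."""
    diff = a[:, None, :] - b[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Largest distance from a point of one set to the nearest point of the other.
    return max(dist.min(axis=1).max(), dist.min(axis=0).max())

# Hand-checkable case: (3, 0) is at distance 2 from its nearest point (1, 0).
toy = hausdorff_distance(np.array([[0.0, 0.0], [1.0, 0.0]]),
                         np.array([[0.0, 0.0], [3.0, 0.0]]))

# A discretized unit circle stands in for M; the observations keep every 10th point.
angles = np.linspace(0.0, 2.0 * np.pi, 360, endpoint=False)
manifold = np.column_stack([np.cos(angles), np.sin(angles)])
observations = manifold[::10]
gap = hausdorff_distance(observations, manifold)
```

Here `gap` is the chord spanning 5 degrees of the circle: the observations lie exactly on the curve, so the distance is driven entirely by how sparsely they cover it.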
Estimating the Reach of a Manifold via its Convexity Defect Function
The reach of a submanifold is a crucial regularity parameter for manifold learning and geometric inference from point clouds. This paper relates the reach of a submanifold to its convexity defect function. Using the stability properties of convexity defect functions, along with some new bounds and the recent submanifold estimator of Aamari and Levrard [Ann. Statist. 47 177-204 (2019)], an estimator for the reach is given. A uniform expected loss bound over a C^k model is found. Lower bounds for the minimax rate for estimating the reach over these models are also provided. The estimator almost achieves these rates in the C^3 and C^4 cases, with a gap given by a logarithmic factor.
Estimating the Convex Hull of the Image of a Set with Smooth Boundary: Error Bounds and Applications
We study the problem of estimating the convex hull of the image of a compact
set with smooth boundary through a smooth function. Assuming that the function
is a submersion, we derive a new bound on the Hausdorff distance between the
convex hull of the image and the convex hull of the images of sampled inputs on
the boundary of the set. When applied to the problem of
geometric inference from a random sample, our results give tighter and more
general error bounds than the state of the art. We present applications to the
problems of robust optimization, of reachability analysis of dynamical systems,
and of robust trajectory optimization under bounded uncertainty.
Comment: The error bound in Theorem 1.1 is tighter in this revision.
Estimation via length-constrained generalized empirical principal curves under small noise
In this paper, we propose a method to build a sequence of generalized empirical principal curves, with selected length, so that, in Hausdorff distance, the images of the estimating principal curves converge in probability to the image of g.
A Simple and Efficient Sampling-based Algorithm for General Reachability Analysis
In this work, we analyze an efficient sampling-based algorithm for
general-purpose reachability analysis, which remains a notoriously challenging
problem with applications ranging from neural network verification to safety
analysis of dynamical systems. By sampling inputs, evaluating their images in
the true reachable set, and taking their ε-padded convex hull as a set
estimator, this algorithm applies to general problem settings and is simple to
implement. Our main contribution is the derivation of asymptotic and
finite-sample accuracy guarantees using random set theory. This analysis
informs algorithmic design to obtain an ε-close reachable set
approximation with high probability, provides insights into which reachability
problems are most challenging, and motivates safety-critical applications of
the technique. On a neural network verification task, we show that this
approach is more accurate and significantly faster than prior work. Informed by
our analysis, we also design a robust model predictive controller that we
demonstrate in hardware experiments.
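The three steps named in the abstract above (sample inputs, evaluate their images, take a padded convex hull) can be sketched in two dimensions as follows. This is an illustration under stated assumptions, not the authors' implementation: `toy_map` is a made-up system, the helpers `convex_hull_2d` and `in_padded_hull` are hypothetical names, and the hull is assumed to have at least three non-collinear vertices.

```python
import numpy as np

def toy_map(x):
    """Hypothetical smooth map standing in for the system under analysis."""
    return np.column_stack([x[:, 0] + 0.5 * x[:, 1],
                            0.8 * x[:, 1] + 0.3 * np.sin(x[:, 0])])

def convex_hull_2d(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(map(tuple, points)))

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return np.array(lower[:-1] + upper[:-1])

def in_padded_hull(query, hull, eps):
    """True if `query` lies within distance eps of the convex polygon `hull`."""
    a, b = hull, np.roll(hull, -1, axis=0)
    edge, rel = b - a, query - a
    # Inside a counter-clockwise polygon: the query is on the left of every edge.
    if np.all(edge[:, 0] * rel[:, 1] - edge[:, 1] * rel[:, 0] >= 0):
        return True
    # Otherwise compare eps with the distance to the nearest edge segment.
    t = np.clip((rel * edge).sum(axis=1) / (edge ** 2).sum(axis=1), 0.0, 1.0)
    nearest = a + t[:, None] * edge
    return bool(np.linalg.norm(query - nearest, axis=1).min() <= eps)

rng = np.random.default_rng(0)
inputs = rng.uniform(-1.0, 1.0, size=(500, 2))   # 1. sample inputs from the input set
outputs = toy_map(inputs)                        # 2. evaluate their images
hull = convex_hull_2d(outputs)                   # 3. convex hull of the images
eps = 0.1                                        #    padding applied at query time

center_inside = in_padded_hull(np.array([0.0, 0.0]), hull, eps)
far_inside = in_padded_hull(np.array([5.0, 5.0]), hull, eps)
```

The padding is applied lazily in the membership test rather than by constructing the enlarged set, which keeps the sketch short. The hull of the sampled images always sits inside the convex hull of the true reachable set, and the padding is what lets the estimate cover points the samples missed.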