
    Minimax Estimation of Distances on a Surface and Minimax Manifold Learning in the Isometric-to-Convex Setting

    We start by considering the problem of estimating intrinsic distances on a smooth surface. We show that sharper estimates can be obtained via a reconstruction of the surface, and discuss the use of the tangential Delaunay complex for that purpose. We further show that the resulting approximation rate is in fact optimal in an information-theoretic (minimax) sense. We then turn to manifold learning and argue that a variant of Isomap, where the distances are instead computed on a reconstructed surface, is minimax optimal for the problem of isometric manifold embedding.
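
    As a point of reference for the variant described in this abstract, the sketch below runs the standard Isomap pipeline: build a k-nearest-neighbor graph, approximate intrinsic distances by graph shortest paths, and embed with metric MDS. The paper's estimator instead computes the distances on a reconstructed surface (via the tangential Delaunay complex); the graph-based stand-in and parameters such as n_neighbors are illustrative assumptions, not the paper's method.

```python
# Hedged sketch: classical Isomap (graph geodesics + metric MDS), used here only
# as a baseline for the reconstructed-surface variant discussed in the abstract.
import numpy as np
from scipy.sparse.csgraph import shortest_path
from sklearn.manifold import MDS
from sklearn.neighbors import kneighbors_graph

def isomap_like_embedding(points: np.ndarray, n_neighbors: int = 10, n_components: int = 2) -> np.ndarray:
    # k-NN graph with Euclidean edge lengths (assumed connected, so all graph distances are finite)
    graph = kneighbors_graph(points, n_neighbors, mode="distance")
    # shortest-path distances in the graph approximate the intrinsic (geodesic) distances
    geodesic = shortest_path(graph, method="D", directed=False)
    # metric MDS on the estimated distance matrix produces the low-dimensional embedding
    return MDS(n_components=n_components, dissimilarity="precomputed").fit_transform(geodesic)
```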

    On boundary detection

    Given a sample of a random variable supported by a smooth compact manifold $M \subset \mathbb{R}^d$, we propose a test to decide whether the boundary of $M$ is empty or not, with no preliminary support estimation. The test statistic is based on the maximal distance between a sample point and the average of its $k_n$-nearest neighbors. We prove that the level of the test can be estimated, that, with probability one, its power is one for $n$ large enough, and that there exists a consistent decision rule. Heuristics for choosing a convenient value of the parameter $k_n$ and for identifying observations close to the boundary are also given. We provide a simulation study of the test.
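
    A minimal sketch of the test statistic described above, assuming the sample is given as an array of points: for each observation, compute the distance to the barycenter of its $k_n$ nearest neighbors and take the maximum over the sample. The function name and the use of a k-d tree are illustrative choices; the calibration of the test level and the paper's heuristics for choosing $k_n$ are not reproduced here.

```python
import numpy as np
from scipy.spatial import cKDTree

def boundary_test_statistic(points: np.ndarray, k_n: int) -> float:
    """Max over sample points of the distance to the mean of their k_n nearest neighbors."""
    tree = cKDTree(points)
    # query k_n + 1 neighbors because each point is its own nearest neighbor
    _, idx = tree.query(points, k=k_n + 1)
    neighbor_means = points[idx[:, 1:]].mean(axis=1)
    return float(np.linalg.norm(points - neighbor_means, axis=1).max())
```

    Intuitively, an interior point sits near the barycenter of its neighbors, while a point close to the boundary has all of its neighbors on one side, so a large value of the statistic is evidence that the boundary of $M$ is non-empty.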

    Topological Data Analysis

    It has long been observed that data often carry interesting topological and geometric structures. Characterizing such structures and providing efficient tools to infer and exploit them is a challenging problem that calls for new mathematics and is motivated by a real need from applications. This paper is an introduction to Topological Data Analysis (TDA), a new field that emerged during the last two decades with the objective of understanding and exploiting the topological structure of modern and complex data. The paper surveys some important mathematical and algorithmic developments in TDA, as well as software solutions that are currently used to address various applied and industrial problems.

    The bottleneck degree of algebraic varieties

    A bottleneck of a smooth algebraic variety $X \subset \mathbb{C}^n$ is a pair of distinct points $(x, y) \in X$ such that the Euclidean normal spaces at $x$ and $y$ contain the line spanned by $x$ and $y$. The narrowness of bottlenecks is a fundamental complexity measure in the algebraic geometry of data. In this paper we study the number of bottlenecks of affine and projective varieties, which we call the bottleneck degree. The bottleneck degree is a measure of the complexity of computing all bottlenecks of an algebraic variety, using for example numerical homotopy methods. We show that the bottleneck degree is a function of classical invariants such as Chern classes and polar classes. We give the formula explicitly in low dimension and provide an algorithm to compute it in the general case.
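
    For intuition, the bottleneck condition can be checked numerically on a plane curve: a pair $(x, y)$ on the curve is a bottleneck when the chord joining $x$ and $y$ is orthogonal to the tangent line at both points, i.e. it lies in both normal spaces. The sketch below does this for an ellipse with a generic root finder; this numerical stand-in, the chosen curve, and the seed points are assumptions for illustration, not the homotopy-based algorithm the paper analyzes.

```python
import numpy as np
from scipy.optimize import fsolve

def c(s):                          # parametrization of the ellipse x^2/4 + y^2 = 1
    return np.array([2.0 * np.cos(s), np.sin(s)])

def dc(s):                         # tangent vector of the parametrization
    return np.array([-2.0 * np.sin(s), np.cos(s)])

def bottleneck_conditions(params):
    s, u = params
    chord = c(s) - c(u)
    # the chord must be orthogonal to the tangents at both endpoints,
    # i.e. it must lie in the Euclidean normal spaces at both points
    return [chord @ dc(s), chord @ dc(u)]

# two seeds chosen to converge to the two bottlenecks of the ellipse (its axes)
for seed in [(0.1, np.pi - 0.1), (np.pi / 2 + 0.1, -np.pi / 2 - 0.1)]:
    s, u = fsolve(bottleneck_conditions, seed)
    if np.linalg.norm(c(s) - c(u)) > 1e-6:       # discard degenerate solutions with x == y
        print(c(s), c(u))
```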

    Adaptive Minimax Estimation in Geometric Inference

    We focus on the problem of manifold estimation: given a set of observations sampled close to some unknown submanifold $M$, one wants to recover information about the geometry of $M$. Minimax estimators proposed so far all depend crucially on a priori knowledge of parameters quantifying the underlying distribution generating the sample (such as bounds on its density), whereas those quantities will be unknown in practice. Our contribution to the matter is twofold. First, we introduce a one-parameter family of manifold estimators $(M_t)_{t \geq 0}$ based on a localized version of convex hulls, and show that for some choice of $t$, the corresponding estimator is minimax on the class of models of $C^2$ manifolds introduced in [Genovese et al., Manifold estimation and singular deconvolution under Hausdorff loss]. Second, we propose a completely data-driven selection procedure for the parameter $t$, leading to a minimax adaptive manifold estimator on this class of models. This selection procedure actually allows us to recover the Hausdorff distance between the set of observations and $M$, and can therefore be used as a scale parameter in other settings, such as tangent space estimation.
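
    One plausible reading of the "localized version of convex hulls" above is the union, over observations $X_i$, of the convex hull of the sample points falling in the ball $B(X_i, t)$. The sketch below implements a membership test for such a set; the exact construction in the paper and its data-driven choice of $t$ may differ, so treat the names and the reading itself as assumptions.

```python
import numpy as np
from scipy.optimize import linprog

def in_convex_hull(q: np.ndarray, pts: np.ndarray) -> bool:
    """True if q is a convex combination of the rows of pts (an LP feasibility problem)."""
    n = len(pts)
    A_eq = np.vstack([pts.T, np.ones(n)])        # sum_i w_i * pts_i = q  and  sum_i w_i = 1
    b_eq = np.append(q, 1.0)
    res = linprog(np.zeros(n), A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * n)
    return res.success

def in_local_hull_estimator(q: np.ndarray, sample: np.ndarray, t: float) -> bool:
    """Membership in the union of local convex hulls Conv(sample ∩ B(x, t)), x in sample."""
    for x in sample:
        local = sample[np.linalg.norm(sample - x, axis=1) <= t]
        if in_convex_hull(q, local):
            return True
    return False
```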

    Estimating the Reach of a Manifold via its Convexity Defect Function

    The reach of a submanifold is a crucial regularity parameter for manifold learning and geometric inference from point clouds. This paper relates the reach of a submanifold to its convexity defect function. Using the stability properties of convexity defect functions, along with some new bounds and the recent submanifold estimator of Aamari and Levrard [Ann. Statist. 47, 177–204 (2019)], an estimator for the reach is given. A uniform expected loss bound over a $C^k$ model is found. Lower bounds for the minimax rate for estimating the reach over these models are also provided. The estimator almost achieves these rates in the $C^3$ and $C^4$ cases, with a gap given by a logarithmic factor.
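
    The convexity defect function is not spelled out in this excerpt; as a rough illustration, the sketch below computes a pair-based proxy of it on a finite point cloud: for each scale t, take all pairs of points at distance at most t and record how far the midpoint of the connecting segment lies from the cloud. On a smooth submanifold this quantity grows roughly quadratically in t at small scales, with a constant governed by the reach, which is the kind of relation the estimator exploits. The pair-based simplification, the function names, and the grid of scales are all illustrative assumptions.

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.spatial.distance import pdist, squareform

def convexity_defect_proxy(points: np.ndarray, scales: np.ndarray) -> np.ndarray:
    """For each t: max over pairs with ||x - y|| <= t of the distance from midpoint(x, y) to the cloud."""
    tree = cKDTree(points)
    i, j = np.triu_indices(len(points), k=1)
    pair_lengths = squareform(pdist(points))[i, j]
    mid_to_cloud, _ = tree.query(0.5 * (points[i] + points[j]))   # distance from each midpoint to the cloud
    return np.array([mid_to_cloud[pair_lengths <= t].max() if np.any(pair_lengths <= t) else 0.0
                     for t in scales])

# On a dense sample of the unit circle (reach 1), the proxy scales like t^2 for small t.
theta = np.linspace(0, 2 * np.pi, 400, endpoint=False)
circle = np.stack([np.cos(theta), np.sin(theta)], axis=1)
print(convexity_defect_proxy(circle, np.array([0.2, 0.4, 0.8])))
```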

    Estimating the Convex Hull of the Image of a Set with Smooth Boundary: Error Bounds and Applications

    We study the problem of estimating the convex hull of the image $f(X) \subset \mathbb{R}^n$ of a compact set $X \subset \mathbb{R}^m$ with smooth boundary through a smooth function $f: \mathbb{R}^m \to \mathbb{R}^n$. Assuming that $f$ is a submersion, we derive a new bound on the Hausdorff distance between the convex hull of $f(X)$ and the convex hull of the images $f(x_i)$ of $M$ sampled inputs $x_i$ on the boundary of $X$. When applied to the problem of geometric inference from a random sample, our results give tighter and more general error bounds than the state of the art. We present applications to the problems of robust optimization, of reachability analysis of dynamical systems, and of robust trajectory optimization under bounded uncertainty.
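
    A minimal sketch of the estimator analyzed above: sample $M$ inputs on the boundary of $X$, map them through $f$, and use the convex hull of the images as an estimate of the convex hull of $f(X)$. The particular set $X$ (the unit disk, sampled through its boundary circle) and the map $f$ below are placeholders, not examples from the paper.

```python
import numpy as np
from scipy.spatial import ConvexHull

def f(x: np.ndarray) -> np.ndarray:
    """A smooth map R^2 -> R^2, used as a placeholder."""
    return np.stack([x[:, 0] + 0.3 * x[:, 1] ** 2, np.sin(x[:, 1])], axis=1)

M = 200
theta = np.random.default_rng(0).uniform(0.0, 2.0 * np.pi, M)
boundary_samples = np.stack([np.cos(theta), np.sin(theta)], axis=1)  # points on the boundary of the unit disk
hull = ConvexHull(f(boundary_samples))   # convex hull of the sampled images: the estimate of conv(f(X))
print(hull.volume)                        # in 2-D, ConvexHull.volume is the enclosed area
```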

    Estimation via length-constrained generalized empirical principal curves under small noise

    In this paper, we propose a method to build a sequence of generalized empirical principal curves, with selected length, so that, in Hausdorff distance, the images of the estimated principal curves converge in probability to the image of $g$.

    A Simple and Efficient Sampling-based Algorithm for General Reachability Analysis

    In this work, we analyze an efficient sampling-based algorithm for general-purpose reachability analysis, which remains a notoriously challenging problem with applications ranging from neural network verification to safety analysis of dynamical systems. By sampling inputs, evaluating their images in the true reachable set, and taking their $\epsilon$-padded convex hull as a set estimator, this algorithm applies to general problem settings and is simple to implement. Our main contribution is the derivation of asymptotic and finite-sample accuracy guarantees using random set theory. This analysis informs algorithmic design to obtain an $\epsilon$-close reachable set approximation with high probability, provides insights into which reachability problems are most challenging, and motivates safety-critical applications of the technique. On a neural network verification task, we show that this approach is more accurate and significantly faster than prior work. Informed by our analysis, we also design a robust model predictive controller that we demonstrate in hardware experiments.
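
    A hedged sketch of the estimator described above: sample inputs, push them through the (black-box) reachability map, and use the $\epsilon$-padded convex hull of the outputs as the set estimate. The toy dynamics, the number of samples, and the value of $\epsilon$ below are placeholders; the paper's contribution is precisely the analysis of how many samples and how much padding yield an $\epsilon$-close approximation with high probability.

```python
import numpy as np
from scipy.optimize import minimize

def reach_map(x0: np.ndarray) -> np.ndarray:
    """Placeholder black-box map from an input to its image in the true reachable set."""
    A = np.array([[0.9, 0.2], [-0.1, 1.0]])
    return A @ x0 + 0.05 * np.tanh(x0)

def in_padded_hull(q: np.ndarray, pts: np.ndarray, eps: float) -> bool:
    """True if q lies within eps of conv(pts): minimize ||pts.T @ w - q|| over the probability simplex."""
    n = len(pts)
    res = minimize(lambda w: np.sum((pts.T @ w - q) ** 2),
                   np.full(n, 1.0 / n),
                   bounds=[(0.0, 1.0)] * n,
                   constraints=[{"type": "eq", "fun": lambda w: np.sum(w) - 1.0}])
    return float(np.sqrt(res.fun)) <= eps

rng = np.random.default_rng(0)
inputs = rng.uniform(-1.0, 1.0, size=(150, 2))            # sampled inputs
images = np.array([reach_map(x) for x in inputs])          # their images in the true reachable set
print(in_padded_hull(np.array([0.5, 0.5]), images, eps=0.1))
```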