    Manifold estimation and singular deconvolution under Hausdorff loss

    We find lower and upper bounds for the risk of estimating a manifold in Hausdorff distance under several models. We also show that there are close connections between manifold estimation and the problem of deconvolving a singular measure. Comment: Published at http://dx.doi.org/10.1214/12-AOS994 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).
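    Since the Hausdorff loss is central here, the following is a minimal sketch of computing the two-sided Hausdorff distance between two finite point clouds. The function name and the use of numpy/scipy are illustrative assumptions; the paper works with abstract statistical models, not this code.

    ```python
    import numpy as np
    from scipy.spatial.distance import cdist

    def hausdorff(A, B):
        """Two-sided Hausdorff distance between point clouds A and B.

        A, B: (n, d) and (m, d) arrays of points in R^d.
        """
        D = cdist(A, B)                   # pairwise Euclidean distances
        h_ab = D.min(axis=1).max()        # sup over A of dist to B
        h_ba = D.min(axis=0).max()        # sup over B of dist to A
        return max(h_ab, h_ba)

    # Example: a noisy circle vs. a dense circle, a crude proxy for an
    # estimated vs. true one-dimensional manifold in R^2.
    rng = np.random.default_rng(0)
    t = rng.uniform(0, 2 * np.pi, 200)
    noisy = np.c_[np.cos(t), np.sin(t)] + 0.05 * rng.normal(size=(200, 2))
    s = np.linspace(0, 2 * np.pi, 1000)
    circle = np.c_[np.cos(s), np.sin(s)]
    print(hausdorff(noisy, circle))
    ```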

    Adaptive minimax estimation in geometric inference

    We focus on the problem of manifold estimation: given a set of observations sampled close to some unknown submanifold M, one wants to recover information about the geometry of M. Minimax estimators proposed so far all depend crucially on a priori knowledge of parameters quantifying the underlying distribution generating the sample (such as bounds on its density), whereas those quantities are unknown in practice. Our contribution to the matter is twofold. First, we introduce a one-parameter family of manifold estimators (M_t)_{t≥0} based on a localized version of convex hulls, and show that for some choice of t, the corresponding estimator is minimax on the class of models of C^2 manifolds introduced in [Genovese et al., Manifold estimation and singular deconvolution under Hausdorff loss]. Second, we propose a completely data-driven selection procedure for the parameter t, leading to a minimax adaptive manifold estimator on this class of models. This selection procedure actually allows us to recover the Hausdorff distance between the set of observations and M, and can therefore be used as a scale parameter in other settings, such as tangent space estimation.
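    The localized convex hull idea admits a short sketch: around each observation, take the convex hull of the sample points within radius t, and use the union of these local hulls as the estimator of M. This is a loose illustration under assumptions (hand-picked radius t, scipy's ConvexHull), not the paper's exact construction or its data-driven selection of t.

    ```python
    import numpy as np
    from scipy.spatial import ConvexHull, cKDTree

    def local_convex_hulls(X, t):
        """Convex hull of each point's t-neighborhood; their union
        serves as a localized-convex-hull manifold estimator.

        X: (n, d) array of observations near an unknown submanifold M.
        t: localization radius (here a hand-tuned parameter).
        """
        tree = cKDTree(X)
        hulls = []
        for x in X:
            pts = X[tree.query_ball_point(x, r=t)]
            if len(pts) >= X.shape[1] + 1:   # need d+1 points in R^d
                try:
                    hulls.append(ConvexHull(pts))
                except Exception:            # nearly flat neighborhoods
                    pass
        return hulls
    ```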

    Remember the Curse of Dimensionality: The Case of Goodness-of-Fit Testing in Arbitrary Dimension

    Despite a substantial literature on nonparametric two-sample goodness-of-fit testing in arbitrary dimension spanning decades, that literature makes no mention of any curse of dimensionality. Only recently have Ramdas et al. (2015) discussed this issue in the context of kernel methods, showing that their performance degrades with the dimension even when the underlying distributions are isotropic Gaussians. We take a minimax perspective and follow in the footsteps of Ingster (1987) to derive the minimax rate in arbitrary dimension when the discrepancy is measured in the L2 metric. That rate is revealed to be nonparametric and to exhibit a prototypical curse of dimensionality. We further extend Ingster's work to show that the chi-squared test achieves the minimax rate. Moreover, we show that the test can be made to work when the distributions have support of low intrinsic dimension. Finally, inspired by Ingster (2000), we consider a multiscale version of the chi-squared test which can adapt to unknown smoothness and/or unknown intrinsic dimensionality without much loss in power. Comment: This version comes after the publication of the paper in the Journal of Nonparametric Statistics. The main change is to cite the work of Ramdas et al. Some very minor typos were also corrected.
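    A minimal sketch of a binned chi-squared two-sample statistic in this spirit: partition [0, 1]^d into a regular grid, count both samples per cell, and compare proportions. The bin width and the plain Pearson-style statistic are illustrative assumptions; the paper's test and its calibration are more careful.

    ```python
    import numpy as np

    def chi2_two_sample(X, Y, bins_per_dim=8):
        """Binned chi-squared two-sample statistic on [0, 1]^d.

        X, Y: (n, d) and (m, d) samples assumed to lie in the unit cube.
        Large values suggest the two distributions differ.
        """
        d = X.shape[1]
        edges = [np.linspace(0, 1, bins_per_dim + 1)] * d
        cx, _ = np.histogramdd(X, bins=edges)
        cy, _ = np.histogramdd(Y, bins=edges)
        n, m = len(X), len(Y)
        # Pool both samples for the per-cell variance term.
        pooled = (cx + cy) / (n + m)
        mask = pooled > 0
        return np.sum((cx[mask] / n - cy[mask] / m) ** 2 / pooled[mask])
    ```

    Calibration could be done by permutation. Note that the number of cells grows as bins_per_dim^d, which is the curse of dimensionality in its most concrete form.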

    Optimal rates of convergence for persistence diagrams in Topological Data Analysis

    Computational topology has recently seen important developments toward data analysis, giving birth to the field of topological data analysis. Topological persistence, or persistent homology, appears as a fundamental tool in this field. In this paper, we study topological persistence in general metric spaces with a statistical approach. We show that the use of persistent homology can be naturally considered in general statistical frameworks, and that persistence diagrams can be used as statistics with interesting convergence properties. Some numerical experiments are performed in various contexts to illustrate our results.
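    For concreteness, the persistence diagram of a point cloud can be computed with an off-the-shelf Vietoris-Rips implementation. Using the ripser package here is an assumption (GUDHI would work equally well) and is unrelated to the paper's own experiments.

    ```python
    import numpy as np
    from ripser import ripser  # pip install ripser (assumed dependency)

    # Sample 200 noisy points from a circle; its H1 diagram should show
    # one long-lived loop, stable under perturbation of the sample.
    rng = np.random.default_rng(0)
    t = rng.uniform(0, 2 * np.pi, 200)
    X = np.c_[np.cos(t), np.sin(t)] + 0.05 * rng.normal(size=(200, 2))

    dgms = ripser(X, maxdim=1)['dgms']   # list: [H0 diagram, H1 diagram]
    h1 = dgms[1]                         # (birth, death) pairs for loops
    print("most persistent loop:", h1[np.argmax(h1[:, 1] - h1[:, 0])])
    ```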

    Nonparametric ridge estimation

    We study the problem of estimating the ridges of a density function. Ridge estimation is an extension of mode finding and is useful for understanding the structure of a density. It can also be used to find hidden structure in point cloud data. We show that, under mild regularity conditions, the ridges of the kernel density estimator consistently estimate the ridges of the true density. When the data are noisy measurements of a manifold, we show that the ridges are close and topologically similar to the hidden manifold. To find the estimated ridges in practice, we adapt the modified mean-shift algorithm proposed by Ozertem and Erdogmus [J. Mach. Learn. Res. 12 (2011) 1249-1286]. Some numerical experiments verify that the algorithm is accurate. Comment: Published at http://dx.doi.org/10.1214/14-AOS1218 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).
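    The subspace-constrained mean-shift idea behind the Ozertem-Erdogmus algorithm fits in a short sketch: at each iterate, project the mean-shift step onto the eigenvectors of the estimated density Hessian with the smallest eigenvalues, so points slide along ridges rather than collapsing to modes. The Gaussian kernel, the fixed bandwidth h, and the function name are illustrative assumptions, not the paper's exact variant.

    ```python
    import numpy as np

    def scms_step(x, X, h, ridge_dim=1):
        """One subspace-constrained mean-shift update at point x.

        X: (n, d) data; h: Gaussian KDE bandwidth; ridge_dim: ridge dim.
        The classic mean-shift step is projected onto the span of the
        d - ridge_dim Hessian eigenvectors with smallest eigenvalues,
        so iterates slide along the ridge instead of toward a mode.
        """
        d = len(x)
        diff = X - x                                      # (n, d)
        w = np.exp(-np.sum(diff ** 2, axis=1) / (2 * h ** 2))
        w /= w.sum()
        shift = diff.T @ w                                # mean-shift step
        # KDE Hessian at x, up to a positive factor (Gaussian kernel).
        H = (diff.T * w) @ diff / h ** 4 - np.eye(d) / h ** 2
        _, vecs = np.linalg.eigh(H)                       # ascending order
        V = vecs[:, : d - ridge_dim]                      # normal directions
        return x + V @ (V.T @ shift)

    # Iterating scms_step from each data point until the projected step
    # is small traces out the estimated ridge set.
    ```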