Manifold estimation and singular deconvolution under Hausdorff loss
We find lower and upper bounds for the risk of estimating a manifold in
Hausdorff distance under several models. We also show that there are close
connections between manifold estimation and the problem of deconvolving a
singular measure.

Comment: Published at http://dx.doi.org/10.1214/12-AOS994 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org).
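The Hausdorff loss above measures the largest gap between an estimator and the target set. For finite point clouds it reduces to a max-min computation; a minimal numpy sketch (the function name and the toy point sets are illustrative, not from the paper):

```python
import numpy as np

def hausdorff(A, B):
    """Symmetric Hausdorff distance between two finite point sets.

    A, B: arrays of shape (n, d) and (m, d).
    """
    # Pairwise Euclidean distances, shape (n, m).
    D = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)
    # Largest distance from a point of one set to the other set.
    return max(D.min(axis=1).max(), D.min(axis=0).max())

A = np.array([[0.0, 0.0], [1.0, 0.0]])
B = np.array([[0.0, 0.1], [1.0, 0.0], [2.0, 0.0]])
print(hausdorff(A, B))  # 1.0: the point (2, 0) is at distance 1 from A
```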
Adaptive minimax estimation in geometric inference
We focus on the problem of manifold estimation: given a set of observations sampled close to some unknown submanifold M, one wants to recover information about the geometry of M. Minimax estimators proposed so far all depend crucially on a priori knowledge of parameters quantifying the underlying distribution generating the sample (such as bounds on its density), whereas those quantities will be unknown in practice. Our contribution to the matter is twofold. First, we introduce a one-parameter family of manifold estimators (M_t)_{t≥0} based on a localized version of convex hulls, and show that for some choice of t, the corresponding estimator is minimax on the class of models of C^2 manifolds introduced in [Genovese et al., Manifold estimation and singular deconvolution under Hausdorff loss]. Second, we propose a completely data-driven selection procedure for the parameter t, leading to a minimax adaptive manifold estimator on this class of models. This selection procedure actually allows us to recover the Hausdorff distance between the set of observations and M, and can therefore be used as a scale parameter in other settings, such as tangent space estimation.
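The localized-convex-hull idea can be illustrated with a toy membership test: a query point belongs to the estimator at scale t if it lies in the convex hull of the sample points within distance t of it. This is a deliberate simplification of the paper's construction, not the authors' exact estimator; the hull test is a standard feasibility linear program, and all names are illustrative:

```python
import numpy as np
from scipy.optimize import linprog

def in_hull(points, q):
    """Feasibility LP: is q a convex combination of the rows of `points`?"""
    n = len(points)
    A_eq = np.vstack([points.T, np.ones(n)])   # sum_i w_i p_i = q, sum_i w_i = 1
    b_eq = np.concatenate([q, [1.0]])
    res = linprog(np.zeros(n), A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * n)
    return res.success

def in_local_hull_estimator(sample, q, t):
    """Toy membership test for a localized-convex-hull estimator at scale t."""
    near = sample[np.linalg.norm(sample - q, axis=1) <= t]
    return len(near) > 0 and in_hull(near, q)

# Sample points on the unit circle, a C^2 submanifold of R^2.
theta = np.linspace(0, 2 * np.pi, 60, endpoint=False)
sample = np.c_[np.cos(theta), np.sin(theta)]
print(in_local_hull_estimator(sample, np.array([1.0, 0.0]), 0.3))  # True
print(in_local_hull_estimator(sample, np.array([0.0, 0.0]), 0.3))  # False
```

Taking t too small leaves gaps between local hulls; taking it too large fills in the hole of the circle, which is why the paper's data-driven choice of t matters.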
Remember the Curse of Dimensionality: The Case of Goodness-of-Fit Testing in Arbitrary Dimension
Despite a substantial literature on nonparametric two-sample goodness-of-fit
testing in arbitrary dimensions spanning decades, there is no mention there of
any curse of dimensionality. Only recently have Ramdas et al. (2015)
discussed this issue in the context of kernel methods by showing that their
performance degrades with the dimension even when the underlying distributions
are isotropic Gaussians. We take a minimax perspective and follow in the
footsteps of Ingster (1987) to derive the minimax rate in arbitrary dimension
when the discrepancy is measured in the L2 metric. That rate is revealed to be
nonparametric and to exhibit a prototypical curse of dimensionality. We further
extend Ingster's work to show that the chi-squared test achieves the minimax
rate. Moreover, we show that the test can be made to work when the
distributions have support of low intrinsic dimension. Finally, inspired by
Ingster (2000), we consider a multiscale version of the chi-square test which
can adapt to unknown smoothness and/or unknown intrinsic dimensionality without
much loss in power.

Comment: This version comes after the publication of the paper in the Journal
of Nonparametric Statistics. The main change is to cite the work of Ramdas et
al. Some very minor typos were also corrected.
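The chi-squared test discussed in the abstract is, in its simplest form, the classical binned test. A hedged sketch against a uniform null on the unit cube (the bin count would have to be tuned to the sample size and smoothness to approach the minimax rate; function and parameter names are illustrative):

```python
import numpy as np
from scipy.stats import chi2

def chi_squared_gof(x, num_bins, low=0.0, high=1.0):
    """Binned chi-squared goodness-of-fit test against Uniform[low, high]^d.

    x: (n, d) sample; each axis of the cube is split into num_bins bins.
    Returns the statistic and its p-value under the chi-squared approximation.
    """
    n, d = x.shape
    # Map each point to a multi-index of bins, then flatten to a single index.
    idx = np.clip(((x - low) / (high - low) * num_bins).astype(int),
                  0, num_bins - 1)
    flat = np.ravel_multi_index(tuple(idx.T), (num_bins,) * d)
    observed = np.bincount(flat, minlength=num_bins ** d)
    expected = n / num_bins ** d            # uniform null: equal mass per bin
    stat = ((observed - expected) ** 2 / expected).sum()
    pval = chi2.sf(stat, df=num_bins ** d - 1)
    return stat, pval

rng = np.random.default_rng(0)
stat, pval = chi_squared_gof(rng.uniform(size=(2000, 2)), num_bins=4)
print(stat, pval)
```

With 4 bins per axis in dimension 2 there are 16 cells, and the statistic is compared to a chi-squared distribution with 15 degrees of freedom; under the null the p-value is approximately uniform.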
Optimal rates of convergence for persistence diagrams in Topological Data Analysis
Computational topology has recently seen important developments toward data
analysis, giving birth to the field of topological data analysis.
Topological persistence, or persistent homology, appears as a fundamental tool
in this field. In this paper, we study topological persistence in general
metric spaces, with a statistical approach. We show that the use of persistent
homology can be naturally considered in general statistical frameworks and
persistence diagrams can be used as statistics with interesting convergence
properties. Some numerical experiments are performed in various contexts to
illustrate our results.
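To make "persistence diagram" concrete, here is a self-contained sketch of 0-dimensional sublevel-set persistence for a function sampled on a line, using the standard union-find construction with the elder rule. This is a textbook special case, not the paper's general metric-space setting:

```python
def persistence_0d(values):
    """0-dimensional sublevel-set persistence of a function on a path graph.

    values[i] is f(i); edges connect i and i+1. Returns (birth, death) pairs
    for connected components; the global-minimum component never dies.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    parent, birth, pairs = {}, {}, []

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    for i in order:                          # add vertices by function value
        parent[i], birth[i] = i, values[i]
        for j in (i - 1, i + 1):             # merge with live neighbors
            if j in parent:
                ri, rj = find(i), find(j)
                if ri != rj:
                    # Elder rule: the younger component (larger birth) dies.
                    young, old = (ri, rj) if birth[ri] > birth[rj] else (rj, ri)
                    pairs.append((birth[young], values[i]))
                    parent[young] = old
    for r in {find(i) for i in parent}:      # essential classes
        pairs.append((birth[r], float("inf")))
    return sorted(p for p in pairs if p[1] > p[0])

print(persistence_0d([0.0, 2.0, 1.0, 3.0]))
# [(0.0, inf), (1.0, 2.0)]: the local minimum at value 1 is born at 1.0
# and dies at 2.0, when it merges over the barrier of height 2.0.
```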
Nonparametric ridge estimation
We study the problem of estimating the ridges of a density function. Ridge
estimation is an extension of mode finding and is useful for understanding the
structure of a density. It can also be used to find hidden structure in point
cloud data. We show that, under mild regularity conditions, the ridges of the
kernel density estimator consistently estimate the ridges of the true density.
When the data are noisy measurements of a manifold, we show that the ridges are
close and topologically similar to the hidden manifold. To find the estimated
ridges in practice, we adapt the modified mean-shift algorithm proposed by
Ozertem and Erdogmus [J. Mach. Learn. Res. 12 (2011) 1249-1286]. Some numerical
experiments verify that the algorithm is accurate.

Comment: Published at http://dx.doi.org/10.1214/14-AOS1218 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org).
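The mean-shift idea underlying the algorithm can be sketched in a few lines. This is the plain mode-seeking iteration, not the subspace-constrained variant of Ozertem and Erdogmus used for ridges, which additionally projects each shift onto the directions of smallest Hessian curvature; names and parameters are illustrative:

```python
import numpy as np

def mean_shift(points, x, bandwidth=0.5, steps=50):
    """Plain Gaussian mean-shift: repeatedly move x to the kernel-weighted
    mean of the data. The iteration converges to a mode of the KDE."""
    for _ in range(steps):
        w = np.exp(-np.sum((points - x) ** 2, axis=1) / (2 * bandwidth ** 2))
        x = (w[:, None] * points).sum(axis=0) / w.sum()
    return x

rng = np.random.default_rng(1)
data = rng.normal(loc=[2.0, -1.0], scale=0.3, size=(500, 2))
mode = mean_shift(data, np.array([0.0, 0.0]))
print(mode)  # close to the true mode (2, -1)
```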