Estimating the Reach of a Manifold
Various problems in manifold estimation make use of a quantity called the
reach, denoted by $\tau_M$, which is a measure of the regularity of the
manifold. This paper is the first investigation into the problem of how to
estimate the reach. First, we study the geometry of the reach through an
approximation perspective. We derive new geometric results on the reach for
submanifolds without boundary. An estimator $\hat{\tau}$ of $\tau_M$ is
proposed in a framework where tangent spaces are known, and bounds assessing
its efficiency are derived. In the case of an i.i.d. random point cloud
$\mathbb{X}_n$, $\hat{\tau}(\mathbb{X}_n)$ is shown to achieve uniform
expected loss bounds over a $\mathcal{C}^3$-like model. Finally, we obtain
upper and lower bounds on the minimax rate for estimating the reach.
Non-Asymptotic Uniform Rates of Consistency for k-NN Regression
We derive high-probability finite-sample uniform rates of consistency for
$k$-NN regression that are optimal up to logarithmic factors under mild
assumptions. We moreover show that $k$-NN regression adapts to an unknown lower
intrinsic dimension automatically. We then apply the $k$-NN regression rates to
establish new results about estimating the level sets and global maxima of a
function from noisy observations.
Comment: In Proceedings of the 33rd AAAI Conference on Artificial Intelligence
(AAAI 2019)
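As a rough illustration of the estimator this abstract analyzes, the following is a minimal sketch of plain $k$-NN regression with Euclidean distance and uniform averaging. The function name `knn_regress` and its parameters are hypothetical and chosen for this example; the sketch does not reproduce the paper's analysis or any particular tuning of $k$.

```python
import numpy as np

def knn_regress(X_train, y_train, X_query, k):
    """Predict by averaging the responses of the k nearest training points.

    Hypothetical helper for illustration: Euclidean distance, uniform weights.
    """
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    X_query = np.asarray(X_query, dtype=float)
    # Pairwise squared distances, shape (n_query, n_train).
    diffs = X_query[:, None, :] - X_train[None, :, :]
    d2 = np.sum(diffs ** 2, axis=2)
    # Indices of the k smallest distances in each row (order within them is arbitrary).
    nn_idx = np.argpartition(d2, k - 1, axis=1)[:, :k]
    # Average the responses of the selected neighbors.
    return y_train[nn_idx].mean(axis=1)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.uniform(0.0, 1.0, size=(200, 1))
    y = np.sin(2 * np.pi * X[:, 0]) + 0.1 * rng.normal(size=200)
    print(knn_regress(X, y, np.array([[0.25], [0.75]]), k=10))
```

The brute-force distance matrix keeps the sketch short; for large samples a spatial index would normally replace it.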
Optimal rates of convergence for persistence diagrams in Topological Data Analysis
Computational topology has recently seen important developments toward
data analysis, giving birth to the field of topological data analysis.
Topological persistence, or persistent homology, appears as a fundamental tool
in this field. In this paper, we study topological persistence in general
metric spaces, with a statistical approach. We show that the use of persistent
homology can be naturally considered in general statistical frameworks and
persistence diagrams can be used as statistics with interesting convergence
properties. Some numerical experiments are performed in various contexts to
illustrate our results.
Sparse PCA: Optimal rates and adaptive estimation
Principal component analysis (PCA) is one of the most commonly used
statistical procedures with a wide range of applications. This paper considers
both minimax and adaptive estimation of the principal subspace in the high
dimensional setting. Under mild technical conditions, we first establish the
optimal rates of convergence for estimating the principal subspace which are
sharp with respect to all the parameters, thus providing a complete
characterization of the difficulty of the estimation problem in terms of the
convergence rate. The lower bound is obtained by calculating the local metric
entropy and an application of Fano's lemma. The rate optimal estimator is
constructed using aggregation, which, however, might not be computationally
feasible. We then introduce an adaptive procedure for estimating the principal
subspace which is fully data driven and can be computed efficiently. It is
shown that the estimator attains the optimal rates of convergence
simultaneously over a large collection of the parameter spaces. A key idea in
our construction is a reduction scheme which reduces the sparse PCA problem to
a high-dimensional multivariate regression problem. This method is potentially
also useful for other related problems.
Comment: Published at http://dx.doi.org/10.1214/13-AOS1178 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org)
Minimax lower bounds for function estimation on graphs
We study minimax lower bounds for function estimation problems on large graphs
when the target function is smoothly varying over the graph. We derive minimax
rates in the context of regression and classification problems on graphs that
satisfy an asymptotic shape assumption and with a smoothness condition on the
target function, both formulated in terms of the graph Laplacian.
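To make "smoothly varying over the graph" concrete, here is a minimal sketch of one standard way to measure smoothness with the graph Laplacian: the quadratic form $f^\top L f$, which equals the sum of $w_{ij}(f_i - f_j)^2$ over edges. This is an assumption made for illustration; the abstract does not state the exact smoothness condition the paper uses, and the function names below are hypothetical.

```python
import numpy as np

def graph_laplacian(adjacency):
    """Unnormalized Laplacian L = D - A of an undirected weighted graph."""
    A = np.asarray(adjacency, dtype=float)
    degrees = A.sum(axis=1)
    return np.diag(degrees) - A

def laplacian_smoothness(adjacency, f):
    """Quadratic form f^T L f; small values mean f changes little across edges."""
    L = graph_laplacian(adjacency)
    f = np.asarray(f, dtype=float)
    return float(f @ L @ f)

if __name__ == "__main__":
    # Path graph on three vertices: 0 - 1 - 2.
    A = [[0, 1, 0],
         [1, 0, 1],
         [0, 1, 0]]
    print(laplacian_smoothness(A, [1.0, 1.0, 1.0]))  # constant function
    print(laplacian_smoothness(A, [0.0, 2.0, 0.0]))  # oscillating function
```

A constant function has zero energy under this measure, while a function that oscillates across edges has large energy.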