    Estimating the Reach of a Manifold

    Various problems in manifold estimation make use of a quantity called the reach, denoted by $\tau_M$, which is a measure of the regularity of the manifold. This paper is the first investigation into the problem of how to estimate the reach. First, we study the geometry of the reach through an approximation perspective. We derive new geometric results on the reach for submanifolds without boundary. An estimator $\hat{\tau}$ of $\tau_M$ is proposed in a framework where tangent spaces are known, and bounds assessing its efficiency are derived. In the case of an i.i.d. random point cloud $\mathbb{X}_n$, $\hat{\tau}(\mathbb{X}_n)$ is shown to achieve uniform expected loss bounds over a $\mathcal{C}^3$-like model. Finally, we obtain upper and lower bounds on the minimax rate for estimating the reach.
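
The abstract does not state the estimator itself, so the following is only a sketch under an assumption: it plugs a finite point cloud into Federer's classical characterization of the reach, $\tau_M = \inf_{p \neq q} \|q - p\|^2 / (2\, d(q - p, T_p M))$, with the tangent spaces supplied as known orthonormal bases. The function name `reach_estimate` and its argument layout are hypothetical and chosen for illustration.

```python
import math

def reach_estimate(points, tangents):
    """Plug-in reach estimate from a point cloud with known tangent spaces.

    points   : list of d-dimensional tuples lying on the manifold
    tangents : tangents[i] is a list of orthonormal vectors spanning the
               tangent space at points[i] (assumed known, as in the abstract)
    """
    best = math.inf
    for i, x in enumerate(points):
        for j, y in enumerate(points):
            if i == j:
                continue
            diff = [b - a for a, b in zip(x, y)]
            sq_norm = sum(c * c for c in diff)
            # Subtract the projection onto the tangent space at x; what
            # remains is the component of y - x normal to the manifold.
            resid = list(diff)
            for t in tangents[i]:
                coef = sum(c * u for c, u in zip(diff, t))
                resid = [r - coef * u for r, u in zip(resid, t)]
            normal_dist = math.sqrt(sum(r * r for r in resid))
            if normal_dist > 0:  # a zero normal part gives an infinite ratio
                best = min(best, sq_norm / (2 * normal_dist))
    return best

# Sanity check on a circle of radius 2, whose reach equals its radius.
radius = 2.0
angles = [2 * math.pi * k / 12 for k in range(12)]
circle = [(radius * math.cos(a), radius * math.sin(a)) for a in angles]
circle_tangents = [[(-math.sin(a), math.cos(a))] for a in angles]
estimate = reach_estimate(circle, circle_tangents)
```

On a circle every pair of points yields exactly the radius, which makes it a convenient check; on a general manifold the plug-in value can only overestimate the true infimum, since it is taken over fewer pairs.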

    Non-Asymptotic Uniform Rates of Consistency for k-NN Regression

    We derive high-probability finite-sample uniform rates of consistency for $k$-NN regression that are optimal up to logarithmic factors under mild assumptions. We moreover show that $k$-NN regression adapts to an unknown lower intrinsic dimension automatically. We then apply the $k$-NN regression rates to establish new results about estimating the level sets and global maxima of a function from noisy observations. Comment: In Proceedings of the 33rd AAAI Conference on Artificial Intelligence (AAAI 2019)
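
For readers unfamiliar with the estimator being analyzed, here is a minimal sketch of plain $k$-NN regression: the prediction at a query point is the average response of its $k$ nearest training points. The Euclidean metric, the function name `knn_regress`, and the toy data are assumptions made for illustration and are not taken from the paper.

```python
import math

def knn_regress(train_x, train_y, query, k):
    """Average the responses of the k training points closest to `query`."""
    if not 1 <= k <= len(train_x):
        raise ValueError("k must be between 1 and the number of training points")
    # Rank training indices by Euclidean distance to the query point.
    order = sorted(range(len(train_x)),
                   key=lambda i: math.dist(train_x[i], query))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / k

# Toy data: responses equal the single coordinate of each input.
train_x = [(0.0,), (1.0,), (2.0,), (3.0,)]
train_y = [0.0, 1.0, 2.0, 3.0]
prediction = knn_regress(train_x, train_y, (0.1,), k=2)
```

The sort costs O(n log n) per query, which is fine for a sketch; a spatial index would be the usual choice for larger samples.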

    Optimal rates of convergence for persistence diagrams in Topological Data Analysis

    Computational topology has recently seen significant development toward data analysis, giving rise to the field of topological data analysis. Topological persistence, or persistent homology, is a fundamental tool in this field. In this paper, we study topological persistence in general metric spaces from a statistical perspective. We show that persistent homology fits naturally into general statistical frameworks and that persistence diagrams can be used as statistics with interesting convergence properties. Numerical experiments in various contexts illustrate our results.

    Sparse PCA: Optimal rates and adaptive estimation

    Principal component analysis (PCA) is one of the most commonly used statistical procedures with a wide range of applications. This paper considers both minimax and adaptive estimation of the principal subspace in the high dimensional setting. Under mild technical conditions, we first establish the optimal rates of convergence for estimating the principal subspace which are sharp with respect to all the parameters, thus providing a complete characterization of the difficulty of the estimation problem in terms of the convergence rate. The lower bound is obtained by calculating the local metric entropy and an application of Fano's lemma. The rate optimal estimator is constructed using aggregation, which, however, might not be computationally feasible. We then introduce an adaptive procedure for estimating the principal subspace which is fully data driven and can be computed efficiently. It is shown that the estimator attains the optimal rates of convergence simultaneously over a large collection of the parameter spaces. A key idea in our construction is a reduction scheme which reduces the sparse PCA problem to a high-dimensional multivariate regression problem. This method is potentially also useful for other related problems. Comment: Published at http://dx.doi.org/10.1214/13-AOS1178 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Minimax lower bounds for function estimation on graphs

    We study minimax lower bounds for function estimation problems on large graphs when the target function is smoothly varying over the graph. We derive minimax rates in the context of regression and classification problems on graphs that satisfy an asymptotic shape assumption and with a smoothness condition on the target function, both formulated in terms of the graph Laplacian.