
    Exact oracle inequality for a sharp adaptive kernel density estimator

    For one-dimensional density estimation from i.i.d. observations we suggest an adaptive cross-validation technique for selecting a kernel estimator. The resulting estimator is both asymptotically MISE-efficient with respect to the monotone oracle and sharp minimax-adaptive over the whole scale of Sobolev spaces with smoothness index greater than 1/2. The proof of the central concentration inequality avoids "chaining" and relies on an additive decomposition of the empirical processes involved.
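
    As a concrete illustration of cross-validated kernel selection (not the authors' specific procedure or its oracle analysis), the sketch below applies the classical least-squares cross-validation criterion to choose the bandwidth of a Gaussian kernel density estimator; the bandwidth grid and function names are illustrative assumptions.

```python
import numpy as np

def gauss(u, s):
    """N(0, s^2) density evaluated at u."""
    return np.exp(-0.5 * (u / s) ** 2) / (s * np.sqrt(2 * np.pi))

def lscv_score(x, h):
    """Least-squares cross-validation criterion for a Gaussian-kernel
    density estimator with bandwidth h (smaller is better)."""
    n = len(x)
    d = x[:, None] - x[None, :]                       # pairwise differences
    int_f2 = gauss(d, h * np.sqrt(2)).sum() / n**2    # closed form of the integral of fhat_h^2
    loo = (gauss(d, h).sum() - n * gauss(0.0, h)) / (n * (n - 1))  # leave-one-out term
    return int_f2 - 2 * loo

def select_bandwidth(x, grid):
    """Pick the bandwidth on `grid` that minimizes the CV criterion."""
    return min(grid, key=lambda h: lscv_score(x, h))

rng = np.random.default_rng(0)
x = rng.normal(size=500)
print("selected bandwidth:", select_bandwidth(x, np.linspace(0.05, 1.0, 40)))
```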

    Random action of compact Lie groups and minimax estimation of a mean pattern

    This paper considers the problem of estimating a mean pattern in the setting of Grenander's pattern theory. Shape variability in a data set of curves or images is modeled by the random action of elements of a compact Lie group on an infinite-dimensional space. In the case of observations contaminated by additive Gaussian white noise, it is shown that estimating a reference template in this setting falls into the category of deconvolution problems over Lie groups. To obtain this result, we build an estimator of the mean pattern using Fourier deconvolution and harmonic analysis on compact Lie groups. In an asymptotic setting where the number of observed curves or images tends to infinity, we derive upper and lower bounds on the minimax quadratic risk over Sobolev balls. The resulting rate depends on the smoothness of the density of the random Lie group elements representing shape variability in the data, which makes a connection between estimating a mean pattern and standard deconvolution problems in nonparametric statistics.
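
    For intuition only, here is a toy version of the Fourier deconvolution idea on the simplest compact group, SO(2) acting by circular shifts, with the shift law assumed known (a wrapped Gaussian); the paper's estimator for general compact Lie groups and its risk bounds are considerably more general, and the template, noise levels, and cutoff below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
T, J = 256, 200                       # grid size, number of observed curves
t = np.linspace(0, 2 * np.pi, T, endpoint=False)
f = np.sin(t) + 0.5 * np.cos(3 * t)   # unknown template (used here only to simulate data)

# shape variability: random rotations of the circle (the compact group SO(2))
sigma_shift, sigma_noise = 0.3, 0.5
shifts = rng.normal(0.0, sigma_shift, size=J)
obs = np.stack([np.roll(f, int(round(s / (2 * np.pi) * T))) for s in shifts])
obs = obs + rng.normal(0.0, sigma_noise, size=obs.shape)

# Fourier coefficients of each observed curve; averaging removes the noise, but the
# random shifts multiply c_k(f) by E[exp(-i k tau)], the Fourier transform of the shift law
coef = np.fft.fft(obs, axis=1) / T
k = np.fft.fftfreq(T, d=1.0 / T)                 # integer frequencies
ghat = np.exp(-0.5 * (sigma_shift * k) ** 2)     # wrapped-Gaussian shift law, assumed known

cutoff = 8                                       # spectral cutoff: the smoothing parameter
est = coef.mean(axis=0)
keep = np.abs(k) <= cutoff
est[~keep] = 0.0
est[keep] = est[keep] / ghat[keep]               # deconvolution step
f_hat = np.real(np.fft.ifft(est * T))
print("L2 error of the estimated template:", np.sqrt(np.mean((f_hat - f) ** 2)))
```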

    Exact minimax risk for density estimators in non-integer Sobolev classes

    The $L_2$-minimax risk in Sobolev classes of densities with non-integer smoothness index is shown to have a form analogous to that in integer Sobolev classes. To this end, the notion of Sobolev classes is generalized to fractional derivatives of order $\beta \in \mathbb{R}^+$. A minimax kernel density estimator for such classes is constructed. Although no corresponding proof has appeared in the literature so far, the result of this article has been used implicitly in numerous papers; the need to fill this gap can thus hardly be denied.
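
    For orientation, one standard way to define such a fractional Sobolev class of densities is through the Fourier transform; the normalization below is one common convention and may differ from the article's by constant factors.

```latex
% Fractional Sobolev class of smoothness \beta > 0 and radius L > 0, defined via
% the Fourier transform \mathcal{F}f(\omega) = \int_{\mathbb{R}} f(x)\, e^{-i\omega x}\,dx :
\mathcal{W}(\beta, L) \;=\;
\Big\{\, f \ \text{a probability density} \;:\;
\int_{\mathbb{R}} |\omega|^{2\beta}\,\bigl|\mathcal{F}f(\omega)\bigr|^{2}\,d\omega
\;\le\; 2\pi L^{2} \,\Big\}.
% For integer \beta, Plancherel's identity turns the constraint into
% \int_{\mathbb{R}} \bigl(f^{(\beta)}(x)\bigr)^{2}\,dx \le L^{2},
% recovering the classical Sobolev class of densities.
```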

    Tight conditions for consistency of variable selection in the context of high dimensionality

    We address the issue of variable selection in the regression model with very high ambient dimension, that is, when the number of variables is very large. The main focus is on the situation where the number of relevant variables, called the intrinsic dimension, is much smaller than the ambient dimension d. Without assuming any parametric form of the underlying regression function, we obtain tight conditions making it possible to consistently estimate the set of relevant variables. These conditions relate the intrinsic dimension to the ambient dimension and to the sample size. The procedure that is provably consistent under these tight conditions is based on comparing quadratic functionals of the empirical Fourier coefficients with appropriately chosen threshold values. The asymptotic analysis reveals the presence of two quite different regimes. The first regime is when the intrinsic dimension is fixed: in this case the situation in nonparametric regression is the same as in linear regression, that is, consistent variable selection is possible if and only if log d is small compared to the sample size n. The picture is different in the second regime, that is, when the number of relevant variables, denoted by s, tends to infinity as $n \to \infty$. Then we prove that consistent variable selection in the nonparametric setup is possible only if s + log log d is small compared to log n. We apply these results to derive minimax separation rates for the problem of variable selection. (Comment: text overlap with arXiv:1102.3616; published in the Annals of Statistics, http://dx.doi.org/10.1214/12-AOS1046, by the Institute of Mathematical Statistics, http://www.imstat.org/aos/.)
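
    As a deliberately simplified sketch of the idea of comparing quadratic functionals of empirical Fourier coefficients with thresholds, the code below tests each variable separately through its first few marginal trigonometric coefficients; the data-generating function, frequency budget, and threshold are illustrative assumptions, and the paper's actual procedure and its calibration are more involved (in particular it accounts for interactions between relevant variables).

```python
import numpy as np

rng = np.random.default_rng(2)
n, d, K = 2000, 10, 5                        # sample size, ambient dimension, frequencies per variable
X = rng.uniform(size=(n, d))                 # design assumed uniform on [0, 1]^d
Y = np.sin(2 * np.pi * X[:, 0]) + np.cos(2 * np.pi * X[:, 2]) + 0.3 * rng.normal(size=n)
# relevant variables in this toy model: {0, 2}

def quad_functional(xj, y, K):
    """Sum of squared empirical Fourier coefficients of the regression
    function along one coordinate, over the first K sine/cosine frequencies."""
    stat = 0.0
    for k in range(1, K + 1):
        for phi in (np.sin, np.cos):
            theta = np.mean(y * np.sqrt(2) * phi(2 * np.pi * k * xj))
            stat += theta ** 2
    return stat

# crude threshold of order (number of coefficients) / n; illustrative only
tau = 4 * (2 * K) * np.var(Y) / n
selected = [j for j in range(d) if quad_functional(X[:, j], Y, K) > tau]
print("selected variables:", selected)       # expected: [0, 2]
```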

    Optimal rates of convergence for persistence diagrams in Topological Data Analysis

    Computational topology has recently seen important developments toward data analysis, giving birth to the field of topological data analysis. Topological persistence, or persistent homology, appears as a fundamental tool in this field. In this paper, we study topological persistence in general metric spaces, with a statistical approach. We show that the use of persistent homology can be naturally considered in general statistical frameworks and that persistence diagrams can be used as statistics with interesting convergence properties. Some numerical experiments are performed in various contexts to illustrate our results.
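
    As a small illustration of the kind of statistic studied here, the sketch below computes the persistence diagram of a Vietoris-Rips filtration built on points sampled near a circle; it assumes the GUDHI Python package (`gudhi`) is available and is not the paper's own experimental setup.

```python
import numpy as np
import gudhi  # assumed available: the GUDHI library's Python bindings

rng = np.random.default_rng(3)
# sample points near the unit circle: the underlying space has one prominent
# 1-dimensional hole, which should show up as a long-lived point in the
# degree-1 persistence diagram
n = 100
theta = rng.uniform(0.0, 2 * np.pi, size=n)
points = np.c_[np.cos(theta), np.sin(theta)] + 0.05 * rng.normal(size=(n, 2))

rips = gudhi.RipsComplex(points=points.tolist(), max_edge_length=2.0)
st = rips.create_simplex_tree(max_dimension=2)
diagram = st.persistence()                   # list of (dimension, (birth, death)) pairs

# keep degree-1 features and sort them by lifetime (death - birth)
h1 = [(b, d) for dim, (b, d) in diagram if dim == 1]
h1.sort(key=lambda bd: bd[1] - bd[0], reverse=True)
print("most persistent 1-dimensional features:", h1[:3])
```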