
    Adaptivity to Noise Parameters in Nonparametric Active Learning

    This work addresses various open questions in the theory of active learning for nonparametric classification. Our contributions are both statistical and algorithmic:
    - We establish new minimax rates for active learning under common noise conditions. These rates display interesting transitions, due to the interaction between noise smoothness and margin, that are not present in the passive setting. Some such transitions were previously conjectured, but remained unconfirmed.
    - We present a generic algorithmic strategy for adaptivity to unknown noise smoothness and margin. Our strategy achieves optimal rates in many general situations; furthermore, unlike previous work, we avoid the need for adaptive confidence sets, resulting in strictly milder distributional requirements.

    Optimal Calibration for Multiple Testing against Local Inhomogeneity in Higher Dimension

    Based on two independent samples X_1,...,X_m and X_{m+1},...,X_n drawn from multivariate distributions with unknown Lebesgue densities p and q respectively, we propose an exact multiple test in order to identify simultaneously regions of significant deviations between p and q. The construction is built from randomized nearest-neighbor statistics. It does not require any preliminary information about the multivariate densities such as compact support, strict positivity or smoothness and shape properties. The properly adjusted multiple testing procedure is shown to be sharp-optimal for typical arrangements of the observation values which appear with probability close to one. The proof relies on a new coupling Bernstein type exponential inequality, reflecting the non-subgaussian tail behavior of a combinatorial process. For power investigation of the proposed method a reparametrized minimax set-up is introduced, reducing the composite hypothesis "p=q" to a simple one with the multivariate mixed density (m/n)p+(1-m/n)q as infinite dimensional nuisance parameter. Within this framework, the test is shown to be spatially and sharply asymptotically adaptive with respect to uniform loss on isotropic Hölder classes. The exact minimax risk asymptotics are obtained in terms of solutions of the optimal recovery

    Bandwidth selection in kernel empirical risk minimization via the gradient

    In this paper, we deal with the data-driven selection of multidimensional and possibly anisotropic bandwidths in the general framework of kernel empirical risk minimization. We propose a universal selection rule, which leads to optimal adaptive results in a large variety of statistical models such as nonparametric robust regression and statistical learning with errors in variables. These results are stated in the context of smooth loss functions, where the gradient of the risk appears as a good criterion to measure the performance of our estimators. The selection rule consists of a comparison of gradient empirical risks. It can be viewed as a nontrivial extension of the so-called Goldenshluger-Lepski method to nonlinear estimators. Furthermore, one main advantage of our selection rule is that it does not depend on the Hessian matrix of the risk, which is usually involved in standard adaptive procedures. Comment: Published at http://dx.doi.org/10.1214/15-AOS1318 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Global and Local Two-Sample Tests via Regression

    Two-sample testing is a fundamental problem in statistics. Despite its long history, there has been renewed interest in this problem with the advent of high-dimensional and complex data. Specifically, in the machine learning literature, there have been recent methodological developments such as classification accuracy tests. The goal of this work is to present a regression approach to comparing multivariate distributions of complex data. Depending on the chosen regression model, our framework can efficiently handle different types of variables and various structures in the data, with competitive power under many practical scenarios. Whereas previous work has been largely limited to global tests which conceal much of the local information, our approach naturally leads to a local two-sample testing framework in which we identify local differences between multivariate distributions with statistical confidence. We demonstrate the efficacy of our approach both theoretically and empirically, under some well-known parametric and nonparametric regression methods. Our proposed methods are applied to simulated data as well as a challenging astronomy data set to assess their practical usefulness
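The regression view of two-sample testing described in this abstract can be illustrated with a small sketch. It is not the authors' procedure: the k-nearest-neighbor regressor, the mean-squared-deviation statistic, the permutation calibration, and every function name below are assumptions chosen for illustration. The idea is that, after pooling the samples and labeling them 1 and 0, the regression function P(Y=1 | X=x) is constant when the two distributions coincide, so its deviation from the pooled label proportion measures the difference.

```python
import numpy as np

def knn_neighbors(X, k):
    """Indices of the k nearest neighbors of each row of X (self excluded)."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    order = np.argsort(d2, axis=1)
    return order[:, 1:k + 1]  # column 0 is the point itself (distance 0)

def regression_two_sample_test(X1, X2, k=10, n_perm=200, seed=0):
    """Permutation test based on a k-NN estimate of P(Y=1 | X=x).

    Hypothetical helper: returns (statistic, permutation p-value).
    """
    rng = np.random.default_rng(seed)
    X = np.vstack([X1, X2])
    y = np.concatenate([np.ones(len(X1)), np.zeros(len(X2))])
    pi_hat = y.mean()            # pooled label proportion, fixed under permutation
    nbrs = knn_neighbors(X, k)   # neighbors depend only on X, so compute once

    def statistic(labels):
        m_hat = labels[nbrs].mean(axis=1)      # k-NN regression estimate
        return np.mean((m_hat - pi_hat) ** 2)  # global deviation from a constant

    observed = statistic(y)
    permuted = np.array([statistic(rng.permutation(y)) for _ in range(n_perm)])
    p_value = (1 + np.sum(permuted >= observed)) / (1 + n_perm)
    return observed, p_value
```

The pointwise values `m_hat - pi_hat` are what a local version of the test would inspect; turning them into local decisions with statistical guarantees requires the additional calibration that the paper develops and that this sketch does not attempt.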

    Optimal cross-validation in density estimation with the $L^2$-loss

    We analyze the performance of cross-validation (CV) in the density estimation framework with two purposes: (i) risk estimation and (ii) model selection. The main focus is on the so-called leave-$p$-out CV procedure (Lpo), where $p$ denotes the cardinality of the test set. Closed-form expressions are established for the Lpo estimator of the risk of projection estimators. These expressions provide a great improvement upon $V$-fold cross-validation in terms of variability and computational complexity. From a theoretical point of view, closed-form expressions also make it possible to study the Lpo performance in terms of risk estimation. The optimality of leave-one-out (Loo), that is Lpo with $p=1$, is proved among CV procedures used for risk estimation. Two model selection frameworks are also considered: estimation, as opposed to identification. For estimation with finite sample size $n$, optimality is achieved for $p$ large enough [with $p/n=o(1)$] to balance the overfitting resulting from the structure of the model collection. For identification, model selection consistency is established for Lpo as long as $p/n$ is conveniently related to the rate of convergence of the best estimator in the collection: (i) $p/n\to 1$ as $n\to+\infty$ with a parametric rate, and (ii) $p/n=o(1)$ with some nonparametric estimators. These theoretical results are validated by simulation experiments. Comment: Published at http://dx.doi.org/10.1214/14-AOS1240 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org
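As a concrete instance of a closed-form cross-validation risk estimate, the following sketch covers only the leave-one-out case for a regular histogram, which is a projection estimator onto piecewise-constant functions. The formula is the standard textbook one, not the paper's general leave-$p$-out expression, and the function names are hypothetical. With bin width $h$, sample size $n$, and bin proportions $\hat p_j$, the estimate of the $L^2$ risk (up to the constant $\int f^2$) is $\hat J(h) = \frac{2}{(n-1)h} - \frac{n+1}{(n-1)h}\sum_j \hat p_j^2$.

```python
import numpy as np

def loo_risk_histogram(x, n_bins, lo, hi):
    """Closed-form leave-one-out estimate of the L2 risk of a regular histogram.

    Assumes every observation lies in [lo, hi]. The value omits the
    constant integral of f^2, which does not depend on n_bins.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    h = (hi - lo) / n_bins
    counts, _ = np.histogram(x, bins=n_bins, range=(lo, hi))
    p_hat = counts / n
    return 2.0 / ((n - 1) * h) - (n + 1) / ((n - 1) * h) * np.sum(p_hat ** 2)

def select_n_bins(x, candidates, lo, hi):
    """Pick the candidate number of bins with the smallest estimated risk."""
    risks = [loo_risk_histogram(x, m, lo, hi) for m in candidates]
    return candidates[int(np.argmin(risks))]
```

Because the estimate needs only the bin counts, evaluating it costs one pass over the data per candidate, whereas refitting the histogram with each observation held out would cost one pass per observation.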

    Locally adaptive image denoising by a statistical multiresolution criterion

    We demonstrate how one can choose the smoothing parameter in image denoising by a statistical multiresolution criterion, both globally and locally. Using inhomogeneous diffusion and total variation regularization as examples for localized regularization schemes, we present an efficient method for locally adaptive image denoising. As expected, the smoothing parameter serves as an edge detector in this framework. Numerical examples illustrate the usefulness of our approach. We also present an application in confocal microscopy