
    Active Nearest-Neighbor Learning in Metric Spaces

    We propose a pool-based non-parametric active learning algorithm for general metric spaces, called MArgin Regularized Metric Active Nearest Neighbor (MARMANN), which outputs a nearest-neighbor classifier. We give prediction error guarantees that depend on the noisy-margin properties of the input sample, and are competitive with those obtained by previously proposed passive learners. We prove that the label complexity of MARMANN is significantly lower than that of any passive learner with similar error guarantees. MARMANN is based on a generalized sample compression scheme, and a new label-efficient active model-selection procedure.
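To make concrete the kind of object such a learner outputs, here is a minimal sketch of a 1-nearest-neighbor classifier over a small labeled set in an arbitrary metric space. It is not MARMANN: the prototype set, the `hamming` metric, and the function names are assumptions chosen for illustration, and the active label-querying and model-selection steps described in the abstract are not shown.

```python
def hamming(a, b):
    """Hamming distance between equal-length strings (a valid metric)."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(ca != cb for ca, cb in zip(a, b))


def nearest_neighbor_classifier(prototypes, metric):
    """Return a function that labels a point by its closest prototype.

    prototypes: list of (point, label) pairs, assumed non-empty.
    metric: any function d(x, y) satisfying the metric axioms.
    """
    def classify(x):
        _, label = min(prototypes, key=lambda pl: metric(pl[0], x))
        return label
    return classify


# Hypothetical compressed sample: two labeled prototypes.
prototypes = [("aaaa", "A"), ("bbbb", "B")]
classify = nearest_neighbor_classifier(prototypes, hamming)
```

Because the classifier touches the data only through `metric`, the same code works for strings, graphs, or any other space where a distance function is available.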

    Theoretical analysis of cross-validation for estimating the risk of the k-Nearest Neighbor classifier

    The present work aims at deriving theoretical guarantees on the behavior of some cross-validation procedures applied to the k-nearest neighbors (kNN) rule in the context of binary classification. Here we focus on the leave-p-out cross-validation (LpO) used to assess the performance of the kNN classifier. Remarkably, this LpO estimator can be efficiently computed in this context using closed-form formulas derived by \cite{CelisseMaryHuard11}. We describe a general strategy to derive moment and exponential concentration inequalities for the LpO estimator applied to the kNN classifier. Such results are obtained first by exploiting the connection between the LpO estimator and U-statistics, and second by making intensive use of the generalized Efron-Stein inequality applied to the L1O estimator. Another important contribution is the derivation of new quantifications of the discrepancy between the LpO estimator and the classification error/risk of the kNN classifier. The optimality of these bounds is discussed by means of several lower bounds as well as simulation experiments.
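For readers unfamiliar with the estimator, the following is a brute-force sketch of leave-p-out cross-validation for a kNN classifier. It enumerates every held-out subset of size p, so it is only practical for tiny samples. It does not implement the closed-form formulas mentioned in the abstract, and the function names and the Euclidean distance are assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def knn_predict(train, query, k):
    """Majority vote among the k training points closest to `query`.

    train: list of (point, label) pairs, each point a tuple of floats.
    Use an odd k with binary labels to avoid voting ties.
    """
    neighbors = sorted(train, key=lambda pl: math.dist(pl[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


def leave_p_out_error(data, k, p):
    """Average misclassification rate over all held-out subsets of size p."""
    n = len(data)
    if not 1 <= p <= n - k:
        raise ValueError("need 1 <= p <= n - k")
    total, n_splits = 0.0, 0
    for held_out in combinations(range(n), p):
        held = set(held_out)
        train = [data[i] for i in range(n) if i not in held]
        errors = sum(
            knn_predict(train, data[i][0], k) != data[i][1] for i in held_out
        )
        total += errors / p
        n_splits += 1
    return total / n_splits


# Toy binary sample: two well-separated clusters on the real line.
data = [((0.0,), 0), ((0.1,), 0), ((0.2,), 0),
        ((1.0,), 1), ((1.1,), 1), ((1.2,), 1)]
l1o = leave_p_out_error(data, k=1, p=1)  # leave-one-out is the p = 1 case
l2o = leave_p_out_error(data, k=1, p=2)
```

The number of splits is the binomial coefficient C(n, p), which is why an efficiently computable form of the estimator matters in practice.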