Active Nearest-Neighbor Learning in Metric Spaces
We propose a pool-based non-parametric active learning algorithm for general
metric spaces, called MArgin Regularized Metric Active Nearest Neighbor
(MARMANN), which outputs a nearest-neighbor classifier. We give prediction
error guarantees that depend on the noisy-margin properties of the input
sample, and are competitive with those obtained by previously proposed passive
learners. We prove that the label complexity of MARMANN is significantly lower
than that of any passive learner with similar error guarantees. MARMANN is
based on a generalized sample compression scheme, and a new label-efficient
active model-selection procedure.
Theoretical analysis of cross-validation for estimating the risk of the k-Nearest Neighbor classifier
The present work aims at deriving theoretical guarantees on the behavior of
some cross-validation procedures applied to the k-nearest neighbors (kNN)
rule in the context of binary classification. Here we focus on the
leave-p-out cross-validation (LpO) used to assess the performance of the
kNN classifier. Remarkably, this LpO estimator can be efficiently computed
in this context using closed-form formulas derived by
Celisse and Mary-Huard (2011). We describe a general strategy to derive moment and
exponential concentration inequalities for the LpO estimator applied to the
kNN classifier. Such results are obtained first by exploiting the connection
between the LpO estimator and U-statistics, and second by making intensive
use of the generalized Efron-Stein inequality applied to the LpO estimator.
Another important contribution is the derivation of new bounds on the
discrepancy between the LpO estimator and the classification error/risk of
the kNN classifier. The optimality of these bounds is discussed by means of
several lower bounds as well as simulation experiments.
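
For readers unfamiliar with the estimator discussed in this abstract, the following is a minimal brute-force sketch of leave-p-out cross-validation for a kNN classifier. It is not the closed-form computation the abstract refers to, and it is not taken from the paper: it enumerates every size-p held-out subset, which is only feasible for very small samples. The function names (`knn_predict`, `leave_p_out_error`), the Euclidean distance, and the rule that voting ties go to class 0 are all illustrative assumptions.

```python
from itertools import combinations
import math


def knn_predict(train, query, k):
    """Majority vote among the k nearest training points (binary labels 0/1).

    Assumption: Euclidean distance; a tied vote is resolved in favor of class 0.
    """
    nearest = sorted(train, key=lambda xy: math.dist(xy[0], query))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > len(nearest) else 0


def leave_p_out_error(data, k, p):
    """Average held-out misclassification rate over every size-p test subset.

    data is a list of (point, label) pairs, where point is a tuple of floats.
    Brute force: the number of splits is C(n, p), so keep n small.
    """
    n = len(data)
    total_errors = 0
    n_splits = 0
    for test_idx in combinations(range(n), p):
        held_out = set(test_idx)
        train = [data[i] for i in range(n) if i not in held_out]
        total_errors += sum(
            knn_predict(train, data[i][0], k) != data[i][1] for i in test_idx
        )
        n_splits += 1
    return total_errors / (n_splits * p)


# Two well-separated clusters on the line (toy data, made up for illustration).
toy = [((0.0,), 0), ((0.1,), 0), ((0.2,), 0),
       ((1.0,), 1), ((1.1,), 1), ((1.2,), 1)]
lpo_estimate = leave_p_out_error(toy, k=1, p=2)
```

With p = 1 this reduces to ordinary leave-one-out; larger p averages over more, smaller training sets, at a combinatorial cost that the brute-force loop makes explicit.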