An adaptive nearest neighbor rule for classification
We introduce a variant of the $k$-nearest neighbor classifier in which $k$ is
chosen adaptively for each query, rather than supplied as a parameter. The
choice of $k$ depends on properties of each neighborhood, and therefore may
significantly vary between different points. (For example, the algorithm will
use larger $k$ for predicting the labels of points in noisy regions.)
We provide theory and experiments that demonstrate that the algorithm
performs comparably to, and sometimes better than, $k$-NN with an optimal
choice of $k$. In particular, we derive bounds on the convergence rates of our
classifier that depend on a local quantity we call the `advantage', which is
significantly weaker than the Lipschitz conditions used in previous convergence
rate proofs. These generalization bounds hinge on a variant of the seminal
Uniform Convergence Theorem due to Vapnik and Chervonenkis; this variant
concerns conditional probabilities and may be of independent interest.
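
To make the adaptive idea concrete, here is a minimal sketch in which $k$ grows per query until the majority label among the $k$ nearest neighbors clears 1/2 by a slack shrinking like $1/\sqrt{k}$, so noisier neighborhoods end up with larger $k$. The stopping rule and the `margin` knob are illustrative assumptions, not the paper's exact selection rule or its `advantage'-based analysis.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def adaptive_knn_predict(X_train, y_train, X_query, k_max=50, margin=1.0):
    """Illustrative adaptive k-NN for binary labels in {0, 1}.

    For each query, k grows until one label's share of the k nearest
    neighbors clears 1/2 by a slack shrinking like 1/sqrt(k), so noisy
    regions end up with larger k. A sketch of the adaptive idea only;
    `margin` is a hypothetical knob, not a parameter from the paper.
    """
    nn = NearestNeighbors(n_neighbors=k_max).fit(X_train)
    _, idx = nn.kneighbors(X_query)          # (n_queries, k_max) neighbor indices
    preds = np.empty(len(X_query), dtype=int)
    for i, neighbors in enumerate(idx):
        labels = y_train[neighbors]
        pred = None
        for k in range(1, k_max + 1):
            frac_ones = labels[:k].mean()    # majority share among k nearest
            if abs(frac_ones - 0.5) > margin / np.sqrt(k):
                pred = int(frac_ones > 0.5)  # confident majority found: stop
                break
        if pred is None:                     # no confident k: fall back to k_max vote
            pred = int(labels.mean() > 0.5)
        preds[i] = pred
    return preds
```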
Theoretical analysis of cross-validation for estimating the risk of the k-Nearest Neighbor classifier
The present work aims at deriving theoretical guarantees on the behavior of
some cross-validation procedures applied to the $k$-nearest neighbors ($k$NN)
rule in the context of binary classification. Here we focus on the
leave-$p$-out cross-validation (L$p$O) used to assess the performance of the
$k$NN classifier. Remarkably, this L$p$O estimator can be efficiently computed
in this context using closed-form formulas derived by
\cite{CelisseMaryHuard11}. We describe a general strategy to derive moment and
exponential concentration inequalities for the L$p$O estimator applied to the
$k$NN classifier. Such results are obtained first by exploiting the connection
between the L$p$O estimator and U-statistics, and second by making intensive
use of the generalized Efron-Stein inequality applied to the L$p$O estimator.
Another important contribution is the derivation of new quantifications of the
discrepancy between the L$p$O estimator and the classification error/risk of
the $k$NN classifier. The optimality of these bounds is discussed by means of
several lower bounds as well as simulation experiments.
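
As a point of reference, the sketch below computes the $p = 1$ special case (leave-one-out) of the L$p$O estimator for a binary $k$NN classifier, using the standard trick of querying $k + 1$ neighbors and discarding the query point itself (assuming distinct training points). It does not implement the closed-form general-$p$ formulas of \cite{CelisseMaryHuard11}.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def loo_knn_error(X, y, k):
    """Leave-one-out (the p = 1 case of leave-p-out) error of the kNN rule.

    Each point is classified by majority vote over its k nearest
    neighbors *excluding itself*: querying k + 1 neighbors and dropping
    the first hit (the point itself, assuming distinct points) gives
    every held-out prediction in one pass. Binary labels in {0, 1}.
    """
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    _, idx = nn.kneighbors(X)                # first column is the point itself
    votes = y[idx[:, 1:]]                    # labels of the k true neighbors
    preds = (votes.mean(axis=1) > 0.5).astype(int)   # majority vote
    return np.mean(preds != y)               # LOO estimate of the risk
```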
Active Nearest-Neighbor Learning in Metric Spaces
We propose a pool-based non-parametric active learning algorithm for general
metric spaces, called MArgin Regularized Metric Active Nearest Neighbor
(MARMANN), which outputs a nearest-neighbor classifier. We give prediction
error guarantees that depend on the noisy-margin properties of the input
sample, and are competitive with those obtained by previously proposed passive
learners. We prove that the label complexity of MARMANN is significantly lower
than that of any passive learner with similar error guarantees. MARMANN is
based on a generalized sample compression scheme, and a new label-efficient
active model-selection procedure.
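
For intuition about pool-based selection with a nearest-neighbor learner, here is a generic uncertainty-sampling sketch that queries the pool point most nearly equidistant from its closest labeled neighbors of each class. The `oracle` callback and the margin heuristic are assumptions for illustration; this is not MARMANN's compression-based scheme or its model-selection procedure.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import pairwise_distances

def active_nn_learner(X_pool, oracle, budget, seed=0):
    """Generic pool-based active learning with a 1-NN classifier.

    Repeatedly queries the pool point lying closest to the current
    1-NN decision boundary, approximated by the gap between its
    distance to the nearest labeled point of each class. `oracle(i)`
    returns the {0, 1} label of pool point i; `budget` (>= 2) is the
    label budget. An uncertainty-sampling sketch, not MARMANN.
    """
    rng = np.random.default_rng(seed)
    n = len(X_pool)
    labeled = list(rng.choice(n, size=2, replace=False))
    y = {i: oracle(i) for i in labeled}
    D = pairwise_distances(X_pool)            # all pairwise distances, once
    while len(labeled) < budget:
        yL = np.array([y[i] for i in labeled])
        if len(set(yL.tolist())) < 2:         # need both classes for a margin
            i = int(rng.choice([j for j in range(n) if j not in y]))
        else:
            pos = [i for i in labeled if y[i] == 1]
            neg = [i for i in labeled if y[i] == 0]
            gap = np.abs(D[:, pos].min(axis=1) - D[:, neg].min(axis=1))
            gap[labeled] = np.inf             # never re-query a labeled point
            i = int(np.argmin(gap))           # most ambiguous remaining point
        y[i] = oracle(i)
        labeled.append(i)
    return KNeighborsClassifier(n_neighbors=1).fit(
        X_pool[labeled], [y[i] for i in labeled])
```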