An adaptive nearest neighbor rule for classification
We introduce a variant of the k-nearest neighbor classifier in which k is
chosen adaptively for each query, rather than supplied as a parameter. The
choice of k depends on properties of each neighborhood, and therefore may
significantly vary between different points. (For example, the algorithm will
use larger k for predicting the labels of points in noisy regions.)
We provide theory and experiments that demonstrate that the algorithm
performs comparably to, and sometimes better than, k-NN with an optimal
choice of k. In particular, we derive bounds on the convergence rates of our
classifier that depend on a local quantity we call the `advantage' which is
significantly weaker than the Lipschitz conditions used in previous convergence
rate proofs. These generalization bounds hinge on a variant of the seminal
Uniform Convergence Theorem due to Vapnik and Chervonenkis; this variant
concerns conditional probabilities and may be of independent interest.
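The abstract's idea of growing k until the neighborhood gives a clear signal can be illustrated with a toy stopping rule. This is not the paper's exact advantage-based rule; it is a minimal sketch in which k grows until the majority-vote margin exceeds the roughly sqrt(k)-sized fluctuation expected from label noise (the `bias_threshold` parameter is an assumption of this sketch).

```python
import numpy as np

def adaptive_knn_predict(X_train, y_train, x, k_max=50, bias_threshold=2.0):
    """Toy adaptive k-NN for binary labels in {0, 1}: grow k until the
    neighborhood shows a label bias clearly larger than chance, then
    return the majority vote. Noisy regions trigger the rule later, so
    they effectively get a larger k."""
    dists = np.linalg.norm(X_train - x, axis=1)
    order = np.argsort(dists)
    k_max = min(k_max, len(order))
    for k in range(1, k_max + 1):
        votes = y_train[order[:k]]
        margin = abs(votes.sum() - k / 2.0)   # distance from a 50/50 split
        if margin > bias_threshold * np.sqrt(k) / 2.0:
            return int(votes.sum() > k / 2.0)
    votes = y_train[order[:k_max]]
    return int(votes.sum() > k_max / 2.0)
```

In clean regions the margin grows linearly in k while the noise term grows only like sqrt(k), so the loop stops early; in noisy regions it runs longer, mimicking the per-query choice of k described above.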
Classification with the nearest neighbor rule in general finite dimensional spaces: necessary and sufficient conditions
Given an n-sample of random vectors (Xi, Yi), 1 ≤ i ≤ n, whose
joint law is unknown, the long-standing problem of supervised classification
aims to \textit{optimally} predict the label Y of a given new observation
X. In this context, the k-nearest neighbor rule is a popular, flexible and
intuitive method in non-parametric situations.
Even if this algorithm is commonly used in the machine learning and
statistics communities, less is known about its prediction ability in general
finite dimensional spaces, especially when the support of the density of the
observations is R^d. This paper is devoted to the study of the
statistical properties of the k-nearest neighbor rule in various situations. In
particular, attention is paid to the marginal law of X, as well as the
smoothness and margin properties of the \textit{regression function} η(X) = E[Y|X]. We identify two necessary and sufficient conditions to
obtain uniform consistency rates of classification and to derive sharp
estimates in the case of the k-nearest neighbor rule. Some numerical experiments
are proposed at the end of the paper to help illustrate the discussion.
Comment: 53 pages, 3 figures
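For reference, the k-nearest neighbor rule the abstract analyzes is simply a majority vote among the k closest training points. A minimal sketch, assuming Euclidean distance:

```python
import numpy as np

def knn_classify(X_train, y_train, x, k):
    """k-nearest neighbor rule: majority vote among the k training
    points closest to the query x in Euclidean distance."""
    dists = np.linalg.norm(X_train - x, axis=1)
    nearest = np.argsort(dists)[:k]
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]
```

The paper's analysis concerns how the risk of this estimator behaves as n and k grow, under conditions on the marginal law of X and on the regression function η.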
Theoretical analysis of cross-validation for estimating the risk of the k-Nearest Neighbor classifier
The present work aims at deriving theoretical guarantees on the behavior of
some cross-validation procedures applied to the k-nearest neighbors (kNN)
rule in the context of binary classification. Here we focus on the
leave-p-out cross-validation (LpO) used to assess the performance of the
kNN classifier. Remarkably this LpO estimator can be efficiently computed
in this context using closed-form formulas derived by
\cite{CelisseMaryHuard11}. We describe a general strategy to derive moment and
exponential concentration inequalities for the LpO estimator applied to the
kNN classifier. Such results are obtained first by exploiting the connection
between the LpO estimator and U-statistics, and second by making an intensive
use of the generalized Efron-Stein inequality applied to the LpO estimator.
One other important contribution is made by deriving new quantifications of the
discrepancy between the LpO estimator and the classification error/risk of
the kNN classifier. The optimality of these bounds is discussed by means of
several lower bounds as well as simulation experiments.
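To make the object of study concrete, here is a brute-force leave-one-out (the p = 1 case of LpO) risk estimate for kNN. The cited closed-form formulas avoid this explicit refitting; this sketch just shows what the estimator computes.

```python
import numpy as np

def loo_knn_risk(X, y, k):
    """Brute-force leave-one-out estimate of the kNN classification
    risk: hold out each point in turn, classify it with the remaining
    n - 1 points, and return the fraction of misclassifications."""
    n = len(X)
    errors = 0
    for i in range(n):
        mask = np.arange(n) != i
        dists = np.linalg.norm(X[mask] - X[i], axis=1)
        nearest = np.argsort(dists)[:k]
        votes = y[mask][nearest]
        pred = np.bincount(votes).argmax()
        errors += int(pred != y[i])
    return errors / n
```

The concentration inequalities in the paper quantify how tightly such an estimate clusters around the true risk of the kNN classifier.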
Nonparametric Estimation of the Bayes Error
This thesis is concerned with the performance of nonparametric classifiers and their application to the estimation of the Bayes error. Although the behavior of these classifiers as the number of preclassified design samples becomes infinite is well understood, very little is known regarding their finite sample error performance. Here, we examine the performance of Parzen and k-nearest neighbor (k-NN) classifiers, relating the expected error rates to the size of the design set and the various design parameters (kernel size and shape, value of k, distance metric for nearest neighbor calculation, etc.). These results lead to several significant improvements in the design procedures for nonparametric classifiers, as well as improved estimates of the Bayes error rate.
Our results show that increasing the sample size is in many cases not an effective practical means of improving the classifier performance. Rather, careful attention must be paid to the decision threshold, selection of the kernel size and shape (for Parzen classifiers), and selection of k and the distance metric (for k-NN classifiers). Guidelines are developed toward proper selection of each of these parameters.
The use of nonparametric error rates for Bayes error estimation is also considered, and techniques are given which reduce or compensate for the biases of the nonparametric error rates. A bootstrap technique is also developed which allows the designer to estimate the standard deviation of a nonparametric estimate of the Bayes error.
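The Parzen classifier discussed in the abstract compares kernel density estimates of the class-conditional densities at the query point. A minimal sketch with a Gaussian kernel and equal priors (the bandwidth `h` is exactly the "kernel size" parameter the thesis says matters more than raw sample size):

```python
import numpy as np

def parzen_classify(X_train, y_train, x, h):
    """Parzen-window classifier with a Gaussian kernel: estimate each
    class-conditional density at x and return the class with the
    larger estimate (equal class priors assumed)."""
    classes = np.unique(y_train)
    scores = []
    for c in classes:
        Xc = X_train[y_train == c]
        d2 = np.sum((Xc - x) ** 2, axis=1)
        # mean of Gaussian kernel values; shared normalizing constants
        # cancel when comparing classes, so they are omitted
        scores.append(np.mean(np.exp(-d2 / (2.0 * h ** 2))))
    return classes[int(np.argmax(scores))]
```

The finite-sample error of this rule depends strongly on h, which is the sense in which tuning the kernel size can matter more than adding design samples.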
Least Ambiguous Set-Valued Classifiers with Bounded Error Levels
In most classification tasks there are observations that are ambiguous and
therefore difficult to correctly label. Set-valued classifiers output sets of
plausible labels rather than a single label, thereby giving a more appropriate
and informative treatment to the labeling of ambiguous instances. We introduce
a framework for multiclass set-valued classification, where the classifiers
guarantee user-defined levels of coverage or confidence (the probability that
the true label is contained in the set) while minimizing the ambiguity (the
expected size of the output). We first derive oracle classifiers assuming the
true distribution to be known. We show that the oracle classifiers are obtained
from level sets of the functions that define the conditional probability of
each class. Then we develop estimators with good asymptotic and finite sample
properties. The proposed estimators build on existing single-label classifiers.
The optimal classifier can sometimes output the empty set, but we provide two
solutions to fix this issue that are suitable for various practical needs.
Comment: Final version to be published in the Journal of the American
Statistical Association at
https://www.tandfonline.com/doi/abs/10.1080/01621459.2017.1395341?journalCode=uasa2
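The level-set construction in the abstract is easy to sketch: given estimated conditional class probabilities, output every class whose probability clears a threshold, and calibrate the threshold on held-out data to hit the desired coverage. This is a simplified illustration, not the paper's estimator; the quantile-based calibration below is an assumption of this sketch.

```python
import numpy as np

def set_valued_predict(proba, t):
    """Level-set rule: for each row of estimated conditional class
    probabilities, output every class whose probability is at least t.
    Lowering t raises coverage but also ambiguity (expected set size)."""
    return [np.flatnonzero(p >= t).tolist() for p in proba]

def calibrate_threshold(proba_cal, y_cal, alpha):
    """Pick t so that roughly a 1 - alpha fraction of calibration
    points have their true class included: t is the alpha-quantile of
    the true classes' estimated probabilities."""
    true_scores = proba_cal[np.arange(len(y_cal)), y_cal]
    return np.quantile(true_scores, alpha)
```

Note the trade-off the paper formalizes: a confident query yields a singleton set, while an ambiguous one (several probabilities above t) yields a larger set, and with a very high t the set can be empty, which is the degenerate case the paper proposes fixes for.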