On the proliferation of support vectors in high dimensions
The support vector machine (SVM) is a well-established classification method
whose name refers to the particular training examples, called support vectors,
that determine the maximum margin separating hyperplane. The SVM classifier is
known to enjoy good generalization properties when the number of support
vectors is small compared to the number of training examples. However, recent
research has shown that in sufficiently high-dimensional linear classification
problems, the SVM can generalize well despite a proliferation of support
vectors, in which all training examples are support vectors. In this paper, we
identify new deterministic equivalences for this phenomenon of support vector
proliferation, and use them to (1) substantially broaden the conditions under
which the phenomenon occurs in high-dimensional settings, and (2) prove a
nearly matching converse result.
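
The phenomenon described above can be checked on a small example. The sketch below is not taken from the paper. It assumes a hard-margin linear SVM without a bias term and a training set whose Gram matrix K = X Xᵀ is invertible. Under those assumptions, a standard consequence of the dual optimality conditions is that every training example is a support vector exactly when y_i (K⁻¹ y)_i > 0 for all i. The function names `gram_matrix`, `solve`, `all_support_vectors`, and `random_instance` are hypothetical helpers introduced for illustration.

```python
import random


def gram_matrix(X):
    """Return K with K[i][j] = <x_i, x_j> for the rows of X."""
    return [[sum(a * b for a, b in zip(xi, xj)) for xj in X] for xi in X]


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("Gram matrix is singular (need n <= d, points in general position)")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def all_support_vectors(X, y):
    """Assumed criterion: y_i * (K^{-1} y)_i > 0 for every training example."""
    beta = solve(gram_matrix(X), [float(v) for v in y])
    return all(yi * bi > 0 for yi, bi in zip(y, beta))


def random_instance(n, d, seed=0):
    """Hypothetical test data: n standard Gaussian points in d dimensions, random +/-1 labels."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
    y = [rng.choice([-1, 1]) for _ in range(n)]
    return X, y


if __name__ == "__main__":
    for d in (12, 50, 500):
        X, y = random_instance(n=10, d=d)
        print(d, all_support_vectors(X, y))
```

With the number of examples fixed, raising the dimension d makes the Gram matrix closer to a multiple of the identity, so the check is expected to succeed more often. The two-point set x₁ = (1, 0), x₂ = (3, 1) with both labels +1 is a low-dimensional case where it fails, since only x₁ lies on the margin.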
Active Nearest-Neighbor Learning in Metric Spaces
We propose a pool-based non-parametric active learning algorithm for general
metric spaces, called MArgin Regularized Metric Active Nearest Neighbor
(MARMANN), which outputs a nearest-neighbor classifier. We give prediction
error guarantees that depend on the noisy-margin properties of the input
sample, and are competitive with those obtained by previously proposed passive
learners. We prove that the label complexity of MARMANN is significantly lower
than that of any passive learner with similar error guarantees. MARMANN is
based on a generalized sample compression scheme, and a new label-efficient
active model-selection procedure.
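
To make the output of such a learner concrete, here is a minimal sketch of a nearest-neighbor classifier over an arbitrary metric. It is not MARMANN: it shows only the kind of predictor the abstract says the algorithm outputs, with no active label querying, compression, or model selection. The names `nearest_neighbor_classifier` and `edit_distance` and the toy labeled prototypes are assumptions made for illustration. Edit distance on strings is used as the metric to emphasize that no vector-space structure is required.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (a metric on strings)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]


def nearest_neighbor_classifier(prototypes, metric):
    """Build a 1-NN predictor from (point, label) pairs and a metric.

    `prototypes` stands in for a small labeled subset of the sample,
    such as the compressed set a compression-based learner might keep.
    """
    def predict(x):
        _, label = min(prototypes, key=lambda p: metric(x, p[0]))
        return label
    return predict


# Toy usage: two labeled prototypes in the metric space of strings.
predict = nearest_neighbor_classifier(
    [("cat", "animal"), ("car", "vehicle")], edit_distance
)
print(predict("bat"), predict("cars"))
```

Only the distance function touches the points themselves, so the same predictor works for any metric space once `metric` is replaced.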