Maximum Margin Multiclass Nearest Neighbors
We develop a general framework for margin-based multicategory classification
in metric spaces. The basic work-horse is a margin-regularized version of the
nearest-neighbor classifier. We prove generalization bounds that match the
state of the art in sample size and significantly improve the dependence on
the number of classes $k$. Our point of departure is a nearly Bayes-optimal
finite-sample risk bound independent of $k$. Although $k$-free, this bound is
unregularized and non-adaptive, which motivates our main result: Rademacher and
scale-sensitive margin bounds with a logarithmic dependence on $k$. As the best
previous risk estimates in this setting were of order $\sqrt{k}$, our bound is
exponentially sharper. From the algorithmic standpoint, in doubling metric
spaces our classifier may be trained on $n$ examples in $O(n^2 \log n)$ time and
evaluated on new points in $O(\log n)$ time.
Multi-category classifiers and sample width
In a recent paper, the authors introduced the notion of sample width for binary classifiers defined on the set of real numbers. It was shown that the performance of such a classifier could be quantified in terms of this sample width. This paper considers how to adapt the idea of sample width so that it can be applied in cases where the classifiers are multi-category and are defined on some arbitrary metric space.
Generalization Bounds in the Predict-then-Optimize Framework
The predict-then-optimize framework is fundamental in many practical
settings: predict the unknown parameters of an optimization problem, and then
solve the problem using the predicted values of the parameters. A natural loss
function in this environment is to consider the cost of the decisions induced
by the predicted parameters, in contrast to the prediction error of the
parameters. This loss function was recently introduced in Elmachtoub and Grigas
(2017) and referred to as the Smart Predict-then-Optimize (SPO) loss. In this
work, we seek to provide bounds on how well the performance of a prediction
model fit on training data generalizes out-of-sample, in the context of the SPO
loss. Since the SPO loss is non-convex and non-Lipschitz, standard results for
deriving generalization bounds do not apply.
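As a rough illustration of the loss described above, the following sketch computes an SPO-style loss for a linear objective over a polytope given by an explicit list of extreme points. The function name `spo_loss`, the tolerance `tol`, and the toy data are hypothetical, and the sketch assumes that ties among decisions optimal for the predicted cost are resolved in the worst case with respect to the true cost.

```python
from typing import Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def spo_loss(predicted_cost: Vector, true_cost: Vector,
             extreme_points: Sequence[Vector], tol: float = 1e-12) -> float:
    """Excess true cost of the decision induced by the predicted cost.

    Assumption: the feasible region is a polytope described by its
    extreme points, and ties are broken adversarially (worst case).
    """
    # Decisions that minimize the *predicted* objective.
    pred_values = [dot(predicted_cost, w) for w in extreme_points]
    best_pred = min(pred_values)
    candidates = [w for w, v in zip(extreme_points, pred_values)
                  if v <= best_pred + tol]

    # True cost actually incurred (worst case over tied decisions).
    incurred = max(dot(true_cost, w) for w in candidates)

    # Best achievable true cost with perfect information.
    optimal = min(dot(true_cost, w) for w in extreme_points)
    return incurred - optimal


# Toy problem (hypothetical data): choose exactly one of two items.
vertices = [(1.0, 0.0), (0.0, 1.0)]
true_cost = (1.0, 3.0)

good = spo_loss((2.0, 5.0), true_cost, vertices)  # inaccurate costs, right decision
bad = spo_loss((4.0, 2.0), true_cost, vertices)   # prediction flips the decision
tie = spo_loss((1.0, 1.0), true_cost, vertices)   # non-unique optimum, worst case
print(good, bad, tie)
```

In this toy case the first prediction is far from the true costs yet incurs no decision loss, which is the contrast with prediction error that the abstract draws. The tied prediction shows how an arbitrarily small change in the predicted cost can make the loss jump, consistent with the loss being non-Lipschitz.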
We first derive bounds based on the Natarajan dimension that, in the case of
a polyhedral feasible region, scale at most logarithmically in the number of
extreme points, but, in the case of a general convex feasible region, have
linear dependence on the decision dimension. By exploiting the structure of the
SPO loss function and a key property of the feasible region, which we denote as
the strength property, we can dramatically improve the dependence on the
decision and feature dimensions. Our approach and analysis rely on placing a
margin around problematic predictions that do not yield unique optimal
solutions, and then providing generalization bounds in the context of a
modified margin SPO loss function that is Lipschitz continuous. Finally, we
characterize the strength property and show that the modified SPO loss can be
computed efficiently for both strongly convex bodies and polytopes with an
explicit extreme point representation.
Comment: Preliminary version in NeurIPS 201