
Maximum Margin Multiclass Nearest Neighbors

Author: Kontorovich Aryeh
Weiss Roi
Publication date: 01/01/2014

We develop a general framework for margin-based multicategory classification in metric spaces. The basic work-horse is a margin-regularized version of the nearest-neighbor classifier. We prove generalization bounds that match the state of the art in sample size $n$ and significantly improve the dependence on the number of classes $k$. Our point of departure is a nearly Bayes-optimal finite-sample risk bound independent of $k$. Although $k$-free, this bound is unregularized and non-adaptive, which motivates our main result: Rademacher and scale-sensitive margin bounds with a logarithmic dependence on $k$. As the best previous risk estimates in this setting were of order $\sqrt{k}$, our bound is exponentially sharper. From the algorithmic standpoint, in doubling metric spaces our classifier may be trained on $n$ examples in $O(n^2\log n)$ time and evaluated on new points in $O(\log n)$ time.
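To make the "exponentially sharper" remark concrete, the sketch below tabulates how a factor of order sqrt(k) compares with one of order log(k) as the number of classes k grows. It ignores constants and every other term in the bounds (sample size, margin, metric dimension), so it illustrates only the dependence on k; the function name `growth_table` and the chosen values of k are assumptions made for illustration, not taken from the paper.

```python
import math


def growth_table(class_counts):
    """Return (k, sqrt(k), log(k)) for each class count k.

    Illustration only: these are the k-dependent factors named in the
    abstract, with all constants and other terms dropped.
    """
    return [(k, math.sqrt(k), math.log(k)) for k in class_counts]


if __name__ == "__main__":
    for k, root, log in growth_table([10, 100, 10_000, 1_000_000]):
        print(f"k={k:>9,}  sqrt(k)={root:10.1f}  log(k)={log:6.2f}")
```

Because sqrt(k) = exp(0.5 * log(k)), replacing a sqrt(k) factor with a log(k) factor is an exponential improvement in this one dependence.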

arXiv.org e-Print Archive

CiteSeerX

Soft Methodology for Cost-and-error Sensitive Classification

Author: Jan Te-Kang
Lin Chi-Hung
Lin Hsuan-Tien
Wang Da-Wei
Publication date: 25/10/2017

Many real-world data mining applications need varying costs for different types of classification errors and thus call for cost-sensitive classification algorithms. Existing algorithms for cost-sensitive classification are successful in terms of minimizing the cost, but can result in a high error rate as the trade-off. The high error rate holds back the practical use of those algorithms. In this paper, we propose a novel cost-sensitive classification methodology that takes both the cost and the error rate into account. The methodology, called soft cost-sensitive classification, is established from a multicriteria optimization problem of the cost and the error rate, and can be viewed as regularizing cost-sensitive classification with the error rate. The simple methodology allows immediate improvements of existing cost-sensitive classification algorithms. Experiments on benchmark and real-world data sets show that our proposed methodology indeed achieves lower test error rates and similar (sometimes lower) test costs than existing cost-sensitive classification algorithms. We also demonstrate that the methodology can be extended to consider the weighted error rate instead of the original error rate. This extension is useful for tackling unbalanced classification problems. Comment: A shorter version appeared in KDD '1
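One way to picture "regularizing cost-sensitive classification with the error rate" is to blend the task's cost matrix with the 0/1 error matrix and then predict the class of minimum expected blended cost. The sketch below does that under stated assumptions: the convex-combination form, the weight `alpha`, the rescaling of the 0/1 matrix by the largest cost, and both function names are hypothetical choices for illustration, not details given in the abstract.

```python
def soft_cost_matrix(cost, alpha):
    """Blend a cost matrix with the 0/1 error matrix.

    cost[y][p] is the cost of predicting class p when the truth is y.
    alpha = 0 keeps the original costs; alpha = 1 uses only the
    (rescaled) 0/1 error. The blending form is an assumption.
    """
    k = len(cost)
    # Rescale the 0/1 matrix so both terms have comparable magnitude (assumption).
    scale = max(max(row) for row in cost)
    return [
        [
            (1 - alpha) * cost[y][p] + alpha * scale * (0 if y == p else 1)
            for p in range(k)
        ]
        for y in range(k)
    ]


def min_expected_cost_prediction(probs, cost):
    """Pick the class whose expected cost under probs is smallest."""
    k = len(cost)
    expected = [sum(probs[y] * cost[y][p] for y in range(k)) for p in range(k)]
    return min(range(k), key=expected.__getitem__)


if __name__ == "__main__":
    cost = [[0, 1], [10, 0]]  # missing class 1 is ten times worse
    probs = [0.7, 0.3]        # hypothetical class-probability estimate
    for alpha in (0.0, 0.5, 1.0):
        blended = soft_cost_matrix(cost, alpha)
        print(alpha, min_expected_cost_prediction(probs, blended))
```

With `alpha = 0` the prediction follows the costs alone, and with `alpha = 1` it reduces to picking the most probable class; intermediate values trade the two criteria off.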

arXiv.org e-Print Archive

CiteSeerX

Reduction Scheme for Empirical Risk Minimization and Its Applications to Multiple-Instance Learning

Author: Suehiro Daiki
Takimoto Eiji
Publication date: 05/06/2020

In this paper, we propose a simple reduction scheme for empirical risk minimization (ERM) that preserves empirical Rademacher complexity. The reduction allows us to transfer known generalization bounds and algorithms for ERM to the target learning problems in a straightforward way. In particular, we apply our reduction scheme to the multiple-instance learning (MIL) problem, for which generalization bounds and ERM algorithms have been extensively studied. We show that various learning problems can be reduced to MIL. Examples include top-1 ranking learning, multi-class learning, and labeled and complementarily labeled learning. It turns out that some of the generalization bounds derived are, despite the simplicity of their derivation, incomparable or competitive with the existing bounds. Moreover, in some settings of labeled and complementarily labeled learning, the algorithm derived is the first polynomial-time algorithm.

arXiv.org e-Print Archive