Handling Imbalanced Classification Problems With Support Vector Machines via Evolutionary Bilevel Optimization
Support vector machines (SVMs) are popular learning algorithms for
binary classification problems. They traditionally assume equal
misclassification costs for each class; however, real-world problems may have
an uneven class distribution. This article introduces EBCS-SVM: evolutionary
bilevel cost-sensitive SVMs. EBCS-SVM handles imbalanced classification
problems by simultaneously learning the support vectors and optimizing the SVM
hyperparameters, which comprise the kernel parameter and misclassification
costs. The resulting optimization problem is a bilevel problem, where the lower
level determines the support vectors and the upper level the hyperparameters.
This optimization problem is solved using an evolutionary algorithm (EA) at the
upper level and sequential minimal optimization (SMO) at the lower level. These
two methods work in a nested fashion, that is, the optimal support vectors help
guide the search of the hyperparameters, and the lower level is initialized
based on previous successful solutions. The proposed method is assessed on
70 imbalanced classification datasets and compared with several
state-of-the-art methods. The experimental results, supported by a Bayesian
test, provide evidence of the effectiveness of EBCS-SVM when working with
highly imbalanced datasets.
Comment: Copyright 2022 IEEE.
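The nested structure described in this abstract can be pictured with a small sketch. It is not the authors' implementation: the lower level here is a cost-weighted linear SVM trained by subgradient descent (standing in for a kernel SVM solved with SMO), the upper level is a very simple mutate-and-select loop over the two class costs only (no kernel parameter), and G-mean on the training data is an assumed fitness. The names `train_svm`, `g_mean`, and `evolve` are hypothetical.

```python
import math
import random


def train_svm(data, c_pos, c_neg, w0=None, epochs=20, lr=0.01, lam=0.01):
    """Lower level: cost-weighted linear SVM by subgradient descent.

    Assumption of this sketch: it replaces the kernel SVM + SMO solver.
    `w0` warm-starts the weights from an earlier successful solution.
    """
    dim = len(data[0][0])
    w = list(w0) if w0 is not None else [0.0] * (dim + 1)  # last entry = bias
    for _ in range(epochs):
        for x, y in data:
            score = sum(wi * xi for wi, xi in zip(w, x)) + w[-1]
            cost = c_pos if y > 0 else c_neg  # class-dependent penalty
            violated = y * score < 1
            for i in range(dim):
                step = -lam * w[i]
                if violated:
                    step += cost * y * x[i]
                w[i] += lr * step
            if violated:
                w[-1] += lr * cost * y
    return w


def g_mean(w, data):
    """Geometric mean of true-positive and true-negative rates."""
    tp = fn = tn = fp = 0
    for x, y in data:
        pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + w[-1] >= 0 else -1
        if y > 0:
            tp += pred > 0
            fn += pred < 0
        else:
            tn += pred < 0
            fp += pred > 0
    tpr = tp / (tp + fn) if tp + fn else 0.0
    tnr = tn / (tn + fp) if tn + fp else 0.0
    return math.sqrt(tpr * tnr)


def evolve(data, generations=10, pop_size=6, seed=0):
    """Upper level: evolutionary search over (c_pos, c_neg)."""
    rng = random.Random(seed)
    pop = [(rng.uniform(0.1, 10), rng.uniform(0.1, 10)) for _ in range(pop_size)]
    best = None  # (fitness, costs, weights)
    for _ in range(generations):
        scored = []
        for c_pos, c_neg in pop:
            w0 = best[2] if best else None  # warm start from best so far
            w = train_svm(data, c_pos, c_neg, w0=w0)
            scored.append((g_mean(w, data), (c_pos, c_neg), w))
        scored.sort(key=lambda s: s[0], reverse=True)
        if best is None or scored[0][0] > best[0]:
            best = scored[0]
        parents = [s[1] for s in scored[: pop_size // 2]]
        children = [
            tuple(max(0.01, c * math.exp(rng.gauss(0, 0.3))) for c in p)
            for p in parents
        ]
        pop = parents + children
    return best
```

Calling `evolve(data)` on a list of `(features, label)` pairs with labels in `{-1, +1}` returns the best fitness found, the class costs that produced it, and the corresponding weights. A held-out validation split would be a more honest fitness signal than the training data used here.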
Soft Methodology for Cost-and-error Sensitive Classification
Many real-world data mining applications require different costs for different
types of classification errors and thus call for cost-sensitive classification
algorithms. Existing algorithms for cost-sensitive classification are
successful in terms of minimizing the cost, but can result in a high error rate
as the trade-off. The high error rate holds back the practical use of those
algorithms. In this paper, we propose a novel cost-sensitive classification
methodology that takes both the cost and the error rate into account. The
methodology, called soft cost-sensitive classification, is established from a
multicriteria optimization problem of the cost and the error rate, and can be
viewed as regularizing cost-sensitive classification with the error rate. The
simple methodology allows immediate improvements of existing cost-sensitive
classification algorithms. Experiments on benchmark and real-world data
sets show that our proposed methodology indeed achieves lower test error rates
and similar (sometimes lower) test costs than existing cost-sensitive
classification algorithms. We also demonstrate that the methodology can be
extended for considering the weighted error rate instead of the original error
rate. This extension is useful for tackling unbalanced classification problems.
Comment: A shorter version appeared in KDD '1
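One way to picture "regularizing cost-sensitive classification with the error rate" is to blend the task's cost matrix with a plain 0/1 error matrix and then predict the class with the lowest expected blended cost. The sketch below assumes a linear blend; the paper's exact formulation may differ. The names `soft_cost_matrix` and `predict_min_cost` and the parameters `alpha` and `scale` are hypothetical.

```python
def soft_cost_matrix(cost, alpha, scale=None):
    """Blend a cost matrix with the 0/1 error matrix.

    cost[y][k] is the cost of predicting class k when the truth is y.
    alpha = 0 keeps the original costs; alpha = 1 keeps only the error rate.
    `scale` puts the 0/1 matrix on the same range as the costs
    (assumed here to default to the largest entry of `cost`).
    """
    n = len(cost)
    if scale is None:
        scale = max(max(row) for row in cost)
    return [
        [
            (1 - alpha) * cost[y][k] + alpha * scale * (0 if y == k else 1)
            for k in range(n)
        ]
        for y in range(n)
    ]


def predict_min_cost(probs, cost):
    """Return the class with the lowest expected cost under `probs`."""
    n = len(cost)
    expected = [sum(probs[y] * cost[y][k] for y in range(n)) for k in range(n)]
    return min(range(n), key=lambda k: expected[k])
```

For example, with `cost = [[0, 1], [10, 0]]` and class probabilities `[0.8, 0.2]`, the pure cost-sensitive rule (`alpha=0`) picks class 1 to avoid the expensive miss, while the pure error-rate rule (`alpha=1`) picks the more probable class 0; intermediate values of `alpha` trade one criterion against the other.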