Adaptive Ensemble of Classifiers with Regularization for Imbalanced Data Classification
Dynamic ensemble selection of classifiers is an effective approach to
classifying label-imbalanced data. However, this technique is prone to
overfitting, owing to its lack of regularization and its dependence on
local geometry. In this study,
focusing on binary imbalanced data classification, a novel dynamic ensemble
method, namely adaptive ensemble of classifiers with regularization (AER), is
proposed, to overcome the stated limitations. The method solves the overfitting
problem through implicit regularization. Specifically, it leverages the
properties of stochastic gradient descent to obtain the solution with the
minimum norm, thereby achieving regularization; furthermore, it interpolates
the ensemble weights by exploiting the global geometry of data to further
prevent overfitting. According to our theoretical proofs, the seemingly
complicated AER paradigm, in addition to its regularization capabilities, can
actually reduce the asymptotic time and memory complexities of several other
algorithms. We evaluate the proposed AER method on seven benchmark imbalanced
datasets from the UCI machine learning repository and one artificially
generated GMM-based dataset with five variations. The results show that the
proposed algorithm outperforms the major existing algorithms on multiple
metrics in most cases, and two hypothesis tests (McNemar's and Wilcoxon tests)
further confirm the statistical significance. In addition, the proposed method
has other desirable properties, such as particular advantages in dealing with
highly imbalanced data, and it pioneers research on regularization for
dynamic ensemble methods.
Comment: Major revision; Change of authors due to contribution
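
The abstract's remark that gradient descent can reach the minimum-norm solution can be illustrated with a small sketch. This is not the AER algorithm itself: it assumes an underdetermined least-squares problem, uses full-batch gradient descent rather than stochastic gradient descent so the run is deterministic, and the function name `gd_min_norm`, the step size, and the random data are all hypothetical choices made for illustration. When started from zero, the iterates stay in the row space of the data matrix, so among all interpolating solutions they converge to the one with the smallest Euclidean norm, which is the pseudoinverse solution.

```python
import numpy as np

def gd_min_norm(A, b, steps=5000):
    """Full-batch gradient descent on 0.5 * ||A w - b||^2, started at w = 0.

    Illustrative assumption: A has more columns than rows and full row rank,
    so many exact solutions exist.
    """
    # Step size 1/L, where L is the squared spectral norm of A.
    lr = 1.0 / np.linalg.norm(A, 2) ** 2
    w = np.zeros(A.shape[1])  # zero start keeps iterates in the row space of A
    for _ in range(steps):
        w -= lr * A.T @ (A @ w - b)
    return w

rng = np.random.default_rng(0)
A = rng.normal(size=(5, 20))   # 5 equations, 20 unknowns (hypothetical data)
b = rng.normal(size=5)

w_gd = gd_min_norm(A, b)
w_pinv = np.linalg.pinv(A) @ b  # minimum-norm solution via the pseudoinverse

# A different exact solution: add a null-space component to w_pinv.
v = rng.normal(size=20)
null_part = v - np.linalg.pinv(A) @ (A @ v)
w_other = w_pinv + null_part

print(np.linalg.norm(w_gd - w_pinv))                # close to zero
print(np.linalg.norm(w_gd), np.linalg.norm(w_other))  # first is smaller
```

No explicit penalty term appears in the objective; the preference for a small norm comes only from the optimizer and its starting point, which is what "implicit regularization" refers to in this setting.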