
ClusterOSS: a new undersampling method for imbalanced learning

Abstract

A dataset is said to be imbalanced when its classes are disproportionately represented in terms of the number of instances they contain. This problem is common in applications such as medical diagnosis of rare diseases, detection of fraudulent calls, and signature recognition. In this paper we propose ClusterOSS, an alternative method for imbalanced learning that balances the dataset using an undersampling strategy. We show that ClusterOSS outperforms OSS, the method on which it is based. Moreover, we show that the results can be further improved by combining ClusterOSS with random oversampling.

Funding: FAPESP, CAPES, CNP
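To make the notion of imbalance and the random oversampling step concrete, here is a minimal sketch in Python. It is not ClusterOSS or OSS, neither of which the abstract describes in enough detail to reproduce. It only illustrates plain random oversampling, under the assumption that minority instances are duplicated at random until every class matches the largest one. The function names `imbalance_ratio` and `random_oversample` and the toy data are hypothetical and chosen for illustration.

```python
import random
from collections import Counter


def imbalance_ratio(labels):
    """Return the size of the largest class divided by the size of the smallest."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen instances of the smaller classes until
    every class has as many instances as the largest class.

    Illustrative assumption: this is generic random oversampling,
    not the procedure evaluated in the paper.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_samples, out_labels = list(samples), list(labels)
    for cls, n in counts.items():
        # Original instances of this class, used as the pool to draw from.
        pool = [s for s, lab in zip(samples, labels) if lab == cls]
        for _ in range(target - n):
            out_samples.append(rng.choice(pool))
            out_labels.append(cls)
    return out_samples, out_labels


# Toy data: 8 majority instances (label 0) and 2 minority instances (label 1).
samples = [[float(i)] for i in range(10)]
labels = [0] * 8 + [1] * 2

balanced_samples, balanced_labels = random_oversample(samples, labels)
```

In this toy example the original labels have an imbalance ratio of 4, and after oversampling both classes contain 8 instances. An undersampling method works in the opposite direction, removing majority instances instead of duplicating minority ones, and the abstract's combined approach would apply oversampling to the output of the undersampling step.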
