
    Fuzzy rough and evolutionary approaches to instance selection


    EPRENNID: An evolutionary prototype reduction based ensemble for nearest neighbor classification of imbalanced data

    Classification problems with an imbalanced class distribution have received increasing attention within the machine learning community over the last decade. They arise in a growing number of real-world situations and pose a challenge to standard machine learning techniques. We propose a new hybrid method specifically tailored to handle class imbalance, called EPRENNID. It performs an evolutionary prototype reduction focused on providing diverse solutions, which prevents the method from overfitting the training set. It also allows us to explicitly reduce the underrepresented class, which the most common preprocessing solutions for class imbalance usually protect. The preprocessing step yields multiple prototype sets that are later used in an ensemble, which applies a weighted voting scheme with the nearest neighbor classifier. In the experimental study, we show that the proposed prototype reduction method outperforms state-of-the-art preprocessing techniques and that EPRENNID significantly outperforms previous proposals.
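    As a rough illustration of the ensemble stage this abstract describes, the sketch below combines 1-NN classifiers built on several prototype sets through weighted voting. The function name, the weighting interface and the use of scikit-learn are illustrative assumptions, not EPRENNID's actual implementation.

```python
# Hypothetical sketch: weighted-voting ensemble of 1-NN classifiers,
# one per prototype set. The prototype sets are assumed to come from a
# prior (e.g. evolutionary) reduction step.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def weighted_knn_ensemble(prototype_sets, weights, X_test):
    """prototype_sets: list of (X_proto, y_proto) arrays; weights: one
    non-negative weight per set. Returns the weighted-majority label
    for each row of X_test."""
    classes = np.unique(np.concatenate([y for _, y in prototype_sets]))
    scores = np.zeros((len(X_test), len(classes)))
    for (X_p, y_p), w in zip(prototype_sets, weights):
        member = KNeighborsClassifier(n_neighbors=1).fit(X_p, y_p)
        preds = member.predict(X_test)
        for ci, c in enumerate(classes):
            scores[:, ci] += w * (preds == c)  # add this member's weighted vote
    return classes[np.argmax(scores, axis=1)]
```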

    Multiple proportion case-basing driven CBRE and its application in the evaluation of possible failure of firms

    Case-based reasoning (CBR) is a unique tool for the evaluation of possible failure of firms (EOPFOF) owing to its ease of interpretation and implementation. Ensemble computing, a computational analogue of group decision making, provides a potential means of improving the predictive performance of CBR-based EOPFOF. This research integrates bagging and proportion case-basing with CBR to produce a proportion-bagging CBR method for EOPFOF. Multiple diverse case bases are first produced by multiple case-basing, in which a volume parameter controls the size of each case base. The classic case retrieval algorithm is then applied to each case base to generate diverse member CBR predictors. Majority voting, the most frequently used aggregation mechanism in ensemble computing, finally combines the outputs of the member CBR predictors into the final prediction of the CBR ensemble. In an empirical experiment, we statistically validated the results of the CBR ensemble by comparing them with those of multivariate discriminant analysis, logistic regression, classic CBR, the best member CBR predictor and a bagging CBR ensemble. The results on Chinese EOPFOF data from three years prior to failure indicate that the new CBR ensemble significantly improved CBR's predictive ability and outperformed all the comparative methods.
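    The sketch below illustrates the general proportion-bagging idea: bootstrap case bases whose size is set by a volume parameter, 1-NN retrieval standing in for classic case retrieval, and majority voting over members. All names, defaults and the scikit-learn retrieval are assumptions, not the paper's code.

```python
# Hypothetical sketch of proportion bagging for a CBR-style predictor.
# Assumes X, y are numpy arrays with non-negative integer class labels.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def proportion_bagging_cbr(X, y, X_test, n_members=10, volume=0.8, seed=0):
    rng = np.random.default_rng(seed)
    size = max(1, int(volume * len(X)))      # volume parameter sets case-base size
    votes = []
    for _ in range(n_members):
        idx = rng.choice(len(X), size=size, replace=True)  # bootstrap case base
        member = KNeighborsClassifier(n_neighbors=1).fit(X[idx], y[idx])
        votes.append(member.predict(X_test))  # member CBR predictor's output
    votes = np.stack(votes)                   # shape (n_members, n_test)
    # majority vote across members for each test case
    return np.array([np.bincount(col).argmax() for col in votes.T])
```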

    Similarity-based and Iterative Label Noise Filters for Monotonic Classification

    Monotonic ordinal classification has received increasing interest in recent years. Building monotone models for these problems usually requires datasets that satisfy monotonicity relationships among the samples. When the monotonicity relationships are not met, changing the labels may be a viable option, but the risk is high: wrong label changes can completely alter the information contained in the data. In this work, we tackle the construction of monotone datasets by removing the wrong or noisy examples that violate monotonicity restrictions. We propose two monotonic noise filtering algorithms to preprocess ordinal datasets and improve the monotonic relations between instances. Experiments carried out over eleven ordinal datasets show that applying the proposed filters improves prediction capabilities across different levels of noise.
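    A minimal sketch of the general filtering principle (not the paper's two algorithms): count how many monotonicity violations each instance takes part in and greedily remove the worst offender until the dataset is monotone. The dominance convention and all names are assumptions.

```python
# Hypothetical greedy monotonic noise filter. A pair (i, j) violates
# monotonicity when instance i dominates instance j on every feature
# but carries a strictly smaller label.
import numpy as np

def monotonic_noise_filter(X, y):
    keep = np.arange(len(X))
    while True:
        Xk, yk = X[keep], y[keep]
        # dominates[i, j] is True when instance i >= instance j featurewise
        dominates = np.all(Xk[:, None, :] >= Xk[None, :, :], axis=2)
        violations = dominates & (yk[:, None] < yk[None, :])
        counts = violations.sum(axis=1) + violations.sum(axis=0)
        if counts.max() == 0:
            break                                # dataset is now monotone
        keep = np.delete(keep, counts.argmax())  # drop the worst offender
    return keep                                  # indices of retained instances
```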

    On the class overlap problem in imbalanced data classification

    Class imbalance is an active research area in the machine learning community. However, existing and recent literature shows that class overlap has a higher negative impact on the performance of learning algorithms. This paper provides a detailed critical discussion and objective evaluation of class overlap in the context of imbalanced data and its impact on classification accuracy. First, we present a thorough experimental comparison of class overlap and class imbalance. Unlike previous work, our experiments cover the full scale of class overlap and an extreme range of class imbalance degrees. Second, we provide an in-depth critical technical review of existing approaches to handling imbalanced datasets. Solutions from the selected literature are critically reviewed and categorised as class distribution-based and class overlap-based methods, and emerging techniques and the latest developments in this area are also discussed in detail. The experimental results are consistent with the existing literature and show clearly that the performance of a learning algorithm deteriorates across varying degrees of class overlap, whereas class imbalance does not always have an effect. The review emphasises the need for further research towards handling class overlap in imbalanced datasets to effectively improve learning algorithms' performance.
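    The claim that overlap matters more than imbalance can be probed on synthetic data. The sketch below (illustrative distributions and sizes, not the paper's experimental design) trains a 1-NN classifier on two Gaussian classes while varying their separation and class ratio.

```python
# Hypothetical probe: balanced accuracy of 1-NN on two 2-D Gaussian
# classes under controlled overlap (separation) and imbalance.
import numpy as np
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

def probe(separation, minority_frac, n=2000, seed=0):
    rng = np.random.default_rng(seed)
    n_min = int(n * minority_frac)
    X = np.vstack([rng.normal(0.0, 1.0, (n - n_min, 2)),      # majority class
                   rng.normal(separation, 1.0, (n_min, 2))])  # minority class
    y = np.r_[np.zeros(n - n_min, int), np.ones(n_min, int)]
    Xtr, Xte, ytr, yte = train_test_split(X, y, stratify=y, random_state=seed)
    pred = KNeighborsClassifier(n_neighbors=1).fit(Xtr, ytr).predict(Xte)
    return balanced_accuracy_score(yte, pred)

print(probe(separation=4.0, minority_frac=0.5))   # separated, balanced: high
print(probe(separation=0.5, minority_frac=0.5))   # heavy overlap: degrades sharply
print(probe(separation=4.0, minority_frac=0.05))  # imbalanced but separated: still high
```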