1,195 research outputs found

    Bagging ensemble selection for regression

    Bagging ensemble selection (BES) is a relatively new ensemble learning strategy. The strategy can be seen as an ensemble of the ensemble selection from libraries of models (ES) strategy. Previous experimental results on binary classification problems have shown that, using random trees as base classifiers, BES-OOB (the most successful variant of BES) is competitive with (and in many cases superior to) other ensemble learning strategies such as the original ES algorithm, stacking with linear regression, random forests and boosting. Motivated by these promising classification results, this paper examines the predictive performance of the BES-OOB strategy on regression problems. Our results show that the BES-OOB strategy outperforms Stochastic Gradient Boosting and Bagging when regression trees are used as the base learners. Our results also suggest that the advantage of using a diverse model library becomes clear when the model library is relatively large. We also present encouraging results indicating that the non-negative least squares algorithm is a viable approach for pruning an ensemble of ensembles.
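
    To make the pruning step concrete, below is a minimal sketch of non-negative least squares (NNLS) pruning of a regressor library, in the spirit of the abstract. The model library, the validation split and the zero-weight pruning rule are illustrative assumptions, not the paper's exact setup.

```python
# A sketch of NNLS pruning for an ensemble of regressors (assumed setup,
# not the paper's exact procedure).
import numpy as np
from scipy.optimize import nnls
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=600, n_features=10, noise=5.0, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

# A small library of regression trees of varying depth (illustrative).
library = [DecisionTreeRegressor(max_depth=d, random_state=i).fit(X_tr, y_tr)
           for i, d in enumerate([2, 4, 6, 8, None])]

# Solve P @ w ~= y_val subject to w >= 0 on held-out predictions.
P = np.column_stack([m.predict(X_val) for m in library])
w, _ = nnls(P, y_val)

# Members with (near-)zero weight are pruned; the rest are combined.
kept = [(m, wi) for m, wi in zip(library, w) if wi > 1e-8]

def ensemble_predict(X_new):
    return sum(wi * m.predict(X_new) for m, wi in kept)
```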

    A Comparative Analysis of Ensemble Classifiers: Case Studies in Genomics

    The combination of multiple classifiers using ensemble methods is increasingly important for making progress in a variety of difficult prediction problems. We present a comparative analysis of several ensemble methods through two case studies in genomics, namely the prediction of genetic interactions and protein functions, to demonstrate their efficacy on real-world datasets and draw useful conclusions about their behavior. These methods include simple aggregation, meta-learning, cluster-based meta-learning, and ensemble selection using heterogeneous classifiers trained on resampled data to improve the diversity of their predictions. We present a detailed analysis of these methods across four genomics datasets and find that the best of them offer statistically significant improvements over the state of the art in their respective domains. In addition, we establish a novel connection between ensemble selection and meta-learning, demonstrating how both of these disparate methods establish a balance between ensemble diversity and performance. Comment: 10 pages, 3 figures, 8 tables, to appear in Proceedings of the 2013 International Conference on Data Mining
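
    Ensemble selection itself is a simple greedy procedure, and a minimal sketch (Caruana-style forward selection with replacement on a held-out hillclimbing set) is given below; the model library, the number of selection rounds and the accuracy-based selection metric are illustrative assumptions.

```python
# A sketch of greedy forward ensemble selection from a model library
# (Caruana-style, with replacement); library and round count are assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=800, random_state=0)
X_tr, X_hc, y_tr, y_hc = train_test_split(X, y, random_state=0)  # hillclimb set

library = [m.fit(X_tr, y_tr) for m in (
    LogisticRegression(max_iter=1000), GaussianNB(),
    DecisionTreeClassifier(max_depth=3), DecisionTreeClassifier(max_depth=8))]
probs = [m.predict_proba(X_hc)[:, 1] for m in library]

selected, running = [], np.zeros(len(y_hc))
for _ in range(20):  # selection with replacement; 20 rounds is arbitrary
    acc = [np.mean(((running + p) / (len(selected) + 1) > 0.5) == y_hc)
           for p in probs]
    best = int(np.argmax(acc))
    selected.append(best)
    running += probs[best]
# The final ensemble averages the probabilities of the selected members.
```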

    Bagging ensemble selection

    Ensemble selection has recently emerged as a popular ensemble learning method, not only because its implementation is fairly straightforward, but also due to its excellent predictive performance on practical problems. The method has been highlighted in winning solutions of many data mining competitions, such as the Netflix competition, the KDD Cup 2009 and 2010, the UCSD FICO contest 2010, and a number of data mining competitions on the Kaggle platform. In this paper we present a novel variant: bagging ensemble selection. Three variations of the proposed algorithm are compared to the original ensemble selection algorithm and other ensemble algorithms. Experiments with ten real-world problems from diverse domains demonstrate the benefit of the bagging ensemble selection algorithm.
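
    The bagging variant wraps a greedy selection procedure in bootstrap bags. The sketch below assumes hypothetical helpers build_library and greedy_select (for example, the forward-selection routine sketched above, returning an object with predict_proba); the bag count and out-of-bag evaluation follow the idea only in outline.

```python
# A sketch of bagging ensemble selection: run ensemble selection inside
# bootstrap bags and average the per-bag ensembles. build_library and
# greedy_select are hypothetical helpers (see the sketch above).
import numpy as np

def bagging_ensemble_selection(X, y, build_library, greedy_select,
                               n_bags=10, seed=0):
    rng = np.random.default_rng(seed)
    bag_ensembles = []
    for _ in range(n_bags):
        idx = rng.integers(0, len(y), len(y))        # bootstrap sample
        oob = np.setdiff1d(np.arange(len(y)), idx)   # out-of-bag rows
        library = build_library(X[idx], y[idx])      # fit the library on the bag
        bag_ensembles.append(greedy_select(library, X[oob], y[oob]))  # ES on OOB

    def predict_proba(X_new):
        # Final prediction: average over the per-bag selected ensembles.
        return np.mean([e.predict_proba(X_new) for e in bag_ensembles], axis=0)
    return predict_proba
```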

    GA-stacking: Evolutionary stacked generalization

    Stacking is a widely used technique for combining classifiers and improving prediction accuracy. Early research in Stacking showed that selecting the right classifiers, their parameters and the meta-classifier was a critical issue. Most of the research on this topic hand-picks the right combination of classifiers and their parameters. Instead of starting from these strong initial assumptions, our approach uses genetic algorithms to search for good Stacking configurations. Since this can lead to overfitting, one of the goals of this paper is to empirically evaluate the overall efficiency of the approach. A second goal is to compare our approach with the current best Stacking building techniques. The results show that our approach finds Stacking configurations that, in the worst case, perform as well as the best techniques, with the advantage of not having to manually set up the structure of the Stacking system. This work has been partially supported by the Spanish MCyT under projects TRA2007-67374-C02-02 and TIN-2005-08818-C04. Also, it has been supported under MEC grant TIN2005-08945-C06-05. We thank anonymous reviewers for their helpful comments.
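
    A minimal sketch of the idea follows, assuming a bit-mask encoding over a small classifier pool and cross-validated accuracy as fitness; the pool, the GA parameters and the logistic-regression meta-classifier are illustrative choices, not the paper's configuration.

```python
# A sketch of GA search over stacking configurations: individuals are bit
# masks over a classifier pool; fitness is cross-validated accuracy of the
# resulting stack. Pool, GA parameters and meta-learner are assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=400, random_state=0)
pool = [("lr", LogisticRegression(max_iter=1000)), ("nb", GaussianNB()),
        ("knn", KNeighborsClassifier()),
        ("dt", DecisionTreeClassifier(max_depth=5))]
rng = np.random.default_rng(0)

def fitness(mask):
    chosen = [est for bit, est in zip(mask, pool) if bit]
    if not chosen:                      # empty configurations score zero
        return 0.0
    stack = StackingClassifier(chosen,
                               final_estimator=LogisticRegression(max_iter=1000))
    return cross_val_score(stack, X, y, cv=3).mean()

population = rng.integers(0, 2, size=(8, len(pool)))
for _ in range(5):                      # a few generations, purely illustrative
    scores = np.array([fitness(ind) for ind in population])
    parents = population[np.argsort(scores)[-4:]]          # truncation selection
    children = parents[rng.integers(0, 4, 8)]              # clone parents
    cut = rng.integers(1, len(pool))
    children[::2, cut:] = parents[rng.integers(0, 4, 4), cut:]  # one-point crossover
    children ^= (rng.random(children.shape) < 0.1).astype(children.dtype)  # mutate
    population = children
best = population[np.argmax([fitness(ind) for ind in population])]
```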

    Heuristic search-based stacking of classifiers

    Currently, the combination of several classifiers is one of the most active fields within inductive learning. Examples of such techniques are boosting, bagging and stacking. Of these three techniques, stacking is perhaps the least used one. One of the main reasons for this relates to the difficulty of defining and parameterizing its components: selecting which combination of base classifiers to use, and which classifier to use as the meta-classifier. The approach we present in this chapter poses this problem as an optimization task, and then uses optimization techniques based on heuristic search to solve it. In particular, we apply genetic algorithms to automatically obtain the ideal combination of learning methods for the stacking system.
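
    Other heuristic searches fit the same optimization framing. A minimal hill-climbing sketch over the same bit-mask representation is shown below; it reuses the hypothetical fitness function from the GA sketch above and accepts non-worsening single-bit flips.

```python
# A sketch of hill climbing over stacking configurations; `fitness` and the
# bit-mask encoding are the same hypothetical ones as in the GA sketch.
import numpy as np

def hill_climb(fitness, n_bits, seed=0, iters=20):
    rng = np.random.default_rng(seed)
    mask = rng.integers(0, 2, n_bits)      # random starting configuration
    score = fitness(mask)
    for _ in range(iters):
        cand = mask.copy()
        cand[rng.integers(n_bits)] ^= 1    # flip one base-classifier bit
        cand_score = fitness(cand)
        if cand_score >= score:            # keep non-worsening moves
            mask, score = cand, cand_score
    return mask, score
```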

    Empirical investigation of decision tree ensembles for monitoring cardiac complications of diabetes

    Cardiac complications of diabetes require continuous monitoring since they may lead to increased morbidity or sudden death of patients. In order to monitor clinical complications of diabetes using wearable sensors, a small set of features has to be identified and effective algorithms for their processing need to be investigated. This article focuses on detecting and monitoring cardiac autonomic neuropathy (CAN) in diabetes patients. The authors investigate and compare the effectiveness of classifiers based on the following decision trees: ADTree, J48, NBTree, RandomTree, REPTree, and SimpleCart. They perform a thorough study comparing these decision trees, as well as several decision tree ensembles created by applying the following ensemble methods: AdaBoost, Bagging, Dagging, Decorate, Grading, MultiBoost, Stacking, and two multi-level combinations of AdaBoost and MultiBoost with Bagging, for the processing of data from diabetes patients for pervasive health monitoring of CAN. Experimental outcomes presented here show that the authors' application of decision tree ensembles for the detection and monitoring of CAN in diabetes patients achieved better performance parameters than results obtained previously in the literature.
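
    Below is a minimal sketch of this kind of comparison, evaluated by cross-validation. The abstract's classifiers are Weka implementations (J48, REPTree, MultiBoost, etc.); scikit-learn stand-ins and a synthetic dataset are used here purely for illustration, including one nested AdaBoost-over-Bagging combination in the spirit of the multi-level ensembles.

```python
# A sketch of comparing a single tree against tree ensembles, including one
# multi-level AdaBoost-over-Bagging combination; sklearn models stand in for
# the Weka classifiers named in the abstract, on a synthetic dataset.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=15, random_state=0)
candidates = {
    "tree": DecisionTreeClassifier(random_state=0),
    "bagging": BaggingClassifier(DecisionTreeClassifier(),
                                 n_estimators=50, random_state=0),
    "adaboost": AdaBoostClassifier(n_estimators=50, random_state=0),
    "ada_of_bags": AdaBoostClassifier(           # multi-level combination
        BaggingClassifier(DecisionTreeClassifier(max_depth=1), n_estimators=10),
        n_estimators=20, random_state=0),
}
for name, clf in candidates.items():
    print(name, cross_val_score(clf, X, y, cv=5).mean().round(3))
```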

    Improvement of alzheimer disease diagnosis accuracy using ensemble methods

    Nowadays there is a significant increase in medical data, and we should take advantage of it. Machine learning can be applied through data mining processes such as data classification, which may rely on a single classification algorithm or on several algorithms combined into ensemble models. The objective of this work is to improve the classification accuracy of previous results for diagnosing Alzheimer's disease. The Decision Tree algorithm was combined with three types of ensemble methods: Boosting, Bagging and Stacking. The clinical dataset from the Open Access Series of Imaging Studies (OASIS) was used in the experiments. The experimental results of the proposed approach were better than those of previous work: Random Forest (Bagging) achieved the highest accuracy of all algorithms at 90.69%, while the lowest was Stacking at 79.07%. All results generated in this paper are higher in accuracy than those reported before.
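
    For concreteness, a minimal sketch of the three combination schemes named here, all built on decision trees: Bagging (via a Random Forest), Boosting (AdaBoost) and Stacking. The dataset and parameters are illustrative; the OASIS data and the paper's exact setup are not reproduced.

```python
# A sketch of the three combination schemes over decision trees; the data
# and parameters are illustrative, not the OASIS experiments.
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, RandomForestClassifier,
                              StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=400, random_state=1)
schemes = {
    "Bagging (Random Forest)": RandomForestClassifier(n_estimators=100,
                                                      random_state=1),
    "Boosting (AdaBoost)": AdaBoostClassifier(n_estimators=100, random_state=1),
    "Stacking": StackingClassifier(
        [(f"dt{d}", DecisionTreeClassifier(max_depth=d)) for d in (3, 5, 8)],
        final_estimator=LogisticRegression(max_iter=1000)),
}
for name, clf in schemes.items():
    print(name, cross_val_score(clf, X, y, cv=5).mean().round(4))
```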