191 research outputs found

Bagging ensemble selection for regression

Author: D.H. Wolpert
E. Bauer
J. Demšar
J.H. Friedman
L. Breiman
L. Rokach
Q. Sun
R. Bryll
Z.-H. Zhou
Z.H. Zhou
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2012

Bagging ensemble selection (BES) is a relatively new ensemble learning strategy. The strategy can be seen as an ensemble of the ensemble selection from libraries of models (ES) strategy. Previous experimental results on binary classification problems have shown that using random trees as base classifiers, BES-OOB (the most successful variant of BES) is competitive with (and in many cases, superior to) other ensemble learning strategies, for instance, the original ES algorithm, stacking with linear regression, random forests or boosting. Motivated by the promising results in classification, this paper examines the predictive performance of the BES-OOB strategy for regression problems. Our results show that the BES-OOB strategy outperforms Stochastic Gradient Boosting and Bagging when using regression trees as the base learners. Our results also suggest that the advantage of using a diverse model library becomes clear when the model library size is relatively large. We also present encouraging results indicating that the non-negative least squares algorithm is a viable approach for pruning an ensemble of ensembles.
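The ES step that BES builds on can be illustrated with a minimal sketch. It is not taken from the paper: it assumes ES is a greedy forward selection with replacement that averages model predictions and minimises RMSE on a hill-climbing set. The names `rmse`, `ensemble_selection`, `library_preds` and `max_rounds` are hypothetical. Under the same assumption, BES-OOB would repeat this selection once per bootstrap sample, using the out-of-bag instances as the hill-climbing set, and then average the selected ensembles.

```python
from math import sqrt


def rmse(pred, target):
    """Root mean squared error between two equal-length sequences."""
    return sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target))


def ensemble_selection(library_preds, target, max_rounds=10):
    """Greedy forward selection with replacement (illustrative sketch).

    library_preds: one prediction list per library model, all computed on
    the same hill-climbing set as `target`.
    Returns the indices of the selected models (repeats allowed) and the
    RMSE of their averaged prediction.
    """
    chosen = []
    running_sum = [0.0] * len(target)
    best_score = float("inf")
    for _ in range(max_rounds):
        size = len(chosen) + 1
        round_idx, round_score = None, float("inf")
        # Try adding each library model and keep the best candidate.
        for idx, preds in enumerate(library_preds):
            averaged = [(s + p) / size for s, p in zip(running_sum, preds)]
            score = rmse(averaged, target)
            if score < round_score:
                round_idx, round_score = idx, score
        # Stop as soon as no candidate improves the ensemble.
        if round_score >= best_score:
            break
        chosen.append(round_idx)
        running_sum = [s + p for s, p in zip(running_sum, library_preds[round_idx])]
        best_score = round_score
    return chosen, best_score


# Toy library: two models with opposite bias and one constant model.
target = [1.0, 2.0, 3.0, 4.0]
library = [[2.0, 3.0, 4.0, 5.0], [0.0, 1.0, 2.0, 3.0], [0.0, 0.0, 0.0, 0.0]]
chosen, score = ensemble_selection(library, target)
```

In this toy case the two oppositely biased models are selected together because their errors cancel, which is the kind of complementarity a diverse model library is meant to offer.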

Crossref

Research Commons@Waikato

Classification of oncologic data with genetic programming

Author: Francesco Archetti
Ilaria Giordani
Leonardo Vanneschi
Mauro Castelli
Publication date: 12/08/2009

Discovering the models explaining the hidden relationship between genetic material and tumor pathologies is one of the most important open challenges in biology and medicine. Given the large amount of data made available by the DNA Microarray technique, Machine Learning is becoming a popular tool for this kind of investigation. In the last few years, we have been particularly involved in the study of Genetic Programming for mining large sets of biomedical data. In this paper, we present a comparison between four variants of Genetic Programming for the classification of two different oncologic datasets: the first one contains data from healthy colon tissues and colon tissues affected by cancer; the second one contains data from patients affected by two kinds of leukemia (acute myeloid leukemia and acute lymphoblastic leukemia). We report experimental results obtained using two different fitness criteria: the receiver operating characteristic and the percentage of correctly classified instances. These results, and their comparison with the ones obtained by three non-evolutionary Machine Learning methods (Support Vector Machines, MultiBoosting, and Random Forests) on the same data, suggest that Genetic Programming is a promising technique for this kind of classification.
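The two fitness criteria named in the abstract can be made concrete with a minimal sketch. It is not the authors' implementation: it assumes the receiver operating characteristic criterion means the area under the ROC curve, that each evolved program outputs a real-valued score per sample, and that labels are 0 or 1. The function names and the 0.5 threshold are hypothetical.

```python
def accuracy(scores, labels, threshold=0.5):
    """Fraction of instances classified correctly at the given threshold."""
    correct = sum((s >= threshold) == bool(y) for s, y in zip(scores, labels))
    return correct / len(labels)


def roc_auc(scores, labels):
    """Area under the ROC curve, computed as the probability that a
    positive instance is scored above a negative one (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy outputs of one candidate program on four labelled samples.
scores = [0.9, 0.8, 0.4, 0.3]
labels = [1, 0, 1, 0]
print(accuracy(scores, labels))  # 0.5
print(roc_auc(scores, labels))   # 0.75
```

The two criteria can rank the same candidate differently: accuracy depends on the chosen threshold, whereas the area under the ROC curve depends only on how the scores order the samples.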

Open Access Repository

Classifying genes to the correct Gene Ontology Slim term in Saccharomyces cerevisiae using neighbouring genes with classification learning

Author: Amthauer Heather A
Tsatsoulis Costas
Publication venue: BioMed Central
Publication date: 01/01/2010

Article discussing research on classifying genes to the correct gene ontology slim term in Saccharomyces cerevisiae using neighbouring genes with classification learning

Crossref

Springer - Publisher Connector

PubMed Central

UNT Digital Library

Multiboosting for regression

Author: Nuno Miguel Rainho Valente
Publication date: 11/02/2020

Repositório Aberto da Universidade do Porto

A Robust and Efficient Real Time Network Intrusion Detection System Using Artificial Neural Network In Data Mining

Author: Renuka Devi Thanasekaran
Publication date: 11/04/2020

CiteSeerX

Decision trees and multi-level ensemble classifiers for neurological diagnostics

Author: Abawajy J
Chowdhury M
Jelinek H
Kelarev A
Stranieri A
Publication venue: American Institute of Mathematical Sciences (AIMS)
Publication date: 01/06/2014

Cardiac autonomic neuropathy (CAN) is a well-known complication of diabetes that leads to impaired regulation of blood pressure and heart rate and increases the risk of cardiac-associated mortality in diabetes patients. The neurological diagnostics of CAN progression is an important problem that is being actively investigated. This paper uses data collected as part of a large and unique Diabetes Screening Complications Research Initiative (DiScRi) in Australia with data from numerous tests related to diabetes to classify CAN progression. The present paper is devoted to recent experimental investigations of the effectiveness of applications of decision trees, ensemble classifiers and multi-level ensemble classifiers for neurological diagnostics of CAN. We present the results of experiments comparing the effectiveness of ADTree, J48, NBTree, RandomTree, REPTree and SimpleCart decision tree classifiers. Our results show that SimpleCart was the most effective for the DiScRi data set in classifying CAN. We also investigated and compared the effectiveness of AdaBoost, Bagging, MultiBoost, Stacking, Decorate, Dagging, and Grading, based on Ripple Down Rules as examples of ensemble classifiers. Further, we investigated the effectiveness of these ensemble methods as a function of the base classifiers, and determined that Random Forest performed best as a base classifier, and AdaBoost, Bagging and Decorate achieved the best outcomes as meta-classifiers in this setting. Finally, we investigated the meta-classifiers that performed best in their ability to enhance the performance further within the framework of a multi-level classification paradigm. Experimental results show that the multi-level paradigm performed best when Bagging and Decorate were combined in the construction of a multi-level ensemble classifier.

Deakin Research Online

Directory of Open Access Journals

Empirical investigation of decision tree ensembles for monitoring cardiac complications of diabetes

Author: Abawajy Jemal
Jelinek Herbert F
Kelarev Andrei V
Stranieri Andrew
Publication venue: IGI Global
Publication date: 01/01/2013

Cardiac complications of diabetes require continuous monitoring since they may lead to increased morbidity or sudden death of patients. In order to monitor clinical complications of diabetes using wearable sensors, a small set of features has to be identified and effective algorithms for their processing need to be investigated. This article focuses on detecting and monitoring cardiac autonomic neuropathy (CAN) in diabetes patients. The authors investigate and compare the effectiveness of classifiers based on the following decision trees: ADTree, J48, NBTree, RandomTree, REPTree, and SimpleCart. The authors perform a thorough study comparing these decision trees as well as several decision tree ensembles created by applying the following ensemble methods: AdaBoost, Bagging, Dagging, Decorate, Grading, MultiBoost, Stacking, and two multi-level combinations of AdaBoost and MultiBoost with Bagging for the processing of data from diabetes patients for pervasive health monitoring of CAN. This paper concentrates on the particular task of applying decision tree ensembles for the detection and monitoring of cardiac autonomic neuropathy using these features. Experimental outcomes presented here show that the authors' application of the decision tree ensembles for the detection and monitoring of CAN in diabetes patients achieved better performance parameters compared with the results obtained previously in the literature.

Deakin Research Online

Crossref

Federation ResearchOnline

Analyse locale de la forme 3D pour la reconnaissance d'expressions faciales

Author: Ben Amor Boulbaba
Daoudi Mohamed
Maalej Ahmed
Publication venue: HAL CCSD
Publication date: 01/06/2011

In this paper we propose a novel approach for identity-independent 3D facial expression recognition. Our approach is based on shape analysis of local patches extracted from 3D facial shape models. A Riemannian framework is applied to compute geodesic distances between corresponding patches belonging to different faces of the BU-3DFE database and conveying different expressions. Quantitative measures of similarity are obtained and then used as inputs to several multiclass classification methods. Using Multiboosting and Support Vector Machine (SVM) classifiers, we achieved average recognition rates for the six basic expressions of 98.81% and 97.75%, respectively.

HAL - Lille 3

INRIA a CCSD electronic archive server