185 research outputs found

    Bagging ensemble selection for regression

    Get PDF
    Bagging ensemble selection (BES) is a relatively new ensemble learning strategy. The strategy can be seen as an ensemble of the ensemble selection from libraries of models (ES) strategy. Previous experimental results on binary classiļ¬cation problems have shown that using random trees as base classiļ¬ers, BES-OOB (the most successful variant of BES) is competitive with (and in many cases, superior to) other ensemble learning strategies, for instance, the original ES algorithm, stacking with linear regression, random forests or boosting. Motivated by the promising results in classiļ¬cation, this paper examines the predictive performance of the BES-OOB strategy for regression problems. Our results show that the BES-OOB strategy outperforms Stochastic Gradient Boosting and Bagging when using regression trees as the base learners. Our results also suggest that the advantage of using a diverse model library becomes clear when the model library size is relatively large. We also present encouraging results indicating that the non negative least squares algorithm is a viable approach for pruning an ensemble of ensembles

    classification of oncologic data with genetic programming

    Get PDF
    Discovering the models explaining the hidden relationship between genetic material and tumor pathologies is one of the most important open challenges in biology and medicine. Given the large amount of data made available by the DNA Microarray technique, Machine Learning is becoming a popular tool for this kind of investigations. In the last few years, we have been particularly involved in the study of Genetic Programming for mining large sets of biomedical data. In this paper, we present a comparison between four variants of Genetic Programming for the classification of two different oncologic datasets: the first one contains data from healthy colon tissues and colon tissues affected by cancer; the second one contains data from patients affected by two kinds of leukemia (acute myeloid leukemia and acute lymphoblastic leukemia). We report experimental results obtained using two different fitness criteria: the receiver operating characteristic and the percentage of correctly classified instances. These results, and their comparison with the ones obtained by three nonevolutionary Machine Learning methods (Support Vector Machines, MultiBoosting, and Random Forests) on the same data, seem to hint that Genetic Programming is a promising technique for this kind of classification

    Classifying genes to the correct Gene Ontology Slim term in Saccharomyces cerevisiae using neighbouring genes with classification learning

    Get PDF
    Article discussing research on classifying genes to the correct gene ontology slim term in Saccharomyces cerevisiae using neighbouring genes with classification learning

    Empirical investigation of decision tree ensembles for monitoring cardiac complications of diabetes

    Full text link
    Cardiac complications of diabetes require continuous monitoring since they may lead to increased morbidity or sudden death of patients. In order to monitor clinical complications of diabetes using wearable sensors, a small set of features have to be identified and effective algorithms for their processing need to be investigated. This article focuses on detecting and monitoring cardiac autonomic neuropathy (CAN) in diabetes patients. The authors investigate and compare the effectiveness of classifiers based on the following decision trees: ADTree, J48, NBTree, RandomTree, REPTree, and SimpleCart. The authors perform a thorough study comparing these decision trees as well as several decision tree ensembles created by applying the following ensemble methods: AdaBoost, Bagging, Dagging, Decorate, Grading, MultiBoost, Stacking, and two multi-level combinations of AdaBoost and MultiBoost with Bagging for the processing of data from diabetes patients for pervasive health monitoring of CAN. This paper concentrates on the particular task of applying decision tree ensembles for the detection and monitoring of cardiac autonomic neuropathy using these features. Experimental outcomes presented here show that the authors' application of the decision tree ensembles for the detection and monitoring of CAN in diabetes patients achieved better performance parameters compared with the results obtained previously in the literature

    Decision trees and multi-level ensemble classifiers for neurological diagnostics

    Full text link
    Cardiac autonomic neuropathy (CAN) is a well known complication of diabetes leading to impaired regulation of blood pressure and heart rate, and increases the risk of cardiac associated mortality of diabetes patients. The neurological diagnostics of CAN progression is an important problem that is being actively investigated. This paper uses data collected as part of a large and unique Diabetes Screening Complications Research Initiative (DiScRi) in Australia with data from numerous tests related to diabetes to classify CAN progression. The present paper is devoted to recent experimental investigations of the effectiveness of applications of decision trees, ensemble classifiers and multi-level ensemble classifiers for neurological diagnostics of CAN. We present the results of experiments comparing the effectiveness of ADTree, J48, NBTree, RandomTree, REPTree and SimpleCart decision tree classifiers. Our results show that SimpleCart was the most effective for the DiScRi data set in classifying CAN. We also investigated and compared the effectiveness of AdaBoost, Bagging, MultiBoost, Stacking, Decorate, Dagging, and Grading, based on Ripple Down Rules as examples of ensemble classifiers. Further, we investigated the effectiveness of these ensemble methods as a function of the base classifiers, and determined that Random Forest performed best as a base classifier, and AdaBoost, Bagging and Decorate achieved the best outcomes as meta-classifiers in this setting. Finally, we investigated the meta-classifiers that performed best in their ability to enhance the performance further within the framework of a multi-level classification paradigm. Experimental results show that the multi-level paradigm performed best when Bagging and Decorate were combined in the construction of a multi-level ensemble classifier

    A neural network approach to audio-assisted movie dialogue detection

    Get PDF
    A novel framework for audio-assisted dialogue detection based on indicator functions and neural networks is investigated. An indicator function defines that an actor is present at a particular time instant. The cross-correlation function of a pair of indicator functions and the magnitude of the corresponding cross-power spectral density are fed as input to neural networks for dialogue detection. Several types of artificial neural networks, including multilayer perceptrons, voted perceptrons, radial basis function networks, support vector machines, and particle swarm optimization-based multilayer perceptrons are tested. Experiments are carried out to validate the feasibility of the aforementioned approach by using ground-truth indicator functions determined by human observers on 6 different movies. A total of 41 dialogue instances and another 20 non-dialogue instances is employed. The average detection accuracy achieved is high, ranging between 84.78%Ā±5.499% and 91.43%Ā±4.239%
    • ā€¦
    corecore