Visual Integration of Data and Model Space in Ensemble Learning

Authors: Diehl Alexandra; Fuchs Johannes; Jäckle Dominik; Keim Daniel; Schneider Bruno; Stoffel Florian
Publication date: 01/01/2017

Ensembles of classifier models typically deliver superior performance and can outperform single classifier models for a given dataset and classification task. However, the gain in performance comes with a lack of comprehensibility, which makes it hard to understand how each model affects the classification outputs and where the errors come from. We propose a tight visual integration of the data and the model space for exploring and combining classifier models. We introduce a workflow that builds upon the visual integration and enables the effective exploration of classification outputs and models. We then present a use case in which we start with an ensemble automatically selected by a standard ensemble selection algorithm, and show how we can manipulate models and alternative combinations.

Comment: 8 pages, 7 pictures

arXiv.org e-Print Archive

Crossref

Non-uniform Feature Sampling for Decision Tree Ensembles

Authors: Kyrillidis Anastasios; Zouzias Anastasios
Publication date: 24/03/2014

We study the effectiveness of non-uniform randomized feature selection in decision tree classification. We experimentally evaluate two feature selection methodologies, based on information extracted from the provided dataset: (i) leverage scores-based and (ii) norm-based feature selection. Experimental evaluation of the proposed feature selection techniques indicates that such approaches might be more effective than naive uniform feature selection and, moreover, have performance comparable to the random forest algorithm [3].

Comment: 7 pages, 7 figures, 1 table
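The two sampling schemes named in this abstract can be illustrated with a minimal sketch. The abstract does not give the exact definitions, so the code below assumes common ones: norm-based probabilities proportional to squared column norms, and leverage scores computed from the top-k right singular vectors of the data matrix. The function names and the rank parameter `k` are hypothetical and are not taken from the paper.

```python
import numpy as np

def norm_based_probs(X):
    """Feature probabilities proportional to squared column norms (assumed definition)."""
    sq_norms = np.sum(X ** 2, axis=0)
    return sq_norms / sq_norms.sum()

def leverage_score_probs(X, k):
    """Feature probabilities proportional to rank-k leverage scores (assumed definition)."""
    # Rows of Vt are right singular vectors; their columns index the features of X.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    scores = np.sum(Vt[:k] ** 2, axis=0)
    return scores / scores.sum()

def sample_features(probs, n_features, rng):
    """Draw distinct feature indices according to the given probabilities."""
    return rng.choice(len(probs), size=n_features, replace=False, p=probs)

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 8))          # toy data: 50 samples, 8 features
p_norm = norm_based_probs(X)
p_lev = leverage_score_probs(X, k=3)
chosen = sample_features(p_lev, 4, rng)
```

A tree in the ensemble would then be trained on the columns in `chosen` instead of on a uniformly drawn feature subset.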

arXiv.org e-Print Archive

Crossref

Toward a General-Purpose Heterogeneous Ensemble for Pattern Classification

Authors: Brahnam Sheryl; Ghidoni Stefano; Lumini Alessandra; Nanni Loris
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2015

We perform an extensive study of the performance of different classification approaches on twenty-five datasets (fourteen image datasets and eleven UCI data mining datasets). The aim is to find General-Purpose (GP) heterogeneous ensembles (requiring little to no parameter tuning) that perform competitively across multiple datasets. The state-of-the-art classifiers examined in this study include the support vector machine, Gaussian process classifiers, random subspace of AdaBoost, random subspace of rotation boosting, and deep learning classifiers. We demonstrate that a heterogeneous ensemble based on the simple fusion by sum rule of different classifiers performs consistently well across all twenty-five datasets. The most important result of our investigation is demonstrating that some very recent approaches, including the heterogeneous ensemble we propose in this paper, are capable of outperforming an SVM classifier (implemented with LibSVM), even when both kernel selection and SVM parameters are carefully tuned for each dataset.
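The sum rule mentioned in this abstract can be sketched as follows. This is an illustration only: it assumes each classifier already outputs comparable per-class scores (for example, normalized probabilities), and the function name and toy scores are hypothetical, not taken from the paper.

```python
import numpy as np

def sum_rule_fusion(score_matrices):
    """Fuse per-classifier score matrices (n_samples x n_classes) by summing them."""
    fused = np.stack(score_matrices, axis=0).sum(axis=0)
    # The predicted class for each sample is the one with the largest summed score.
    return fused, fused.argmax(axis=1)

# Toy scores from three hypothetical classifiers on two samples and two classes.
scores_a = np.array([[0.6, 0.4], [0.2, 0.8]])
scores_b = np.array([[0.3, 0.7], [0.4, 0.6]])
scores_c = np.array([[0.7, 0.3], [0.1, 0.9]])

fused, labels = sum_rule_fusion([scores_a, scores_b, scores_c])
```

On the first sample, classifier `b` alone would pick class 1, but the summed scores (1.6 versus 1.4) favor class 0, which shows how the rule lets the ensemble outvote a single dissenting model.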

Crossref

Directory of Open Access Journals

PubMed Central

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Archivio istituzionale della ricerca - Università di Padova

Missouri State University: BearWorks