A Method to Improve the Analysis of Cluster Ensembles
Clustering is fundamental to understanding the structure of data. In the past decade the cluster ensemble problem has been introduced, which combines a set of partitions (an ensemble) of the data to obtain a single consensus solution that outperforms all the ensemble members. However, there is disagreement about which ensemble characteristics lead to good performance: some authors have suggested that highly different partitions within the ensemble are beneficial for the final performance, whereas others have stated that medium diversity among them is better. While there are several measures to quantify the diversity, a better method to analyze the best ensemble characteristics is necessary. This paper introduces a new ensemble generation strategy and a method to make slight changes in its structure. Experimental results on six datasets suggest that this is an important step towards a more systematic approach to analyzing the impact of the ensemble characteristics on the overall consensus performance.
Affiliation: Pividori, Milton Damián. Universidad Tecnologica Nacional. Facultad Regional Santa Fe. Centro de Investigacion y Desarrollo de Ingenieria en Sistemas de Informacion; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional. Universidad Nacional del Litoral. Facultad de Ingeniería y Ciencias Hídricas. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional; Argentina
Affiliation: Stegmayer, Georgina. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional. Universidad Nacional del Litoral. Facultad de Ingeniería y Ciencias Hídricas. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional; Argentina. Universidad Tecnologica Nacional. Facultad Regional Santa Fe. Centro de Investigacion y Desarrollo de Ingenieria en Sistemas de Informacion; Argentina
Affiliation: Milone, Diego Humberto. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional. Universidad Nacional del Litoral. Facultad de Ingeniería y Ciencias Hídricas. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional; Argentin
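The abstract above mentions measures that quantify the diversity among ensemble members without naming one. As a minimal illustrative sketch, and not the method proposed in the paper, diversity could be taken as the mean pairwise disagreement between partitions, here measured as one minus the Rand index. The function names `rand_index` and `ensemble_diversity` and the toy ensemble are assumptions made for this example.

```python
from itertools import combinations


def rand_index(a, b):
    """Fraction of object pairs on which two partitions agree.

    A pair counts as an agreement when both partitions put the two
    objects in the same cluster, or both put them in different clusters.
    """
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def ensemble_diversity(ensemble):
    """Mean of (1 - Rand index) over all pairs of ensemble members."""
    scores = [1.0 - rand_index(p, q) for p, q in combinations(ensemble, 2)]
    return sum(scores) / len(scores)


# Hypothetical ensemble: three partitions of four objects.
ensemble = [
    [0, 0, 1, 1],
    [1, 1, 0, 0],  # same grouping as the first, labels swapped
    [0, 1, 0, 1],
]
diversity = ensemble_diversity(ensemble)
```

Because the Rand index compares co-membership of pairs rather than raw labels, the first two partitions count as identical even though their cluster labels differ.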
A Comparative Analysis of Ensemble Classifiers: Case Studies in Genomics
The combination of multiple classifiers using ensemble methods is
increasingly important for making progress in a variety of difficult prediction
problems. We present a comparative analysis of several ensemble methods through
two case studies in genomics, namely the prediction of genetic interactions and
protein functions, to demonstrate their efficacy on real-world datasets and
draw useful conclusions about their behavior. These methods include simple
aggregation, meta-learning, cluster-based meta-learning, and ensemble selection
using heterogeneous classifiers trained on resampled data to improve the
diversity of their predictions. We present a detailed analysis of these methods
across 4 genomics datasets and find the best of these methods offer
statistically significant improvements over the state of the art in their
respective domains. In addition, we establish a novel connection between
ensemble selection and meta-learning, demonstrating how both of these disparate
methods establish a balance between ensemble diversity and performance.
Comment: 10 pages, 3 figures, 8 tables, to appear in Proceedings of the 2013
International Conference on Data Minin
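Of the ensemble methods listed in the abstract above, simple aggregation is the easiest to illustrate. The sketch below assumes it means averaging the positive-class probabilities produced by each base classifier; the abstract does not specify the aggregation rule, so the mean, the 0.5 threshold, the function name `mean_aggregate`, and the sample scores are all assumptions for this example.

```python
def mean_aggregate(prob_lists):
    """Average per-example positive-class probabilities across classifiers.

    prob_lists holds one list of probabilities per base classifier,
    all of the same length (one entry per example).
    """
    n_classifiers = len(prob_lists)
    return [sum(column) / n_classifiers for column in zip(*prob_lists)]


# Hypothetical outputs of three base classifiers on three examples.
base_predictions = [
    [0.9, 0.2, 0.60],
    [0.7, 0.4, 0.30],
    [0.8, 0.3, 0.45],
]
ensemble_scores = mean_aggregate(base_predictions)
# Assumed decision threshold of 0.5 for turning scores into labels.
ensemble_labels = [int(score >= 0.5) for score in ensemble_scores]
```

Meta-learning and ensemble selection, also named in the abstract, would replace the fixed average with a learned combiner or a chosen subset of classifiers; neither is shown here.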