20,812 research outputs found

    A Method to Improve the Analysis of Cluster Ensembles

    Get PDF
    Clustering is fundamental to understand the structure of data. In the past decade the cluster ensembleproblem has been introduced, which combines a set of partitions (an ensemble) of the data to obtain a singleconsensus solution that outperforms all the ensemble members. However, there is disagreement about which arethe best ensemble characteristics to obtain a good performance: some authors have suggested that highly differentpartitions within the ensemble are beneï¬ cial for the ï¬ nal performance, whereas others have stated that mediumdiversity among them is better. While there are several measures to quantify the diversity, a better method toanalyze the best ensemble characteristics is necessary. This paper introduces a new ensemble generation strategyand a method to make slight changes in its structure. Experimental results on six datasets suggest that this isan important step towards a more systematic approach to analyze the impact of the ensemble characteristics onthe overall consensus performance.Fil: Pividori, Milton Damián. Universidad Tecnologica Nacional. Facultad Regional Santa Fe. Centro de Investigacion y Desarrollo de Ingenieria en Sistemas de Informacion; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional. Universidad Nacional del Litoral. Facultad de Ingeniería y Ciencias Hídricas. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional; ArgentinaFil: Stegmayer, Georgina. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional. Universidad Nacional del Litoral. Facultad de Ingeniería y Ciencias Hídricas. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional; Argentina. Universidad Tecnologica Nacional. Facultad Regional Santa Fe. Centro de Investigacion y Desarrollo de Ingenieria en Sistemas de Informacion; ArgentinaFil: Milone, Diego Humberto. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional. Universidad Nacional del Litoral. Facultad de Ingeniería y Ciencias Hídricas. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional; Argentin

    A Comparative Analysis of Ensemble Classifiers: Case Studies in Genomics

    Full text link
    The combination of multiple classifiers using ensemble methods is increasingly important for making progress in a variety of difficult prediction problems. We present a comparative analysis of several ensemble methods through two case studies in genomics, namely the prediction of genetic interactions and protein functions, to demonstrate their efficacy on real-world datasets and draw useful conclusions about their behavior. These methods include simple aggregation, meta-learning, cluster-based meta-learning, and ensemble selection using heterogeneous classifiers trained on resampled data to improve the diversity of their predictions. We present a detailed analysis of these methods across 4 genomics datasets and find the best of these methods offer statistically significant improvements over the state of the art in their respective domains. In addition, we establish a novel connection between ensemble selection and meta-learning, demonstrating how both of these disparate methods establish a balance between ensemble diversity and performance.Comment: 10 pages, 3 figures, 8 tables, to appear in Proceedings of the 2013 International Conference on Data Minin
    • …
    corecore