
    Dataset analysis for classifier ensemble enhancement

    We developed three methods for dataset analysis and ensemble enhancement. They share the underlying idea that accurate preprocessing and adaptation of the data can improve system performance without changing the classification model. Correlation Score is a generic framework for assessing encoding techniques by measuring the correlation between the encoded feature vectors and the corresponding class labels; experiments show that it is effective at identifying the best encoding configurations among those tested, across a wide range of classification domains. Multi-Resolution Complexity Analysis is a method for assessing the local complexity inside a given domain. It can split a domain into regions of different classification complexity, giving insight into the inner structure of the populations inside the domain. Finally, Forests of Local Trees are a novel training algorithm for ensemble classifiers. They are based on the concept of local trees: classifiers trained with a bias toward a certain region of the domain. This bias increases the diversity inside the ensemble, leading to improved performance. These three topics are meant as a foundation for a more complex framework that will eventually integrate them.
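
The idea behind scoring an encoding by its correlation with the class labels can be illustrated with a small sketch. The abstract does not give the actual Correlation Score formula, so this is an assumption, not the authors' method: the hypothetical `encoding_score` below uses the mean absolute Pearson correlation between each encoded column and the labels, and the toy data and the two candidate encodings are invented for illustration.

```python
from statistics import mean, pstdev


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    sx, sy = pstdev(xs), pstdev(ys)
    if sx == 0 or sy == 0:
        return 0.0
    mx, my = mean(xs), mean(ys)
    cov = mean((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / (sx * sy)


def encoding_score(encoded_rows, labels):
    """Hypothetical stand-in score: mean |correlation| of each encoded column with the labels."""
    columns = list(zip(*encoded_rows))
    return mean(abs(pearson(col, labels)) for col in columns)


# Toy categorical feature and binary labels (illustrative only).
raw = ["low", "high", "mid", "high", "low", "mid"]
labels = [0, 1, 0, 1, 0, 0]

# Two candidate encodings of the same feature (both are assumptions for this sketch).
ordinal = {"low": 0, "mid": 1, "high": 2}
arbitrary = {"low": 2, "mid": 0, "high": 1}
candidates = {
    "ordinal": [[ordinal[v]] for v in raw],
    "arbitrary": [[arbitrary[v]] for v in raw],
}

# Score every candidate and keep the one most correlated with the labels.
scores = {name: encoding_score(rows, labels) for name, rows in candidates.items()}
best = max(scores, key=scores.get)
print(scores, best)
```

On this toy data the order-preserving encoding scores higher than the arbitrary one, so it would be selected without training any classifier, which is the kind of model-independent comparison the abstract describes.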