416 research outputs found

    Feature Level Ensemble Method for Classifying Multi-Media Data

    Get PDF
    Multimedia data consists of several different types of data, such as numbers, text, images, audio etc. and they usually need to be fused or integrated before analysis. This study investigates a feature-level aggregation approach to combine multimedia datasets for building heterogeneous ensembles for classification. It firstly aggregates multimedia datasets at feature level to form a normalised big dataset, then uses some parts of it to generate classifiers with different learning algorithms. Finally, it applies three rules to select appropriate classifiers based on their accuracy and/or diversity to build heterogeneous ensembles. The method is tested on a multimedia dataset and the results show that the heterogeneous ensembles outperform the individual classifiers as well as homogeneous ensembles. However, it should be noted that, it is possible in some cases that the combined dataset does not produce better results than using single media data

    Competitive Learning Neural Network Ensemble Weighted by Predicted Performance

    Get PDF
    Ensemble approaches have been shown to enhance classification by combining the outputs from a set of voting classifiers. Diversity in error patterns among base classifiers promotes ensemble performance. Multi-task learning is an important characteristic for Neural Network classifiers. Introducing a secondary output unit that receives different training signals for base networks in an ensemble can effectively promote diversity and improve ensemble performance. Here a Competitive Learning Neural Network Ensemble is proposed where a secondary output unit predicts the classification performance of the primary output unit in each base network. The networks compete with each other on the basis of classification performance and partition the stimulus space. The secondary units adaptively receive different training signals depending on the competition. As the result, each base network develops ¡°preference¡± over different regions of the stimulus space as indicated by their secondary unit outputs. To form an ensemble decision, all base networks¡¯ primary unit outputs are combined and weighted according to the secondary unit outputs. The effectiveness of the proposed approach is demonstrated with the experiments on one real-world and four artificial classification problems

    Diversity creation methods: a survey and categorisation

    Get PDF

    CCE: An ensemble architecture based on coupled ANN for solving multiclass problems

    Get PDF
    The resolution of multiclass classification problems has been usually addressed by using a "divide and conquer" strategy that splits the original problem into several binary subproblems. This approach is mandatory when the learning algorithm has been designed to solve binary problems and a multiclass version cannot be devised. Artificial Neural Networks, ANN, are binary learning models whose extension to multiclass problems is rather straightforward by using the standard 1-out-of N codification of the classes. However, the use of a single ANN can be inefficient in terms of accuracy and computational complexity when the data set is large, or the number of classes is high. In this work, we exhaustively describe CCE, a new classifier ensemble based on ANN. Each member of this new ensemble is a couple of multiclass ANN's. Each ANN is trained using different subsets of the dataset ensuring these subsets to be disjoint. This new approach allows to combine the benefits of the divide and conquer methodology, with the use of multiclass ANNs and with the combination of individual classification modules that give a complete answer to the addressed problem. The combination of these elements results in a classifier ensemble in which the diversity of the base classifiers provides high accuracy values. Moreover, the use of couples of ANN proves to be tolerant to labeling noise and computationally efficient. The performance of CCE has been tested on various datasets and the results show the higher performance of this approach with respect to other used classification systems.This research was supported by the Spanish MINECO under projects TRA2016-78886-C3-1-R and RTI2018-096036-B-C22

    Impact of the learners diversity and combination method on the generation of heterogeneous classifier ensembles

    Get PDF
    Ensembles of classifiers is a proven approach in machine learning with a wide variety of research works. The main issue in ensembles of classifiers is not only the selection of the base classifiers, but also the combination of their outputs. According to the literature, it has been established that much is to be gained from combining classifiers if those classifiers are accurate and diverse. However, it is still an open issue how to define the relation between accuracy and diversity in order to define the best possible ensemble of classifiers. In this paper, we propose a novel approach to evaluate the impact of the diversity of the learners on the generation of heterogeneous ensembles. We present an exhaustive study of this approach using 27 different multiclass datasets and analysing their results in detail. In addition, to determine the performance of the different results, the presence of labelling noise is also considered.This work has been supported under projects PEAVAUTO-CM-UC3M–2020/00036/001, PID2019-104793RB-C31, and RTI2018-096036-B-C22, and by the Region of Madrid’s Excellence Program, Spain (EPUC3M17)

    Heterogeneous Ensemble for Imaginary Scene Classification

    Get PDF
    In data mining, identifying the best individual technique to achieve very reliable and accurate classification has always been considered as an important but non-trivial task. This paper presents a novel approach - heterogeneous ensemble technique, to avoid the task and also to increase the accuracy of classification. It combines the models that are generated by using methodologically different learning algorithms and selected with different rules of utilizing both accuracy of individual modules and also diversity among the models. The key strategy is to select the most accurate model among all the generated models as the core model, and then select a number of models that are more diverse from the most accurate model to build the heterogeneous ensemble. The framework of the proposed approach has been implemented and tested on a real-world data to classify imaginary scenes. The results show our approach outperforms other the state of the art methods, including Bayesian network, SVM an d AdaBoost

    Nonlinear Boosting Projections for Ensemble Construction

    Get PDF
    In this paper we propose a novel approach for ensemble construction based on the use of nonlinear projections to achieve both accuracy and diversity of individual classifiers. The proposed approach combines the philosophy of boosting, putting more effort on difficult instances, with the basis of the random subspace method. Our main contribution is that instead of using a random subspace, we construct a projection taking into account the instances which have posed most difficulties to previous classifiers. In this way, consecutive nonlinear projections are created by a neural network trained using only incorrectly classified instances. The feature subspace induced by the hidden layer of this network is used as the input space to a new classifier. The method is compared with bagging and boosting techniques, showing an improved performance on a large set of 44 problems from the UCI Machine Learning Repository. An additional study showed that the proposed approach is less sensitive to noise in the data than boosting method

    Cooperative coevolution of artificial neural network ensembles for pattern classification

    Get PDF
    This paper presents a cooperative coevolutive approach for designing neural network ensembles. Cooperative coevolution is a recent paradigm in evolutionary computation that allows the effective modeling of cooperative environments. Although theoretically, a single neural network with a sufficient number of neurons in the hidden layer would suffice to solve any problem, in practice many real-world problems are too hard to construct the appropriate network that solve them. In such problems, neural network ensembles are a successful alternative. Nevertheless, the design of neural network ensembles is a complex task. In this paper, we propose a general framework for designing neural network ensembles by means of cooperative coevolution. The proposed model has two main objectives: first, the improvement of the combination of the trained individual networks; second, the cooperative evolution of such networks, encouraging collaboration among them, instead of a separate training of each network. In order to favor the cooperation of the networks, each network is evaluated throughout the evolutionary process using a multiobjective method. For each network, different objectives are defined, considering not only its performance in the given problem, but also its cooperation with the rest of the networks. In addition, a population of ensembles is evolved, improving the combination of networks and obtaining subsets of networks to form ensembles that perform better than the combination of all the evolved networks. The proposed model is applied to ten real-world classification problems of a very different nature from the UCI machine learning repository and proben1 benchmark set. In all of them the performance of the model is better than the performance of standard ensembles in terms of generalization error. Moreover, the size of the obtained ensembles is also smaller

    Diversified Ensemble Classifiers for Highly Imbalanced Data Learning and their Application in Bioinformatics

    Get PDF
    In this dissertation, the problem of learning from highly imbalanced data is studied. Imbalance data learning is of great importance and challenge in many real applications. Dealing with a minority class normally needs new concepts, observations and solutions in order to fully understand the underlying complicated models. We try to systematically review and solve this special learning task in this dissertation.We propose a new ensemble learning framework—Diversified Ensemble Classifiers for Imbal-anced Data Learning (DECIDL), based on the advantages of existing ensemble imbalanced learning strategies. Our framework combines three learning techniques: a) ensemble learning, b) artificial example generation, and c) diversity construction by reversely data re-labeling. As a meta-learner, DECIDL utilizes general supervised learning algorithms as base learners to build an ensemble committee. We create a standard benchmark data pool, which contains 30 highly skewed sets with diverse characteristics from different domains, in order to facilitate future research on imbalance data learning. We use this benchmark pool to evaluate and compare our DECIDL framework with several ensemble learning methods, namely under-bagging, over-bagging, SMOTE-bagging, and AdaBoost. Extensive experiments suggest that our DECIDL framework is comparable with other methods. The data sets, experiments and results provide a valuable knowledge base for future research on imbalance learning. We develop a simple but effective artificial example generation method for data balancing. Two new methods DBEG-ensemble and DECIDL-DBEG are then designed to improve the power of imbalance learning. Experiments show that these two methods are comparable to the state-of-the-art methods, e.g., GSVM-RU and SMOTE-bagging. Furthermore, we investigate learning on imbalanced data from a new angle—active learning. By combining active learning with the DECIDL framework, we show that the newly designed Active-DECIDL method is very effective for imbalance learning, suggesting the DECIDL framework is very robust and flexible.Lastly, we apply the proposed learning methods to a real-world bioinformatics problem—protein methylation prediction. Extensive computational results show that the DECIDL method does perform very well for the imbalanced data mining task. Importantly, the experimental results have confirmed our new contributions on this particular data learning problem
    • …
    corecore