285 research outputs found
Linear and Order Statistics Combiners for Pattern Classification
Several researchers have experimentally shown that substantial improvements
can be obtained in difficult pattern recognition problems by combining or
integrating the outputs of multiple classifiers. This chapter provides an
analytical framework to quantify the improvements in classification results due
to combining. The results apply to both linear combiners and order statistics
combiners. We first show that to a first order approximation, the error rate
obtained over and above the Bayes error rate, is directly proportional to the
variance of the actual decision boundaries around the Bayes optimum boundary.
Combining classifiers in output space reduces this variance, and hence reduces
the "added" error. If N unbiased classifiers are combined by simple averaging,
the added error rate can be reduced by a factor of N if the individual errors
in approximating the decision boundaries are uncorrelated. Expressions are then
derived for linear combiners which are biased or correlated, and the effect of
output correlations on ensemble performance is quantified. For order statistics
based non-linear combiners, we derive expressions that indicate how much the
median, the maximum and in general the ith order statistic can improve
classifier performance. The analysis presented here facilitates the
understanding of the relationships among error rates, classifier boundary
distributions, and combining in output space. Experimental results on several
public domain data sets are provided to illustrate the benefits of combining
and to support the analytical results.Comment: 31 page
Analysis of the Correlation Between Majority Voting Error and the Diversity Measures in Multiple Classifier Systems
Combining classifiers by majority voting (MV) has
recently emerged as an effective way of improving
performance of individual classifiers. However, the
usefulness of applying MV is not always observed and
is subject to distribution of classification outputs in a
multiple classifier system (MCS). Evaluation of MV
errors (MVE) for all combinations of classifiers in MCS
is a complex process of exponential complexity.
Reduction of this complexity can be achieved provided
the explicit relationship between MVE and any other
less complex function operating on classifier outputs is
found. Diversity measures operating on binary
classification outputs (correct/incorrect) are studied in
this paper as potential candidates for such functions.
Their correlation with MVE, interpreted as the quality
of a measure, is thoroughly investigated using artificial
and real-world datasets. Moreover, we propose new
diversity measure efficiently exploiting information
coming from the whole MCS, rather than its part, for
which it is applied
Recommended from our members
Predicting business failure using artificial intelligence system
This thesis was submitted for the award of Doctor of Philosophy and was awarded by Brunel University LondonPredicting business insolvency is considered one of the main supportive sources of information
for decision making for financial institutions, investors, creditors, and other participants in the
business market. Financial reporting systems provide relevant information that can be used to
assess the financial position of firms. It is crucial to have classification and prediction models
that can analyse this financial information and provide accurate assurance for users about
business health. Recent studies have explored the use of machine learning tools as substitute
for traditional statistical methods to develop classification models to classify firm insolvency
according to financial statement information. However, these models have no ideal classifier,
since each provides a certain percentage of wrong outputs, which is a crucial consideration;
every percentage of wrong response can mean massive financial losses for stakeholders.
Therefore, this study proposes new insolvency classification and perdition models based on
machine learning modelling techniques to develop an improved classifier.
Individual modelling techniques using statistical methods and machine learning were used to
develop the classification model of business insolvency. The results showed that machine
learning method outperformed statistical methods. Deep Learning (DPL) achieved the highest
performance based on all performance measurements used in the study, and it was the best
individual classifier, with average accuracy of 97.2% using all-years dataset. Ensemble-
Boosted Decision Tree classifier ranked second, followed by Decision Tree classifier. Thus, it
has been proven that DPL modelling approach is useful for business insolvency classification.
A key contribution in enhancing individual classifier outputs is the use of traditional combining
methods with two new aggregation methods in business insolvency (Fuzzy Logic and
Consensus Approach). The Consensus Approach showed the best improvement in the results
of all individual classifiers with average accuracy of 97.7%, and it is considered the best
classification method not only in comparison with individual classifiers, but also with
traditional combiners.
This study pioneers the development of a time series business insolvency prediction model
with Big Data for UK businesses. The aim of the model is to provide early prediction about a
business health. Three prediction models were developed based on Nonlinear Autoregressive
with Exogenous Input models (NARX), Nonlinear Autoregressive Neural Network (NAR),
and Deep Learning Time-series model (DPL-SA) and achieved average accuracy rates of
83.6%, 89.5%, and 91.35%, respectively. The results show relatively high performance in
comparison with the best individual classifier (deep learning)
Multiple classifiers in biometrics. part 1: Fundamentals and review
We provide an introduction to Multiple Classifier Systems (MCS) including basic nomenclature and describing key elements: classifier dependencies, type of classifier outputs, aggregation procedures, architecture, and types of methods. This introduction complements other existing overviews of MCS, as here we also review the most prevalent theoretical framework for MCS and discuss theoretical developments related to MCS
The introduction to MCS is then followed by a review of the application of MCS to the particular field of multimodal biometric person authentication in the last 25 years, as a prototypical area in which MCS has resulted in important achievements. This review includes general descriptions of successful MCS methods and architectures in order to facilitate the export of them to other information fusion problems.
Based on the theory and framework introduced here, in the companion paper we then develop in more technical detail recent trends and developments in MCS from multimodal biometrics that incorporate context information in an adaptive way. These new MCS architectures exploit input quality measures and pattern-specific particularities that move apart from general population statistics, resulting in robust multimodal biometric systems. Similarly as in the present paper, methods in the companion paper are introduced in a general way so they can be applied to other information fusion problems as well. Finally, also in the companion paper, we discuss open challenges in biometrics and the role of MCS to advance themThis work was funded by projects CogniMetrics (TEC2015-70627-R)
from MINECO/FEDER and RiskTrakc (JUST-2015-JCOO-AG-1). Part of thisthis work was conducted during a research visit of J.F. to Prof. Ludmila Kuncheva at Bangor University (UK) with STSM funding from COST CA16101 (MULTI-FORESEE
Evolving Ensembles with TPOT
Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Data ScienceMachine learning has become popular in recent years as a solution to various problems such as fraud detection, weather prediction, improve diagnosis accuracy, and more. One of its goals is to find the model that best explains the problem. Among the several alternatives on how to accomplish that, significant attention has been laid on the matter of accuracy using stacking ensembles: the objective is to produce a more accurate prediction by combining the predictions of various estimators. This model has often been exhibiting a superior performance in contrast to its single counterparts. Because the process of choosing the best model for a given problem can be time-consuming, a necessity to automatize the machine learning process has emerged. Different tools allow this, including TPOT, a Python library that uses genetic programming to optimize the machine learning process, evolving pipelines randomly created until the best one is found, or a previously fixed maximum number of generations for the given problem is reached. Genetic programming is a field of machine learning that uses evolutionary algorithms to generate new computer programs, and it has been shown successful in quite a few applications. TPOT uses several machine learning algorithms from the Sklearn Python library. It also features some ensembles, such as Random Forest or AdaBoost. Currently, stacking ensembles are not implemented yet on TPOT, and, considering its current accuracy rates, the objective of this thesis is to implement stacking ensembles in TPOT. After we implemented stacking ensembles successfully in TPOT, we performed some experiments with different datasets and noticed that for almost all of them, TPOT has comparable performance to TPOT with stacking ensembles. Also, we observed that, when using the light dictionary version of TPOT, the results of the Stacking configuration improved for two datasets since it used weaker learners
Local learning for multi-layer, multi-component predictive system
This study introduces a new multi-layer multi-component ensemble. The components of this ensemble are trained locally on subsets of features for disjoint sets of data. The data instances are assigned to local regions using the similarity of their features pairwise squared correlation. Many ensemble methods encourage diversity among their base predictors by training them on different subsets of data or different subsets of features. In the proposed architecture the local regions contain disjoint sets of data and for this data only the most similar features are selected. The pairwise squared correlations of the features are used to weight the predictions of the ensemble's models. The proposed architecture has been tested on a number of data sets and its performance was compared to five benchmark algorithms. The results showed that the testing accuracy of the developed architecture is comparable to the rotation forest and is better than the other benchmark algorithms
- …