Learning Hybrid Neuro-Fuzzy Classifier Models From Data: To Combine or Not to Combine?
To combine or not to combine? Though not a question of the same gravity as Shakespeare’s “to be or not
to be”, it is examined in this paper in the context of a hybrid neuro-fuzzy pattern classifier design process. A general fuzzy
min-max neural network with its basic learning procedure is used within six different algorithm-independent learning
schemes. Various versions of cross-validation, resampling techniques and data editing approaches, leading to the generation
of a single classifier or a multiple classifier system, are scrutinised and compared. The classification performance on
unseen data, commonly used as a criterion for comparing competing designs, is augmented by four further
criteria that attempt to capture additional characteristics of classifier generation schemes. These include: the ability
to estimate the true classification error rate, the classifier transparency, the computational complexity of the learning
scheme and the potential for adaptation to changing environments and new classes of data. One of the main questions
examined is whether and when to use a single classifier or a combination of a number of component classifiers within a
multiple classifier system.
Data Editing for Neuro-Fuzzy Classifiers
In this paper we investigate the potential benefits and
limitations of various data editing procedures when
constructing neuro-fuzzy classifiers based on hyperbox
fuzzy sets. There are two major aspects of data editing
which we are attempting to exploit: a) removal of outliers
and noisy data; and b) reduction of training data size. We
show that successful training data editing can result in
constructing simpler classifiers (i.e. classifiers with
fewer and larger hyperboxes) with better
generalisation performance. However, we also indicate
the potential dangers of overediting, which can lead to
dropping whole regions of a class and constructing
classifiers too simple to capture the class
boundaries with high enough accuracy. A more flexible
approach than the existing data editing techniques based
on estimating probabilities used to decide whether a
point should be removed from the training set has been
proposed. An analysis and graphical interpretations are
given for synthetic, non-trivial, 2-dimensional
classification problems.
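The idea of editing out noisy training points can be illustrated with a minimal sketch. The abstract does not specify the editing procedure, so the code below assumes a classic nearest-neighbour editing rule (Wilson-style editing): a point is dropped when the majority label of its k nearest neighbours disagrees with its own label. The function names, the choice of k and the toy data are all hypothetical.

```python
from collections import Counter


def knn_label(points, labels, query_idx, k):
    """Majority label among the k nearest neighbours of points[query_idx],
    excluding the query point itself (squared Euclidean distance)."""
    query = points[query_idx]
    dists = []
    for i, p in enumerate(points):
        if i == query_idx:
            continue
        d = sum((a - b) ** 2 for a, b in zip(p, query))
        dists.append((d, i))
    dists.sort()
    votes = Counter(labels[i] for _, i in dists[:k])
    return votes.most_common(1)[0][0]


def wilson_edit(points, labels, k=3):
    """Keep only points whose label agrees with their k-NN majority label.
    This is an assumed, generic editing rule, not the paper's own method."""
    keep = [i for i in range(len(points))
            if knn_label(points, labels, i, k) == labels[i]]
    return [points[i] for i in keep], [labels[i] for i in keep]


# Toy 2-dimensional data: two compact classes plus one mislabelled point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (5, 5), (5, 6), (6, 5), (6, 6),
          (0.5, 0.5)]
labels = [0, 0, 0, 0, 1, 1, 1, 1, 1]  # last label is deliberately noisy

edited_points, edited_labels = wilson_edit(points, labels, k=3)
print(len(edited_points), (0.5, 0.5) in edited_points)
```

With a larger k or sparser classes the same rule can start removing legitimate boundary points, which is the overediting risk the abstract warns about.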
Combining Neuro-Fuzzy Classifiers for Improved Generalisation and Reliability
In this paper a combination of neuro-fuzzy
classifiers for improved classification performance and reliability
is considered. A general fuzzy min-max (GFMM) classifier with
an agglomerative learning algorithm is used as the main building
block. An alternative approach to combining individual classifier
decisions, involving combination at the classifier model level, is
proposed. The resulting classifier complexity and transparency are
comparable with those of classifiers generated during a single cross-validation
procedure, while the improved classification
performance and reduced variance are comparable to those of an ensemble
of classifiers with combined (averaged/voted) decisions. We also
illustrate how combining at the model level can be used for
speeding up the training of GFMM classifiers for large data sets.
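For readers unfamiliar with hyperbox-based classifiers, the sketch below shows one way a fuzzy membership in a hyperbox can be computed. It assumes a ramp-type membership function of the kind commonly described for fuzzy min-max models, restricted to point inputs; the abstract itself does not give the formula, and the names `ramp`, `hyperbox_membership` and the sensitivity parameter `gamma` are illustrative assumptions.

```python
def ramp(r, gamma):
    """Ramp threshold: 0 for r*gamma < 0, linear in between, capped at 1."""
    return min(1.0, max(0.0, r * gamma))


def hyperbox_membership(x, v, w, gamma=1.0):
    """Membership of point x in the hyperbox with min corner v and max corner w.

    Returns 1.0 inside the box and decays linearly with the distance outside
    it, at a rate set by gamma (assumed form, point inputs only)."""
    per_dim = []
    for xi, vi, wi in zip(x, v, w):
        above = 1.0 - ramp(xi - wi, gamma)  # penalty for exceeding the max corner
        below = 1.0 - ramp(vi - xi, gamma)  # penalty for falling below the min corner
        per_dim.append(min(above, below))
    return min(per_dim)


v, w = (0.2, 0.2), (0.5, 0.5)
print(hyperbox_membership((0.3, 0.4), v, w, gamma=2.0))  # inside the box
print(hyperbox_membership((0.7, 0.4), v, w, gamma=2.0))  # slightly outside
print(hyperbox_membership((2.0, 0.4), v, w, gamma=2.0))  # far outside
```

Because each class is described by a set of such min and max corners, models built this way stay inspectable, which is why classifier transparency can be compared across combination schemes.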
Combining Labelled and Unlabelled Data in the Design of Pattern Classification Systems
There has been much interest in applying techniques that incorporate knowledge from unlabelled data
into a supervised learning system, but less effort has been made to compare the effectiveness of different approaches on
real-world problems and to analyse the behaviour of the learning system when using different amounts of unlabelled data.
In this paper an analysis of the performance of supervised methods enhanced by unlabelled data and some semi-supervised
approaches using different ratios of labelled to unlabelled samples is presented. The experimental results
show that, when supported by unlabelled samples, much less labelled data is generally required to build a classifier
without compromising the classification performance. If only a very limited amount of labelled data is available, the
results show high variability and the performance of the final classifier depends more on how reliable the labelled
data samples are than on the use of additional unlabelled data. Semi-supervised clustering utilising both labelled and
unlabelled data has been shown to offer the most significant improvements when natural clusters are present in the
problem considered.
Simulation of Water Distribution Systems
In this paper a software package offering a means of simulating
complex water distribution systems is described. It has been
developed in the course of our investigations into the applicability
of neural networks and fuzzy systems for the implementation of
decision support systems in operational control of industrial
processes, with case studies taken from the water industry.
Examples of how the simulation package has been used in the
design and testing of algorithms for state estimation,
confidence limit analysis and fault detection are presented.
Arguments for using suitable graphical visualization techniques
in solving problems like meter placement or leakage diagnosis are
also given and supported by a set of examples.
Application of Computational Intelligence Techniques to Process Industry Problems
In the last two decades there has been considerable progress in the computational
intelligence research field. The fruits of this research effort
are powerful techniques for pattern recognition, data mining, data modelling, etc.
These techniques achieve high performance on traditional data sets like those in the UCI
machine learning database. Unfortunately, data sources of this kind usually contain
clean data, free of problems such as outliers, missing values and feature co-linearity
that are common in real-life industrial data. The presence of faulty data samples can have
very harmful effects on the models: for example, if presented during training,
they can either cause sub-optimal performance of the trained model or, in the worst
case, destroy the knowledge the model has learnt so far. For these reasons the application
of present modelling techniques to industrial problems has developed into a research
field of its own. Based on a discussion of the properties and issues of the data and the
state-of-the-art modelling techniques in the process industry, in this paper a novel
unified approach to the development of predictive models in the process industry is
presented.
Analysis of the Correlation Between Majority Voting Error and the Diversity Measures in Multiple Classifier Systems
Combining classifiers by majority voting (MV) has
recently emerged as an effective way of improving
performance of individual classifiers. However, the
usefulness of applying MV is not always observed and
depends on the distribution of classification outputs in a
multiple classifier system (MCS). Evaluation of MV
errors (MVE) for all combinations of classifiers in an MCS
is a process of exponential complexity.
Reduction of this complexity can be achieved provided
an explicit relationship between MVE and some other
less complex function operating on classifier outputs is
found. Diversity measures operating on binary
classification outputs (correct/incorrect) are studied in
this paper as potential candidates for such functions.
Their correlation with MVE, interpreted as the quality
of a measure, is thoroughly investigated using artificial
and real-world datasets. Moreover, we propose a new
diversity measure that efficiently exploits information
coming from the whole MCS, rather than only the part to
which it is applied.
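The two quantities being correlated can be made concrete with a small sketch operating on binary (correct/incorrect) outputs. It assumes that a majority vote counts as correct only when strictly more than half of the classifiers are correct, and it uses the pairwise disagreement measure as a stand-in diversity measure; the new measure proposed in the paper is not reproduced here. The function names and the toy output matrix are hypothetical.

```python
from itertools import combinations


def majority_vote_error(oracle):
    """oracle[c][s] is 1 if classifier c is correct on sample s, else 0.
    A sample counts as an MV error unless a strict majority is correct
    (assumed, pessimistic treatment of ties)."""
    n_clf = len(oracle)
    n_samples = len(oracle[0])
    wrong = 0
    for s in range(n_samples):
        correct_votes = sum(oracle[c][s] for c in range(n_clf))
        if 2 * correct_votes <= n_clf:
            wrong += 1
    return wrong / n_samples


def mean_disagreement(oracle):
    """Average, over all classifier pairs, of the fraction of samples on
    which exactly one of the two classifiers is correct."""
    pairs = list(combinations(range(len(oracle)), 2))
    total = 0.0
    for a, b in pairs:
        differ = sum(1 for x, y in zip(oracle[a], oracle[b]) if x != y)
        total += differ / len(oracle[a])
    return total / len(pairs)


# Three classifiers, four samples; each classifier alone has 50% error.
oracle = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 1, 0],
]
print(majority_vote_error(oracle))  # 0.25
print(mean_disagreement(oracle))    # 0.5
```

Evaluating `majority_vote_error` for every subset of a pool of L classifiers requires on the order of 2^L evaluations, whereas a pairwise measure needs only L(L-1)/2 comparisons, which is the motivation for seeking a cheap measure that tracks MVE.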
Nature-Inspired Learning Models
Intelligent learning mechanisms found in the natural world are still unsurpassed in their learning performance and efficiency in dealing with uncertain information coming in a variety of forms, yet remain under continuous challenge
from human-driven artificial intelligence methods. This work intends to demonstrate how the phenomena observed in the physical world can be directly used to guide artificial learning models. An inspiration for the new
learning methods has been found in the mechanics of physical fields found at both micro and macro scales.
Exploiting the analogies between data and particles subjected to gravity, electrostatic and gas particle fields, new algorithms have been developed and applied to classification and clustering, while the properties of the
field are further reused in regression and in the visualisation of classification and classifier fusion. The paper covers extensive pictorial examples and visual interpretations of the presented techniques, along with some testing on
well-known real and artificial datasets, compared where possible with traditional methods.
Adaptive Mechanisms in an Airline Ticket Demand Forecasting System
Adaptivity is a very important feature for industrial forecast systems. In the airline industry, reliable forecasting
of the demand for tickets at different fare levels forms a crucial step in a global optimization process, the objective of which is
to sell a restricted number of available seats in a plane with maximized revenue. Due to continuously changing demand
caused by seasonality, special events like holidays or fairs, changes in the flight schedules or changes in the political or cultural
situation of a country, there is a need for robust, adaptive forecasting techniques able to cope with such changes. In this paper
an overview of various adaptive mechanisms used in the new forecasting system of the Lufthansa Airline is presented.