6,356 research outputs found
Investigating Evaluation Measures in Ant Colony Algorithms for Learning Decision Tree Classifiers
Ant-Tree-Miner is a decision tree induction algorithm that is based on the Ant Colony Optimization (ACO) meta- heuristic. Ant-Tree-Miner-M is a recently introduced extension of Ant-Tree-Miner that learns multi-tree classification models. A multi-tree model consists of multiple decision trees, one for each class value, where each class-based decision tree is responsible for discriminating between its class value and all other values present in the class domain (one vs. all). In this paper, we investigate the use of 10 different classification quality evaluation measures in Ant-Tree-Miner-M, which are used for both candidate model evaluation and model pruning. Our experimental results, using 40 popular benchmark datasets, identify several quality functions that substantially improve on the simple Accuracy quality function that was previously used in Ant-Tree-Miner-M
Learning Bayesian network equivalence classes using ant colony optimisation
Bayesian networks have become an indispensable tool in the modelling of uncertain
knowledge. Conceptually, they consist of two parts: a directed acyclic graph called the
structure, and conditional probability distributions attached to each node known as the
parameters. As a result of their expressiveness, understandability and rigorous mathematical basis, Bayesian networks have become one of the first methods investigated,
when faced with an uncertain problem domain. However, a recurring problem persists
in specifying a Bayesian network. Both the structure and parameters can be difficult for
experts to conceive, especially if their knowledge is tacit.To counteract these problems, research has been ongoing, on learning both the structure
and parameters of Bayesian networks from data. Whilst there are simple methods for
learning the parameters, learning the structure has proved harder. Part ofthis stems from
the NP-hardness of the problem and the super-exponential space of possible structures.
To help solve this task, this thesis seeks to employ a relatively new technique, that has
had much success in tackling NP-hard problems. This technique is called ant colony
optimisation. Ant colony optimisation is a metaheuristic based on the behaviour of ants
acting together in a colony. It uses the stochastic activity of artificial ants to find good
solutions to combinatorial optimisation problems. In the current work, this method is
applied to the problem of searching through the space of equivalence classes of Bayesian
networks, in order to find a good match against a set of data. The system uses operators
that evaluate potential modifications to a current state. Each of the modifications is
scored and the results used to inform the search. In order to facilitate these steps, other
techniques are also devised, to speed up the learning process. The techniques includeThe techniques are tested by sampling data from gold standard networks and learning
structures from this sampled data. These structures are analysed using various goodnessof-fit measures to see how well the algorithms perform. The measures include structural
similarity metrics and Bayesian scoring metrics. The results are compared in depth
against systems that also use ant colony optimisation and other methods, including
evolutionary programming and greedy heuristics. Also, comparisons are made to well
known state-of-the-art algorithms and a study performed on a real-life data set. The
results show favourable performance compared to the other methods and on modelling
the real-life data
Problem dependent metaheuristic performance in Bayesian network structure learning.
Bayesian network (BN) structure learning from data has been an active research area in the machine learning field in recent decades. Much of the research has considered BN structure learning as an optimization problem. However, the finding of optimal BN from data is NP-hard. This fact has driven the use of heuristic algorithms for solving this kind of problem. Amajor recent focus in BN structure learning is on search and score algorithms. In these algorithms, a scoring function is introduced and a heuristic search algorithm is used to evaluate each network with respect to the training data. The optimal network is produced according to the best score evaluated. This thesis investigates a range of search and score algorithms to understand the relationship between technique performance and structure features of the problems. The main contributions of this thesis include (a) Two novel Ant Colony Optimization based search and score algorithms for BN structure learning; (b) Node juxtaposition distribution for studying the relationship between the best node ordering and the optimal BN structure; (c) Fitness landscape analysis for investigating the di erent performances of both chain score function and the CH score function; (d) A classifier method is constructed by utilizing receiver operating characteristic curve with the results on fitness landscape analysis; and finally (e) a selective o -line hyperheuristic algorithm is built for unseen BN structure learning with search and score algorithms. In this thesis, we also construct a new algorithm for producing BN benchmark structures and apply our novel approaches to a range of benchmark problems and real world problem
AI Solutions for MDS: Artificial Intelligence Techniques for Misuse Detection and Localisation in Telecommunication Environments
This report considers the application of Articial Intelligence (AI) techniques to
the problem of misuse detection and misuse localisation within telecommunications
environments. A broad survey of techniques is provided, that covers inter alia
rule based systems, model-based systems, case based reasoning, pattern matching,
clustering and feature extraction, articial neural networks, genetic algorithms, arti
cial immune systems, agent based systems, data mining and a variety of hybrid
approaches. The report then considers the central issue of event correlation, that
is at the heart of many misuse detection and localisation systems. The notion of
being able to infer misuse by the correlation of individual temporally distributed
events within a multiple data stream environment is explored, and a range of techniques,
covering model based approaches, `programmed' AI and machine learning
paradigms. It is found that, in general, correlation is best achieved via rule based approaches,
but that these suffer from a number of drawbacks, such as the difculty of
developing and maintaining an appropriate knowledge base, and the lack of ability
to generalise from known misuses to new unseen misuses. Two distinct approaches
are evident. One attempts to encode knowledge of known misuses, typically within
rules, and use this to screen events. This approach cannot generally detect misuses
for which it has not been programmed, i.e. it is prone to issuing false negatives.
The other attempts to `learn' the features of event patterns that constitute normal
behaviour, and, by observing patterns that do not match expected behaviour, detect
when a misuse has occurred. This approach is prone to issuing false positives,
i.e. inferring misuse from innocent patterns of behaviour that the system was not
trained to recognise. Contemporary approaches are seen to favour hybridisation,
often combining detection or localisation mechanisms for both abnormal and normal
behaviour, the former to capture known cases of misuse, the latter to capture
unknown cases. In some systems, these mechanisms even work together to update
each other to increase detection rates and lower false positive rates. It is concluded
that hybridisation offers the most promising future direction, but that a rule or state
based component is likely to remain, being the most natural approach to the correlation
of complex events. The challenge, then, is to mitigate the weaknesses of
canonical programmed systems such that learning, generalisation and adaptation
are more readily facilitated
Learning Bayesian network equivalence classes with ant colony optimization
Bayesian networks are a useful tool in the representation of uncertain knowledge. This paper proposes a new algorithm called ACO-E, to learn the structure of a Bayesian network. It does this by conducting a search through the space of equivalence classes of Bayesian networks using Ant Colony Optimization (ACO). To this end, two novel extensions of traditional ACO techniques are proposed and implemented. Firstly, multiple types of moves are allowed. Secondly, moves can be given in terms of indices that are not based on construction graph nodes. The results of testing show that ACO-E performs better than a greedy search and other state-of-the-art and metaheuristic algorithms whilst searching in the space of equivalence classe
- …