An Approach to Pattern Recognition by Evolutionary Computation
Evolutionary Computation (EC) is inspired by the natural phenomenon of evolution. It provides a quite general heuristic that exploits a few basic concepts: reproduction of individuals, variation phenomena that affect the likelihood of survival of individuals, and inheritance of parents' features by offspring. In recent years EC has been widely used to solve hard, nonlinear and very complex problems effectively.
Among other uses, EC-based algorithms have also been applied to classification problems. Classification is the process by which an object is attributed to one of a finite set of classes or, in other words, is recognized as belonging to a set of equal or similar entities identified by a label. Arguably, the main aspect of classification concerns the generation of prototypes to be used to recognize unknown patterns. The role of prototypes is to represent patterns belonging to the different classes defined within a given problem. For most problems of practical interest, generating such prototypes is very hard, since a prototype must be able to represent patterns belonging to the same class, which may be significantly dissimilar from each other. It must also be able to discriminate patterns belonging to classes other than the one it represents. Moreover, a prototype should contain the minimum amount of information required to satisfy these requirements. The research presented in this thesis has led to the definition of an EC-based framework for prototype generation. The framework does not prescribe any particular kind of prototype: it can generate any kind once an encoding scheme for the chosen prototypes has been defined. The generality of the framework can be exploited to develop many applications. The framework has been employed to implement two specific applications for prototype generation.
The developed applications have been tested on several data sets, and the results have been compared with those obtained by other approaches previously presented in the literature.
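As a rough illustration of how prototypes can be used to recognize unknown patterns, the sketch below assigns a pattern to the class of its nearest prototype. The function name `classify`, the Euclidean distance, and the hand-picked prototype vectors are all assumptions made for illustration; the abstract leaves the kind of prototype open, and this is not the thesis's framework.

```python
import math


def classify(pattern, prototypes):
    """Return the label of the prototype closest to `pattern`.

    `prototypes` is a list of (label, vector) pairs. Euclidean distance
    is an assumed similarity measure, chosen only for illustration.
    """
    best_label, best_dist = None, math.inf
    for label, vector in prototypes:
        dist = math.dist(pattern, vector)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


# Hypothetical prototypes: one per class, hand-picked rather than evolved.
prototypes = [("A", (0.0, 0.0)), ("B", (5.0, 5.0))]
print(classify((1.0, 0.5), prototypes))
```

In an evolutionary setting, the prototype vectors would be the individuals being evolved, and a function like this could serve as part of the fitness evaluation.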
Error management in ATLAS TDAQ : an intelligent systems approach
This thesis is concerned with the use of intelligent system techniques (IST) within
a large distributed software system, specifically the ATLAS TDAQ system which
has been developed and is currently in use at the European Laboratory for Particle
Physics (CERN). The overall aim is to investigate and evaluate a range of IST
techniques in order to improve the error management system (EMS) currently used
within the TDAQ system via error detection and classification. The thesis work
will provide a reference for future research and development of such methods in the
TDAQ system.
The thesis begins by describing the TDAQ system and the existing EMS, with a
focus on the underlying expert system approach, in order to identify areas where
improvements can be made using IST techniques. It then discusses measures of
evaluating error detection and classification techniques and the factors specific to
the TDAQ system.
Error conditions are then simulated in a controlled manner using an experimental
setup, and datasets are gathered from two different sources. Analysis and processing
of the datasets using statistical and IST techniques show that clusters exist in
the data corresponding to the different simulated errors.
Different IST techniques are applied to the gathered datasets in order to realise an
error detection model. These techniques include Artificial Neural Networks (ANNs),
Support Vector Machines (SVMs) and Cartesian Genetic Programming (CGP) and
a comparison of the respective advantages and disadvantages is made.
The principal conclusions from this work are that IST can be successfully used to
detect errors in the ATLAS TDAQ system and thus can provide a tool to improve
the overall error management system. It is of particular importance that the IST can
be used without having a detailed knowledge of the system, as the ATLAS TDAQ
is too complex for any single person to understand completely. The results
of this research will benefit researchers developing and evaluating IST techniques in
similar large-scale distributed systems.
Bioinformatics Applications Based On Machine Learning
The great advances in information technology (IT) have implications for many sectors, such as bioinformatics, and have considerably increased their possibilities. This book presents a collection of 11 original research papers, all related to the application of IT-related techniques within the bioinformatics sector, ranging from new applications created by adapting and applying existing techniques to new methodologies for solving existing problems.
Mining Aircraft Telemetry Data With Evolutionary Algorithms
The Ganged Phased Array Radar - Risk Mitigation System (GPAR-RMS) was a
mobile ground-based sense-and-avoid system for Unmanned Aircraft System (UAS)
operations developed by the University of North Dakota. GPAR-RMS detected proximate
aircraft with various sensor systems, including a 2D radar and an Automatic Dependent
Surveillance - Broadcast (ADS-B) receiver. Information about those aircraft was then
displayed to UAS operators via visualization software developed by the University of
North Dakota. The Risk Mitigation (RM) subsystem for GPAR-RMS was designed to
estimate the current risk of midair collision, between the Unmanned Aircraft (UA) and a
General Aviation (GA) aircraft flying under Visual Flight Rules (VFR) in the surrounding
airspace, for UAS operations in Class E airspace (i.e. below 18,000 feet MSL). However,
accurate probabilistic models for the behavior of pilots of GA aircraft flying under VFR
in Class E airspace were needed before the RM subsystem could be implemented.
In this dissertation the author presents the results of data mining an aircraft
telemetry data set from a consecutive nine month period in 2011. This aircraft telemetry
data set consisted of Flight Data Monitoring (FDM) data obtained from Garmin G1000
devices onboard every Cessna 172 in the University of North Dakota's training fleet.
Data from aircraft which were potentially within the controlled airspace surrounding
controlled airports were excluded. Also, GA aircraft in the FDM data flying in Class E
airspace were assumed to be flying under VFR, which is usually a valid assumption.
Complex subpaths were discovered from the aircraft telemetry data set using a novel
application of an ant colony algorithm. Then, probabilistic models were data mined from
those subpaths using extensions of the Genetic K-Means (GKA) and Expectation-
Maximization (EM) algorithms.
The results obtained from the subpath discovery and data mining suggest a pilot
flying a GA aircraft near to an uncontrolled airport will perform different maneuvers than
a pilot flying a GA aircraft far from an uncontrolled airport, irrespective of the altitude of
the GA aircraft. However, since only aircraft telemetry data from the University of North
Dakota's training fleet were data mined, these results are not likely to be applicable to GA
aircraft operating in a non-training environment.
A Field Guide to Genetic Programming
xiv, 233 p. : ill. ; 23 cm. Electronic book.
A Field Guide to Genetic Programming (ISBN 978-1-4092-0073-4) is an introduction to genetic programming (GP). GP is a systematic, domain-independent method for getting computers to solve problems automatically starting from a high-level statement of what needs to be done. Using ideas from natural evolution, GP starts from an ooze of random computer programs, and progressively refines them through processes of mutation and sexual recombination, until solutions emerge. All this without the user having to know or specify the form or structure of solutions in advance. GP has generated a plethora of human-competitive results and applications, including novel scientific discoveries and patentable inventions. The authors
Introduction --
Representation, initialisation and operators in Tree-based GP --
Getting ready to run genetic programming --
Example genetic programming run --
Alternative initialisations and operators in Tree-based GP --
Modular, grammatical and developmental Tree-based GP --
Linear and graph genetic programming --
Probabilistic genetic programming --
Multi-objective genetic programming --
Fast and distributed genetic programming --
GP theory and its applications --
Applications --
Troubleshooting GP --
Conclusions.
Contents
1 Introduction
1.1 Genetic Programming in a Nutshell
1.2 Getting Started
1.3 Prerequisites
1.4 Overview of this Field Guide
I Basics
2 Representation, Initialisation and Operators in Tree-based GP
2.1 Representation
2.2 Initialising the Population
2.3 Selection
2.4 Recombination and Mutation
3 Getting Ready to Run Genetic Programming
3.1 Step 1: Terminal Set
3.2 Step 2: Function Set
3.2.1 Closure
3.2.2 Sufficiency
3.2.3 Evolving Structures other than Programs
3.3 Step 3: Fitness Function
3.4 Step 4: GP Parameters
3.5 Step 5: Termination and solution designation
4 Example Genetic Programming Run
4.1 Preparatory Steps
4.2 Step-by-Step Sample Run
4.2.1 Initialisation
4.2.2 Fitness Evaluation
4.2.3 Selection, Crossover and Mutation
4.2.4 Termination and Solution Designation
II Advanced Genetic Programming
5 Alternative Initialisations and Operators in Tree-based GP
5.1 Constructing the Initial Population
5.1.1 Uniform Initialisation
5.1.2 Initialisation may Affect Bloat
5.1.3 Seeding
5.2 GP Mutation
5.2.1 Is Mutation Necessary?
5.2.2 Mutation Cookbook
5.3 GP Crossover
5.4 Other Techniques
6 Modular, Grammatical and Developmental Tree-based GP
6.1 Evolving Modular and Hierarchical Structures
6.1.1 Automatically Defined Functions
6.1.2 Program Architecture and Architecture-Altering
6.2 Constraining Structures
6.2.1 Enforcing Particular Structures
6.2.2 Strongly Typed GP
6.2.3 Grammar-based Constraints
6.2.4 Constraints and Bias
6.3 Developmental Genetic Programming
6.4 Strongly Typed Autoconstructive GP with PushGP
7 Linear and Graph Genetic Programming
7.1 Linear Genetic Programming
7.1.1 Motivations
7.1.2 Linear GP Representations
7.1.3 Linear GP Operators
7.2 Graph-Based Genetic Programming
7.2.1 Parallel Distributed GP (PDGP)
7.2.2 PADO
7.2.3 Cartesian GP
7.2.4 Evolving Parallel Programs using Indirect Encodings
8 Probabilistic Genetic Programming
8.1 Estimation of Distribution Algorithms
8.2 Pure EDA GP
8.3 Mixing Grammars and Probabilities
9 Multi-objective Genetic Programming
9.1 Combining Multiple Objectives into a Scalar Fitness Function
9.2 Keeping the Objectives Separate
9.2.1 Multi-objective Bloat and Complexity Control
9.2.2 Other Objectives
9.2.3 Non-Pareto Criteria
9.3 Multiple Objectives via Dynamic and Staged Fitness Functions
9.4 Multi-objective Optimisation via Operator Bias
10 Fast and Distributed Genetic Programming
10.1 Reducing Fitness Evaluations/Increasing their Effectiveness
10.2 Reducing Cost of Fitness with Caches
10.3 Parallel and Distributed GP are Not Equivalent
10.4 Running GP on Parallel Hardware
10.4.1 Master-slave GP
10.4.2 GP Running on GPUs
10.4.3 GP on FPGAs
10.4.4 Sub-machine-code GP
10.5 Geographically Distributed GP
11 GP Theory and its Applications
11.1 Mathematical Models
11.2 Search Spaces
11.3 Bloat
11.3.1 Bloat in Theory
11.3.2 Bloat Control in Practice
III Practical Genetic Programming
12 Applications
12.1 Where GP has Done Well
12.2 Curve Fitting, Data Modelling and Symbolic Regression
12.3 Human Competitive Results – the Humies
12.4 Image and Signal Processing
12.5 Financial Trading, Time Series, and Economic Modelling
12.6 Industrial Process Control
12.7 Medicine, Biology and Bioinformatics
12.8 GP to Create Searchers and Solvers – Hyper-heuristics
12.9 Entertainment and Computer Games
12.10 The Arts
12.11 Compression
13 Troubleshooting GP
13.1 Is there a Bug in the Code?
13.2 Can you Trust your Results?
13.3 There are No Silver Bullets
13.4 Small Changes can have Big Effects
13.5 Big Changes can have No Effect
13.6 Study your Populations
13.7 Encourage Diversity
13.8 Embrace Approximation
13.9 Control Bloat
13.10 Checkpoint Results
13.11 Report Well
13.12 Convince your Customers
14 Conclusions
IV Tricks of the Trade
A Resources
A.1 Key Books
A.2 Key Journals
A.3 Key International Meetings
A.4 GP Implementations
A.5 On-Line Resources
B TinyGP
B.1 Overview of TinyGP
B.2 Input Data Files for TinyGP
B.3 Source Code
B.4 Compiling and Running TinyGP
Bibliography
Inde
Data mining using neural networks
Data mining is the search for relationships and global patterns in large and growing databases. It is beneficial for anyone who holds a large amount of data, for example customer and business, transaction, marketing, financial, manufacturing and web data. The results of data mining are also referred to as knowledge, in the form of rules, regularities and constraints. Rule mining is one of the popular data mining methods, since rules provide concise statements of potentially important information that are easily understood by end users and also represent actionable patterns. Rule mining has received a good deal of attention and enthusiasm from data mining researchers because it is capable of solving many data mining problems such as classification, association, customer profiling, summarization, segmentation and many others. This thesis makes several contributions by proposing rule mining methods using genetic algorithms and neural networks. The thesis first proposes rule mining methods using a genetic algorithm. These methods are based on an integrated framework but are capable of mining three major classes of rules. Moreover, the rule mining processes in these methods are controlled by tuning two data mining measures, support and confidence. The thesis shows how to build predictive data mining models using the rules produced by the proposed methods. Another key contribution of the thesis is the proposal of rule mining methods using supervised neural networks. The thesis mathematically analyses the Widrow-Hoff learning algorithm of a single-layered neural network, which results in a foundation for rule mining algorithms using single-layered neural networks. Three rule mining algorithms using single-layered neural networks are proposed for the three major classes of rules on the basis of the proposed theorems. The thesis also looks at the problem of rule mining where user guidance is absent.
The thesis proposes a guided rule mining system to overcome this problem. The thesis extends this work further by comparing the performance of the algorithm used in the proposed guided rule mining system with the Apriori data mining algorithm. Finally, the thesis studies the Kohonen self-organizing map as an unsupervised neural network for rule mining algorithms. Two approaches are adopted, based on how self-organizing maps are applied in rule mining models. In the first approach, a self-organizing map is used for clustering, which provides class information to the rule mining process. In the second approach, automated rule mining takes the place of trained neurons as the map grows in a hierarchical structure.
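The two measures named above, support and confidence, have standard definitions for a rule "antecedent → consequent" over a set of transactions. The sketch below computes them under those standard definitions; the function names and the toy transactions are assumptions for illustration, not material from the thesis.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """support(antecedent and consequent together) / support(antecedent)."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0  # rule never fires; report zero rather than divide by zero
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / base


# Hypothetical market-basket data.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
print(support(transactions, {"bread", "milk"}))
print(confidence(transactions, {"bread"}, {"milk"}))
```

A rule miner would typically keep only rules whose support and confidence both exceed user-chosen thresholds, which is one way these measures can control the mining process.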