A Compass to Guide Genetic Algorithms
Parameter control is a key issue in enhancing the performance of Genetic Algorithms (GAs). Although many studies exist on this problem, it is rarely addressed in a general way; consequently, in practice, parameters are often adjusted manually. Some generic approaches have been explored that look at the recent improvements provided by the operators. In this paper, we extend this approach by also accounting for the operators' effect on population diversity and computation time. Our controller, named Compass, provides an abstraction of a GA's parameters that allows the user to directly adjust the balance between exploration and exploitation of the search space. The approach is then evaluated on a classic combinatorial problem (SAT).
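The Compass idea of scoring operators by their combined effect on quality, diversity, and time can be sketched as a per-application credit function. The formula below is an illustrative reconstruction, not the paper's exact definition: the user-chosen angle theta sets the exploration/exploitation balance, and each operator application is scored by projecting its measured effects onto that direction, penalised by execution time.

```python
import math

def compass_credit(delta_quality, delta_diversity, exec_time, theta):
    """Compass-style credit for one operator application (hedged sketch).

    theta = 0 rewards pure quality gains (exploitation);
    theta = pi/2 rewards pure diversity gains (exploration).
    All argument names are illustrative, not the paper's notation.
    """
    # Project the operator's (quality change, diversity change) effect
    # onto the user-chosen exploration/exploitation direction.
    projection = (delta_quality * math.cos(theta)
                  + delta_diversity * math.sin(theta))
    # Penalise slow operators; guard against a zero execution time.
    return projection / max(exec_time, 1e-9)
```

A controller built on such a credit would then route more applications to operators with higher recent scores, which is what lets a single angle parameter abstract away many individual GA parameters.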
An Experimental Study of Adaptive Control for Evolutionary Algorithms
The balance of exploration versus exploitation (EvE) is a key issue in
evolutionary computation. In this paper we investigate how an adaptive
controller designed to perform Operator Selection can be used to dynamically
manage the EvE balance required by the search, showing that the search
strategies determined by this control paradigm improve the quality of the
solutions found by the evolutionary algorithm.
Reinforcement Learning for Mutation Operator Selection in Automated Program Repair
Automated program repair techniques aim to aid software developers with the
challenging task of fixing bugs. In heuristic-based program repair, a search
space of program variants is created by applying mutation operations on the
source code to find potential patches for bugs. Most commonly, every selection
of a mutation operator during search is performed uniformly at random. The
inefficiency of this critical step in the search creates many variants that do
not compile or break intended functionality, wasting considerable resources as
a result. In this paper, we address this issue and propose a reinforcement
learning-based approach to optimise the selection of mutation operators in
heuristic-based program repair. Our solution is programming language,
granularity-level, and search strategy agnostic and allows for easy
augmentation into existing heuristic-based repair tools. We conduct extensive
experimentation on four operator selection techniques, two reward types, two
credit assignment strategies, two integration methods, and three sets of
mutation operators using 22,300 independent repair attempts. We evaluate our
approach on 353 real-world bugs from the Defects4J benchmark. Results show that
the epsilon-greedy multi-armed bandit algorithm with average credit assignment
is best for mutation operator selection. Our approach yields a 17.3%
improvement over the baseline, generating patches for 9 additional bugs for
a total of 61 patched bugs in the Defects4J benchmark.
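The winning scheme the abstract identifies, an epsilon-greedy multi-armed bandit with average credit assignment, can be sketched compactly. This is a minimal illustration: the operator names, reward scale, and epsilon value are assumptions, not details from the paper.

```python
import random

class EpsilonGreedyOperatorSelector:
    """Epsilon-greedy bandit with average credit assignment (sketch)."""

    def __init__(self, operators, epsilon=0.1, seed=None):
        self.operators = list(operators)
        self.epsilon = epsilon
        self.counts = {op: 0 for op in self.operators}
        self.avg_reward = {op: 0.0 for op in self.operators}
        self.rng = random.Random(seed)

    def select(self):
        # With probability epsilon explore a random operator;
        # otherwise exploit the operator with the best average reward.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.operators)
        return max(self.operators, key=lambda op: self.avg_reward[op])

    def update(self, op, reward):
        # Average credit assignment: incremental mean of observed rewards.
        self.counts[op] += 1
        self.avg_reward[op] += (reward - self.avg_reward[op]) / self.counts[op]

# Usage: reward a mutation that produced a compiling, test-improving variant
# (operator names below are hypothetical).
selector = EpsilonGreedyOperatorSelector(
    ["replace_stmt", "delete_stmt", "insert_stmt"], epsilon=0.1, seed=42)
op = selector.select()
selector.update(op, reward=1.0)
```

Because the selector only needs a reward signal per operator application, it slots into an existing repair loop without changes to the search strategy, matching the abstract's claim of easy augmentation.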
Adaptive multiple crossover genetic algorithm to solve workforce scheduling and routing problem
The Workforce Scheduling and Routing Problem refers to the assignment of personnel to visits across various geographical locations. Solving this problem demands tackling numerous scheduling and routing constraints while aiming to minimise the operational cost. One of the main obstacles in designing a genetic algorithm (GA) for this problem is selecting the set of operators that yields the best performance. This paper presents an adaptive multiple crossover genetic algorithm to tackle the combined setting of scheduling and routing problems. A mix of problem-specific and traditional crossovers is evaluated using an online learning process that measures each operator's effectiveness. The best-performing operators are given high application rates, while the worst-performing ones receive low rates. Application rates are dynamically adjusted according to the learning outcomes in a non-stationary environment. Experimental results show that the combined use of all the operators works better than using any one operator in isolation. This study contributes to our understanding of how to make effective use of crossover operators on this highly constrained optimisation problem.
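A common way to realise the rate-adjustment scheme the abstract describes is probability matching with a minimum rate, so weak crossovers are never fully discarded in a non-stationary environment. The function below is a hedged sketch of that general technique; the operator names, floor `p_min`, and smoothing factor `beta` are illustrative, not the paper's values.

```python
def update_application_rates(rates, rewards, p_min=0.1, beta=0.8):
    """Probability-matching update for crossover application rates (sketch).

    Better-performing crossovers receive higher application rates, with a
    floor p_min so poor operators keep a chance to recover if the search
    landscape shifts. Parameter names are assumptions for illustration.
    """
    ops = list(rates)
    k = len(ops)
    total = sum(rewards[op] for op in ops)
    new_rates = {}
    for op in ops:
        # Reward share of this operator (uniform if no reward observed yet).
        share = rewards[op] / total if total > 0 else 1.0 / k
        # Target rate: guaranteed floor plus reward-proportional remainder.
        target = p_min + (1 - k * p_min) * share
        # Exponential smoothing keeps adaptation gradual between updates.
        new_rates[op] = (1 - beta) * rates[op] + beta * target
    return new_rates
```

The targets sum to one by construction, so the smoothed rates remain a valid probability distribution over the crossover pool.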
Intelligent data mining using artificial neural networks and genetic algorithms: techniques and applications
Data Mining (DM) refers to the analysis of observational datasets to find
relationships and to summarize the data in ways that are both understandable
and useful. Many DM techniques exist. Compared with other DM techniques,
Intelligent Systems (ISs) based approaches, which include Artificial Neural
Networks (ANNs), fuzzy set theory, approximate reasoning, and derivative-free
optimization methods such as Genetic Algorithms (GAs), are tolerant of
imprecision, uncertainty, partial truth, and approximation. They provide
flexible information processing capability for handling real-life situations. This
thesis is concerned with the ideas behind the design, implementation, testing
and application of a novel ISs-based DM technique. The unique contribution of
this thesis lies in the implementation of a hybrid IS DM technique (the
Genetic Neural Mathematical Method, GNMM) for solving novel practical
problems, the detailed description of this technique, and the illustration of
several applications solved by it.
GNMM consists of three steps: (1) GA-based input variable selection,
(2) Multi-Layer Perceptron (MLP) modelling, and (3) mathematical programming based
rule extraction. In the first step, GAs are used to evolve an optimal set of MLP
inputs. An adaptive method based on the average fitness of successive
generations is used to adjust the mutation rate, and hence the
exploration/exploitation balance. In addition, GNMM uses the elite group and
appearance percentage to minimize the randomness associated with GAs. In
the second step, MLP modelling serves as the core DM engine in performing
classification/prediction tasks. An Independent Component Analysis (ICA)
based weight initialization algorithm is used to determine optimal weights
before the commencement of training algorithms. The Levenberg-Marquardt
(LM) algorithm is used to achieve a second-order speedup compared to
conventional Back-Propagation (BP) training. In the third step, mathematical
programming based rule extraction is not only used to identify the premises of
multivariate polynomial rules, but also to explore features from the extracted
rules based on data samples associated with each rule. Therefore, the
methodology can provide regression rules and features not only in the
polyhedrons with data instances, but also in the polyhedrons without data
instances.
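The adaptive mutation-rate rule in step (1), driven by the average fitness of successive generations, can be sketched as follows. This is a hedged reconstruction of the general idea; the multiplicative factor and bounds are assumptions, not values from the thesis.

```python
def adapt_mutation_rate(rate, prev_avg_fitness, curr_avg_fitness,
                        factor=1.5, rate_min=0.001, rate_max=0.25):
    """Adjust a GA mutation rate from successive average fitness values.

    Sketch of the adaptive rule described for GNMM's variable-selection GA:
    when average fitness improves, lower the rate to exploit; when it stalls
    or drops, raise the rate to push exploration. Factor and bounds are
    illustrative assumptions.
    """
    if curr_avg_fitness > prev_avg_fitness:
        rate /= factor   # progress: favour exploitation
    else:
        rate *= factor   # stagnation: favour exploration
    # Clamp to sensible limits so the rate never collapses or explodes.
    return min(max(rate, rate_min), rate_max)
```

Called once per generation with the last two average fitness values, this keeps the exploration/exploitation balance responsive without any manual retuning.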
A total of six datasets from environmental and medical disciplines were used
as case study applications. These datasets involve the prediction of
longitudinal dispersion coefficient, classification of electrocorticography
(ECoG)/Electroencephalogram (EEG) data, eye bacteria Multisensor Data
Fusion (MDF), and diabetes classification (denoted by Data I through to Data VI). GNMM was applied to all these six datasets to explore its effectiveness,
but the emphasis is different for different datasets. For example, the emphasis
of Data I and II was to give a detailed illustration of how GNMM works; Data III
and IV aimed to show how to deal with difficult classification problems; the
aim of Data V was to illustrate the averaging effect of GNMM; and finally Data
VI was concerned with the GA parameter selection and benchmarking GNMM
with other IS DM techniques such as Adaptive Neuro-Fuzzy Inference System
(ANFIS), Evolving Fuzzy Neural Network (EFuNN), Fuzzy ARTMAP, and
Cartesian Genetic Programming (CGP). In addition, datasets obtained from
published works (i.e. Data II & III) or public domains (i.e. Data VI) where
previous results were present in the literature were also used to benchmark
GNMM’s effectiveness.
As a closely integrated system, GNMM has the merit that it needs little human
interaction. Given a few predefined parameters, such as the GA's crossover
probability and the shape of the ANNs' activation functions, GNMM is able to
process raw data until human-interpretable rules are extracted. This is
an important feature in practice, as users of a DM system often have little
or no need to fully understand the internal components of such a
system. Through case study applications, it has been shown that the GA-based
variable selection stage is capable of filtering out irrelevant and noisy
variables, improving the accuracy of the model, making the ANN structure less
complex and easier to understand, and reducing the computational complexity
and memory requirements. Furthermore, rule extraction ensures that the MLP
training results are easily understandable and transferable.
Efficient learning methods to tune algorithm parameters
This thesis focuses on the algorithm configuration problem. In particular, three efficient
learning configurators are introduced to tune parameters offline. The first looks into
meta-optimization, where the algorithm is expected to solve similar problem instances within
varying computational budgets. Standard meta-optimization techniques have to be repeated
whenever the available computational budget changes, as the parameters that work well for
small budgets may not be suitable for larger ones. The proposed Flexible Budget method
can, in a single run, identify the best parameter setting for all possible computational
budgets less than a specified maximum, without compromising solution quality; considerable
time is thus saved, as will be shown experimentally. The second concerns Racing algorithms,
which often do not fully utilize the available computational budget to find the best parameter
setting, as they may terminate whenever a single parameter setting remains in the race. The
proposed Racing with reset can overcome this issue and, at the same time, adapt Racing's
hyper-parameter α online. Experiments will show that such adaptation enables the algorithm
to achieve significantly lower failure rates, compared to any fixed α set by the user. The
third extends Racing with reset by allowing it to utilize all the information gathered
previously when it adapts α; it also permits Racing algorithms in general to intelligently
allocate the budget in each iteration, as opposed to allocating it equally. All developed
Racing algorithms are compared to two budget allocators from the Simulation Optimization
literature, OCBA and CBA, and to equal allocation, to demonstrate under which conditions
each performs best in terms of minimizing the probability of incorrect selection.
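The generic Racing idea underlying these contributions can be sketched as a simple elimination loop: all surviving parameter settings are evaluated on successive instances, and any setting whose mean cost falls behind the current best by more than a confidence margin is dropped. This is a hedged simplification; the crude margin below stands in for a proper statistical test, and the thesis's Racing-with-reset and online α adaptation are not reproduced.

```python
import statistics

def race(candidates, evaluate, budget, min_samples=5):
    """Minimal racing loop for offline parameter tuning (sketch).

    evaluate(candidate, step) returns the cost of one run; lower is better.
    min_samples delays elimination until enough evidence is gathered.
    """
    results = {c: [] for c in candidates}
    alive = set(candidates)
    for step in range(budget):
        # Spend one evaluation per surviving candidate on the next instance.
        for c in alive:
            results[c].append(evaluate(c, step))
        if step + 1 < min_samples or len(alive) == 1:
            continue
        means = {c: statistics.mean(results[c]) for c in alive}
        best = min(alive, key=means.get)
        n = step + 1
        for c in list(alive - {best}):
            # Crude ~95% bound; real Racing uses a proper statistical test.
            margin = 2.0 * statistics.pstdev(results[c]) / n ** 0.5
            if means[c] - means[best] > margin:
                alive.discard(c)
    return min(alive, key=lambda c: statistics.mean(results[c]))
```

This plain loop also exposes the weakness the thesis targets: once `alive` shrinks to one candidate, the remaining budget is spent without learning anything new, which is exactly what Racing with reset is designed to avoid.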
Characterising fitness landscapes with fitness-probability cloud and its applications to algorithm configuration
Metaheuristics are approximate optimisation techniques widely applied to solve complex optimisation problems. Despite the large number of metaheuristic algorithms that have been developed, little work has been done to understand on which kinds of problems a proposed algorithm will perform well or poorly, and why. A useful solution to this dilemma is to use fitness landscape analysis to gain an in-depth understanding of which algorithms, or algorithm variants, are best suited for solving which kinds of problem instances, and even to dynamically determine the best algorithm configuration during different stages of a search.
This thesis, for the first time, bridges the gap between fitness landscape analysis and algorithm configuration, i.e., finding the best-suited configuration of a given algorithm for solving a particular problem instance. The studies in this thesis contribute the following:
a. Developing a novel and effective approach to characterise fitness landscapes and measure problem difficulty with respect to algorithms.
b. Incorporating fitness landscape analysis in building a generic (problem-independent) approach, which can perform automatic algorithm configuration on a per-instance basis, and in designing novel and effective algorithm configurations.
c. Incorporating fitness landscape analysis in establishing a generic framework for designing adaptive heuristic algorithms.
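The landscape characterisation named in the title can be illustrated with a small estimator. As a hedged sketch (the thesis's exact estimator may differ), each solution's evolvability is approximated as the fraction of mutated neighbours whose fitness is at least as good, and the resulting (fitness, probability) pairs form the cloud.

```python
import random

def fitness_probability_cloud(solutions, fitness, mutate, samples=50, rng=None):
    """Sample points of a fitness-probability cloud (illustrative sketch).

    For each solution, evolvability is estimated as the fraction of sampled
    neighbours whose fitness is at least as good; higher fitness paired with
    low evolvability suggests a harder landscape region.
    """
    rng = rng or random.Random(0)
    cloud = []
    for s in solutions:
        f = fitness(s)
        improving = sum(
            1 for _ in range(samples) if fitness(mutate(s, rng)) >= f)
        cloud.append((f, improving / samples))
    return cloud

# Usage on OneMax: flip one random bit as the neighbourhood operator.
def flip_one(bits, rng):
    i = rng.randrange(len(bits))
    return bits[:i] + [1 - bits[i]] + bits[i + 1:]
```

On OneMax the cloud behaves as expected: the all-zeros string has evolvability 1 (every flip improves it), while the all-ones optimum has evolvability 0.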