    A Compass to Guide Genetic Algorithms

    Parameter control is a key issue in enhancing the performance of Genetic Algorithms (GAs). Although many studies exist on this problem, it is rarely addressed in a general way. Consequently, in practice, parameters are often adjusted manually. Some generic approaches have been tried that look at the recent improvements provided by the operators. In this paper, we extend this approach by including the operators’ effect on population diversity and computation time. Our controller, named Compass, provides an abstraction of the GA’s parameters that allows the user to directly adjust the balance between exploration and exploitation of the search space. The approach is then evaluated on a classic combinatorial problem (SAT).

    An Experimental Study of Adaptive Control for Evolutionary Algorithms

    The balance of exploration versus exploitation (EvE) is a key issue in evolutionary computation. In this paper we investigate how an adaptive controller designed to perform Operator Selection can be used to dynamically manage the EvE balance required by the search, showing that the search strategies determined by this control paradigm improve the quality of the solutions found by the evolutionary algorithm.

    Reinforcement Learning for Mutation Operator Selection in Automated Program Repair

    Automated program repair techniques aim to aid software developers with the challenging task of fixing bugs. In heuristic-based program repair, a search space of program variants is created by applying mutation operations on the source code to find potential patches for bugs. Most commonly, every selection of a mutation operator during search is performed uniformly at random. The inefficiency of this critical step in the search creates many variants that do not compile or break intended functionality, wasting considerable resources as a result. In this paper, we address this issue and propose a reinforcement learning-based approach to optimise the selection of mutation operators in heuristic-based program repair. Our solution is programming language, granularity-level, and search strategy agnostic and allows for easy integration into existing heuristic-based repair tools. We conduct extensive experimentation on four operator selection techniques, two reward types, two credit assignment strategies, two integration methods, and three sets of mutation operators using 22,300 independent repair attempts. We evaluate our approach on 353 real-world bugs from the Defects4J benchmark. Results show that the epsilon-greedy multi-armed bandit algorithm with average credit assignment is best for mutation operator selection. Our approach exhibits a 17.3% improvement upon the baseline, by generating patches for 9 additional bugs for a total of 61 patched bugs in the Defects4J benchmark.
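
    As a rough illustration of the configuration the abstract above reports as best (an epsilon-greedy multi-armed bandit with average credit assignment), the following Python sketch keeps a running average reward per mutation operator and picks the best one most of the time. The class name, the operator names, the epsilon value, and the 0/1 reward are illustrative assumptions and are not taken from the paper or from any repair tool.

```python
import random


class EpsilonGreedyOperatorSelector:
    """Hypothetical epsilon-greedy bandit over mutation operators."""

    def __init__(self, operators, epsilon=0.2, seed=None):
        self.operators = list(operators)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.total_reward = {op: 0.0 for op in self.operators}
        self.count = {op: 0 for op in self.operators}

    def quality(self, op):
        # Average credit assignment: mean of all rewards seen for this operator.
        # Operators that were never applied are given a quality of 0.0.
        n = self.count[op]
        return self.total_reward[op] / n if n else 0.0

    def select(self):
        # Explore with probability epsilon, otherwise exploit the best average.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.operators)
        return max(self.operators, key=self.quality)

    def update(self, op, reward):
        # Record the reward observed after applying `op`.
        self.total_reward[op] += reward
        self.count[op] += 1


if __name__ == "__main__":
    # Assumed toy setting: reward 1.0 if a variant "compiles", else 0.0,
    # with made-up success probabilities per operator.
    success_prob = {"delete": 0.2, "insert": 0.5, "replace": 0.3}
    selector = EpsilonGreedyOperatorSelector(success_prob, epsilon=0.2, seed=0)
    env = random.Random(1)
    for _ in range(500):
        op = selector.select()
        selector.update(op, 1.0 if env.random() < success_prob[op] else 0.0)
    print({op: round(selector.quality(op), 2) for op in success_prob})
```

    In a real repair loop the reward would come from compiling and testing each variant; the simulated environment here only stands in for that step.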

    Adaptive multiple crossover genetic algorithm to solve workforce scheduling and routing problem

    The Workforce Scheduling and Routing Problem refers to the assignment of personnel to visits across various geographical locations. Solving this problem demands tackling numerous scheduling and routing constraints while aiming to minimise the operational cost. One of the main obstacles in designing a genetic algorithm (GA) for this problem is selecting the set of operators that yields the best performance. This paper presents an adaptive multiple crossover genetic algorithm to tackle the combined setting of scheduling and routing problems. A mix of problem-specific and traditional crossovers is evaluated by using an online learning process to measure each operator's effectiveness. The best-performing operators are given high application rates and the worst-performing ones are given low rates. Application rates are dynamically adjusted according to the learning outcomes in a non-stationary environment. Experimental results show that combining all the operators works better than using any one operator in isolation. This study advances our understanding of how to make effective use of crossover operators on this highly-constrained optimisation problem.
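
    The abstract above does not say which update rule sets the application rates. One common choice, shown here purely as an assumed stand-in, is probability matching with a minimum rate and a recency-weighted quality estimate suited to a non-stationary setting. The function names, the crossover names, and the parameters alpha and p_min are hypothetical.

```python
def update_quality(quality, op, reward, alpha=0.3):
    """Recency-weighted estimate: recent rewards count more than old ones."""
    quality[op] = (1.0 - alpha) * quality[op] + alpha * reward


def application_rates(quality, p_min=0.05):
    """Probability matching: rates proportional to quality, floored at p_min.

    Assumes non-negative qualities and p_min < 1 / number_of_operators.
    """
    k = len(quality)
    total = sum(quality.values())
    if total <= 0.0:
        # No information yet: apply every operator equally often.
        return {op: 1.0 / k for op in quality}
    return {
        op: p_min + (1.0 - k * p_min) * q / total
        for op, q in quality.items()
    }


if __name__ == "__main__":
    # Hypothetical crossover operators, all starting with zero quality.
    quality = {"one_point": 0.0, "uniform": 0.0, "route_exchange": 0.0}
    update_quality(quality, "route_exchange", reward=1.0)
    print(application_rates(quality))
```

    The floor p_min keeps every crossover in play, so an operator that performs poorly early on can still recover if it becomes useful later in the search.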

    Intelligent data mining using artificial neural networks and genetic algorithms: techniques and applications

    Data Mining (DM) refers to the analysis of observational datasets to find relationships and to summarize the data in ways that are both understandable and useful. Many DM techniques exist. Compared with other DM techniques, approaches based on Intelligent Systems (ISs), which include Artificial Neural Networks (ANNs), fuzzy set theory, approximate reasoning, and derivative-free optimization methods such as Genetic Algorithms (GAs), are tolerant of imprecision, uncertainty, partial truth, and approximation. They provide flexible information processing capability for handling real-life situations. This thesis is concerned with the ideas behind the design, implementation, testing and application of a novel IS-based DM technique. The unique contribution of this thesis is in the implementation of a hybrid IS DM technique (Genetic Neural Mathematical Method, GNMM) for solving novel practical problems, the detailed description of this technique, and the illustrations of several applications solved by this novel technique. GNMM consists of three steps: (1) GA-based input variable selection, (2) Multi-Layer Perceptron (MLP) modelling, and (3) mathematical programming based rule extraction. In the first step, GAs are used to evolve an optimal set of MLP inputs. An adaptive method based on the average fitness of successive generations is used to adjust the mutation rate, and hence the exploration/exploitation balance. In addition, GNMM uses the elite group and appearance percentage to minimize the randomness associated with GAs. In the second step, MLP modelling serves as the core DM engine in performing classification/prediction tasks. An Independent Component Analysis (ICA) based weight initialization algorithm is used to determine optimal weights before the commencement of training algorithms. The Levenberg-Marquardt (LM) algorithm is used to achieve a second-order speedup compared to conventional Back-Propagation (BP) training.
In the third step, mathematical programming based rule extraction is not only used to identify the premises of multivariate polynomial rules, but also to explore features from the extracted rules based on data samples associated with each rule. Therefore, the methodology can provide regression rules and features not only in the polyhedrons with data instances, but also in the polyhedrons without data instances. A total of six datasets from environmental and medical disciplines were used as case study applications. These datasets involve the prediction of longitudinal dispersion coefficient, classification of electrocorticography (ECoG)/Electroencephalogram (EEG) data, eye bacteria Multisensor Data Fusion (MDF), and diabetes classification (denoted by Data I through to Data VI). GNMM was applied to all six datasets to explore its effectiveness, but the emphasis differs between datasets. For example, the emphasis of Data I and II was to give a detailed illustration of how GNMM works; Data III and IV aimed to show how to deal with difficult classification problems; the aim of Data V was to illustrate the averaging effect of GNMM; and finally Data VI was concerned with the GA parameter selection and benchmarking GNMM against other IS DM techniques such as Adaptive Neuro-Fuzzy Inference System (ANFIS), Evolving Fuzzy Neural Network (EFuNN), Fuzzy ARTMAP, and Cartesian Genetic Programming (CGP). In addition, datasets obtained from published works (i.e. Data II & III) or public domains (i.e. Data VI), where previous results were present in the literature, were also used to benchmark GNMM’s effectiveness. As a closely integrated system, GNMM has the merit that it needs little human interaction. With some predefined parameters, such as the GA’s crossover probability and the shape of the ANNs’ activation functions, GNMM is able to process raw data until human-interpretable rules are extracted.
This is an important feature in practice, as quite often users of a DM system have little or no need to fully understand the internal components of such a system. Through case study applications, it has been shown that the GA-based variable selection stage is capable of: filtering out irrelevant and noisy variables, improving the accuracy of the model; making the ANN structure less complex and easier to understand; and reducing the computational complexity and memory requirements. Furthermore, rule extraction ensures that the MLP training results are easily understandable and transferable.
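
    The GNMM abstract above mentions adjusting the mutation rate from the average fitness of successive generations but gives no formula. The sketch below is only one plausible reading, under the assumptions that fitness is maximised, that a rising average calls for less mutation (exploitation), and that a stagnating or falling average calls for more (exploration). The function name, the scaling factor, and the bounds are hypothetical and do not reproduce the thesis's actual rule.

```python
def adapt_mutation_rate(rate, prev_avg_fitness, curr_avg_fitness,
                        factor=1.5, lower=0.001, upper=0.5):
    """Assumed rule: shrink the rate while average fitness improves,
    grow it otherwise, and keep it within [lower, upper]."""
    if curr_avg_fitness > prev_avg_fitness:
        new_rate = rate / factor   # progress: favour exploitation
    else:
        new_rate = rate * factor   # stagnation: favour exploration
    return min(upper, max(lower, new_rate))


if __name__ == "__main__":
    rate = 0.1
    # Made-up average fitness values for five successive generations.
    averages = [0.40, 0.45, 0.45, 0.44, 0.50]
    for prev, curr in zip(averages, averages[1:]):
        rate = adapt_mutation_rate(rate, prev, curr)
        print(f"avg fitness {prev:.2f} -> {curr:.2f}, mutation rate {rate:.4f}")
```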

    Efficient learning methods to tune algorithm parameters

    This thesis focuses on the algorithm configuration problem. In particular, three efficient learning configurators are introduced to tune parameters offline. The first looks into meta-optimization, where the algorithm is expected to solve similar problem instances within varying computational budgets. Standard meta-optimization techniques have to be repeated whenever the available computational budget changes, as the parameters that work well for small budgets may not be suitable for larger ones. The proposed Flexible Budget method can, in a single run, identify the best parameter setting for all possible computational budgets less than a specified maximum, without compromising solution quality. Hence, a lot of time is saved. This will be shown experimentally. The second concerns Racing algorithms, which often do not fully utilize the available computational budget to find the best parameter setting, as they may terminate as soon as a single parameter setting remains in the race. The proposed Racing with reset can overcome this issue, and at the same time adapt Racing’s hyper-parameter α online. Experiments will show that such adaptation enables the algorithm to achieve significantly lower failure rates, compared to any fixed α set by the user. The third extends Racing with reset by allowing it to utilize all the information gathered previously when it adapts α; it also permits Racing algorithms in general to allocate the budget intelligently in each iteration, as opposed to allocating it equally. All developed Racing algorithms are compared to two budget allocators from the Simulation Optimization literature, OCBA and CBA, and to equal allocation, to demonstrate under which conditions each performs best in terms of minimizing the probability of incorrect selection.
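
    To make the budget issue described above concrete, here is a minimal sketch of a basic race that stops as soon as one parameter setting survives and reports how much of the budget it used. It swaps in a simple Hoeffding-bound elimination test (without any multiple-comparison correction) rather than the statistical test or the reset mechanism of the thesis; delta plays a role loosely comparable to α. The function name, its parameters, and the toy cost function are assumptions.

```python
import math


def hoeffding_race(candidates, evaluate, budget, delta=0.05, cost_range=1.0):
    """Race candidates on cost (lower is better); return (survivors, evaluations used).

    `evaluate(candidate, round_index)` must return a cost within an interval
    of width `cost_range`. Every surviving candidate is evaluated once per round.
    """
    alive = list(candidates)
    costs = {c: [] for c in alive}
    used = 0
    rounds = 0
    while len(alive) > 1 and used + len(alive) <= budget:
        for c in alive:
            costs[c].append(evaluate(c, rounds))
        used += len(alive)
        rounds += 1
        # Hoeffding half-width of the confidence interval after `rounds` samples.
        eps = cost_range * math.sqrt(math.log(2.0 / delta) / (2.0 * rounds))
        means = {c: sum(costs[c]) / rounds for c in alive}
        best = min(means.values())
        # Drop candidates whose lower bound exceeds the best upper bound.
        alive = [c for c in alive if means[c] - eps <= best + eps]
    return alive, used


if __name__ == "__main__":
    # Toy deterministic costs: setting "A" is clearly better than "B".
    survivors, used = hoeffding_race(
        ["A", "B"], lambda c, i: 0.1 if c == "A" else 0.9, budget=100
    )
    print(survivors, "evaluations used:", used, "of 100")
```

    In the toy run the race ends well before the budget is exhausted, which is the kind of leftover budget that a reset-style scheme could put to use.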

    Characterising fitness landscapes with fitness-probability cloud and its applications to algorithm configuration

    Metaheuristics are approximate optimisation techniques widely applied to solve complex optimisation problems. Despite the large number of metaheuristic algorithms that have been developed, only a limited amount of work has been done to understand on which kinds of problems a proposed algorithm will perform well or poorly, and why. A useful solution to this dilemma is to use fitness landscape analysis to gain an in-depth understanding of which algorithms, or algorithm variants, are best suited for solving which kinds of problem instances, and even to dynamically determine the best algorithm configuration during different stages of a search algorithm. This thesis for the first time bridges the gap between fitness landscape analysis and algorithm configuration, i.e., finding the best suited configuration of a given algorithm for solving a particular problem instance. Studies in this thesis contribute to the following: a. Developing a novel and effective approach to characterise fitness landscapes and measure problem difficulty with respect to algorithms. b. Incorporating fitness landscape analysis in building a generic (problem-independent) approach, which can perform automatic algorithm configuration on a per-instance basis, and in designing novel and effective algorithm configurations. c. Incorporating fitness landscape analysis in establishing a generic framework for designing adaptive heuristic algorithms.