
    An Experimental Study of Adaptive Control for Evolutionary Algorithms

    The balance of exploration versus exploitation (EvE) is a key issue in evolutionary computation. In this paper we investigate how an adaptive controller designed to perform operator selection can be used to dynamically manage the EvE balance required by the search, showing that the search strategies determined by this control paradigm improve the quality of the solutions found by the evolutionary algorithm.
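    The abstract leaves the controller unspecified; one common realization of adaptive operator selection is probability matching with an exploration floor. A minimal Python sketch follows (the constants and reward scale are illustrative assumptions, not the paper's method):

```python
import random

class ProbabilityMatchingAOS:
    """Probability-matching adaptive operator selection (illustrative sketch).

    Each operator keeps an exponentially smoothed quality estimate; selection
    probabilities are proportional to quality, with a floor p_min that keeps
    some exploration alive (the EvE balance). Assumes non-negative rewards.
    """

    def __init__(self, n_ops, p_min=0.05, alpha=0.3):
        self.quality = [1.0] * n_ops   # smoothed reward estimate per operator
        self.p_min = p_min             # exploration floor
        self.alpha = alpha             # adaptation rate

    def select(self):
        n = len(self.quality)
        total = sum(self.quality)
        # Mix quality-proportional probabilities with a uniform floor.
        probs = [self.p_min + (1 - n * self.p_min) * q / total
                 for q in self.quality]
        return random.choices(range(n), weights=probs)[0]

    def update(self, op, reward):
        # Exponential recency-weighted credit assignment.
        self.quality[op] += self.alpha * (reward - self.quality[op])
```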

    Anterior Prefrontal Cortex Contributes to Action Selection through Tracking of Recent Reward Trends

    The functions of prefrontal cortex remain enigmatic, especially for its anterior sectors, putatively ranging from planning to self-initiated behavior, social cognition, task switching, and memory. A predominant current theory regarding the most anterior sector, the frontopolar cortex (FPC), is that it is involved in exploring alternative courses of action, but the detailed causal mechanisms remain unknown. Here we investigated this issue using the lesion method, together with a novel model-based analysis. Eight patients with anterior prefrontal brain lesions, including the FPC, performed a four-armed bandit task known from neuroimaging studies to activate the FPC. Model-based analyses of learning demonstrated a selective deficit in the ability to extrapolate the most recent trend, despite an intact general ability to learn from past rewards. Whereas both brain-damaged and healthy controls used comparisons between the two most recent choice outcomes to infer trends that influenced their decision about the next choice, the group with anterior prefrontal lesions showed a complete absence of this component and instead based their choice entirely on the cumulative reward history. Given that the FPC is thought to be the most evolutionarily recent expansion of primate prefrontal cortex, we suggest that its function may reflect uniquely human adaptations to select and update models of reward contingency in dynamic environments.
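    As a rough illustration of the two decision components contrasted by the model-based analysis (cumulative reward history versus extrapolation of the most recent trend), an arm of the four-armed bandit might be scored as below; the additive form and weighting are assumptions for illustration, not the authors' fitted model:

```python
def arm_score(rewards, w_trend=1.0):
    """Score one bandit arm from its reward history (illustrative sketch).

    Combines cumulative reward with a 'trend' term: the difference between
    the two most recent outcomes on that arm. The lesion group behaved as
    if w_trend were zero, relying on cumulative reward alone.
    """
    cumulative = sum(rewards)
    trend = rewards[-1] - rewards[-2] if len(rewards) >= 2 else 0.0
    return cumulative + w_trend * trend
```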

    Algorithm Portfolios for Noisy Optimization

    Noisy optimization is the optimization of objective functions corrupted by noise. A portfolio of solvers is a set of solvers equipped with an algorithm selection tool for distributing the computational power among them. Portfolios are widely and successfully used in combinatorial optimization. In this work, we study portfolios of noisy optimization solvers. We obtain mathematically proven performance guarantees (in the sense that the portfolio performs nearly as well as the best of its solvers) with an ad hoc portfolio algorithm dedicated to noisy optimization. A somewhat surprising result is that it is better to compare solvers with some lag, i.e., to base the current recommendation of the best solver on performance earlier in the run. An additional finding is a principled method for distributing the computational power among the solvers in the portfolio.
    Comment: in Annals of Mathematics and Artificial Intelligence, Springer Verlag, 201
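    The lag idea can be sketched as follows, assuming a hypothetical ask/tell/recommend solver interface; this is an illustration of the principle, not the paper's proven algorithm:

```python
def run_portfolio(solvers, objective, iterations, lag=50):
    """Round-robin portfolio for noisy optimization (illustrative sketch).

    Each solver receives the same per-iteration evaluation budget. The
    portfolio's recommendation is taken from the solver that looked best
    `lag` iterations ago: with a noisy objective, a lagged comparison is
    more reliable than comparing the freshest values.
    """
    histories = [[] for _ in solvers]
    for _ in range(iterations):
        for i, solver in enumerate(solvers):
            x = solver.ask()                  # next candidate from solver i
            solver.tell(x, objective(x))      # feed back a noisy evaluation
            # Record a noisy score of the solver's current recommendation
            # (costs extra evaluations; acceptable for a sketch).
            histories[i].append(objective(solver.recommend()))
    t = max(0, iterations - 1 - lag)          # lagged comparison index
    best = min(range(len(solvers)), key=lambda i: histories[i][t])
    return solvers[best].recommend()
```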

    Reinforcement Learning for Mutation Operator Selection in Automated Program Repair

    Automated program repair techniques aim to aid software developers with the challenging task of fixing bugs. In heuristic-based program repair, a search space of program variants is created by applying mutation operations on the source code to find potential patches for bugs. Most commonly, every selection of a mutation operator during search is performed uniformly at random. The inefficiency of this critical step in the search creates many variants that do not compile or that break intended functionality, wasting considerable resources as a result. In this paper, we address this issue and propose a reinforcement learning-based approach to optimise the selection of mutation operators in heuristic-based program repair. Our solution is programming language, granularity-level, and search strategy agnostic and allows for easy integration into existing heuristic-based repair tools. We conduct extensive experimentation on four operator selection techniques, two reward types, two credit assignment strategies, two integration methods, and three sets of mutation operators using 22,300 independent repair attempts. We evaluate our approach on 353 real-world bugs from the Defects4J benchmark. Results show that the epsilon-greedy multi-armed bandit algorithm with average credit assignment is best for mutation operator selection. Our approach exhibits a 17.3% improvement over the baseline, generating patches for 9 additional bugs for a total of 61 patched bugs in the Defects4J benchmark.
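    The winning configuration, an epsilon-greedy multi-armed bandit with average credit assignment, is straightforward to sketch; the reward definition and constants below are illustrative assumptions:

```python
import random

class EpsilonGreedyOperatorSelector:
    """Epsilon-greedy bandit over mutation operators (illustrative sketch).

    With probability epsilon an operator is chosen uniformly at random
    (exploration); otherwise the operator with the highest average reward
    so far is chosen (exploitation). 'Average credit assignment' means
    each operator's value is the running mean of all rewards it received.
    """

    def __init__(self, n_operators, epsilon=0.1):
        self.epsilon = epsilon
        self.counts = [0] * n_operators
        self.values = [0.0] * n_operators   # running mean reward per operator

    def select(self):
        if random.random() < self.epsilon:
            return random.randrange(len(self.values))
        return max(range(len(self.values)), key=lambda i: self.values[i])

    def update(self, op, reward):
        # Incremental update of the mean (average credit assignment).
        self.counts[op] += 1
        self.values[op] += (reward - self.values[op]) / self.counts[op]
```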

    Hyperparameter Tuning in Bandit-Based Adaptive Operator Selection

    EvoApplications 2012: EvoCOMNET, EvoCOMPLEX, EvoFIN, EvoGAMES, EvoHOT, EvoIASP, EvoNUM, EvoPAR, EvoRISK, EvoSTIM, and EvoSTOC, Málaga, Spain, April 11-13, 2012, Proceedings.
    We are using bandit-based adaptive operator selection while autotuning parallel computer programs. The autotuning, which uses evolutionary algorithm-based stochastic sampling, takes place over an extended duration and occurs in situ as programs execute. The environment or context during tuning is either largely static in one scenario or dynamic in another. We rely upon adaptive operator selection to dynamically generate worthy test configurations of the program. In this paper, we study how the choice of hyperparameters, which control the trade-off between exploration and exploitation, affects the effectiveness of adaptive operator selection, which in turn affects the performance of the autotuner. We show that while the optimal assignment of hyperparameters varies greatly between benchmarks, there exists, for a given context, a single assignment of hyperparameters that performs well regardless of the program being tuned.
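    In bandit-based adaptive operator selection, the hyperparameter governing the exploration/exploitation trade-off is typically the scaling factor of a UCB-style exploration bonus. A sketch of that rule (standard UCB1 form; not necessarily the paper's exact formula):

```python
import math

def ucb_select(counts, values, C=2.0):
    """UCB1-style operator selection (illustrative sketch).

    `values[i]` is the empirical mean reward of operator i and `counts[i]`
    how often it was tried. The hyperparameter C scales the exploration
    bonus: large C favours exploration, small C favours exploitation.
    """
    for i, c in enumerate(counts):
        if c == 0:
            return i                    # try every operator once first
    total = sum(counts)
    return max(range(len(values)),
               key=lambda i: values[i]
               + C * math.sqrt(math.log(total) / counts[i]))
```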

    A self-learning particle swarm optimizer for global optimization problems

    Copyright @ 2011 IEEE. All Rights Reserved. This article was made available through the Brunel Open Access Publishing Fund.
    Particle swarm optimization (PSO) has been shown to be an effective tool for solving global optimization problems. So far, most PSO algorithms use a single learning pattern for all particles, which means that all particles in a swarm use the same strategy. This monotonic learning pattern may leave a particular particle without the flexibility to deal with different complex situations. This paper presents a novel algorithm, called the self-learning particle swarm optimizer (SLPSO), for global optimization problems. In SLPSO, each particle has a set of four strategies to cope with different situations in the search space. The cooperation of the four strategies is implemented by an adaptive learning framework at the individual level, which enables a particle to choose the optimal strategy according to its own local fitness landscape. An experimental study on a set of 45 test functions and two real-world problems shows that SLPSO has superior performance in comparison with several peer algorithms.
    This work was supported by the Engineering and Physical Sciences Research Council of U.K. under Grants EP/E060722/1 and EP/E060722/2.
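    A sketch of per-particle strategy selection in the spirit of SLPSO; the probability-update rule and constants are illustrative assumptions rather than the paper's exact adaptive learning framework:

```python
import random

class StrategySelector:
    """Per-particle strategy selection (illustrative sketch).

    Each particle keeps selection probabilities over its four learning
    strategies and shifts probability mass toward strategies that have
    recently improved its fitness, with a floor p_min so no strategy is
    abandoned for good.
    """

    def __init__(self, n_strategies=4, p_min=0.05, alpha=0.1):
        self.probs = [1.0 / n_strategies] * n_strategies
        self.p_min = p_min
        self.alpha = alpha

    def choose(self):
        return random.choices(range(len(self.probs)), weights=self.probs)[0]

    def reward(self, strategy, improved):
        # Nudge the chosen strategy's probability toward 1 on success,
        # toward 0 on failure.
        target = 1.0 if improved else 0.0
        self.probs[strategy] += self.alpha * (target - self.probs[strategy])
        # Renormalize with the exploration floor (sum returns to 1).
        n = len(self.probs)
        total = sum(self.probs)
        self.probs = [self.p_min + (1 - n * self.p_min) * p / total
                      for p in self.probs]
```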

    Bandit-based Random Mutation Hill-Climbing

    The Random Mutation Hill-Climbing algorithm is a direct search technique mostly used in discrete domains. It repeatedly selects a random neighbour of the best-so-far solution and accepts that neighbour if it is better than or equal to the current solution. In this work, we propose a novel method for selecting the neighbour solution using a set of independent multi-armed bandit-style selection units, which results in a bandit-based Random Mutation Hill-Climbing algorithm. The new algorithm significantly outperforms Random Mutation Hill-Climbing on both OneMax (in noise-free and noisy cases) and Royal Road problems (in the noise-free case). The algorithm shows particular promise for discrete optimisation problems where each fitness evaluation is expensive.
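    One way to read "bandit-style selection units" on a bit-string problem: a UCB-style bandit over bit positions that is rewarded when its flip is accepted. A sketch under that assumption (not necessarily the paper's exact design):

```python
import math
import random

def bandit_rmhc(fitness, n_bits, budget, C=1.0):
    """Bandit-based Random Mutation Hill-Climbing (illustrative sketch).

    Instead of flipping a uniformly random bit, a UCB-style bandit over
    bit positions biases mutation toward positions whose flips were
    accepted before. The per-position reward scheme is an assumption.
    """
    x = [random.randint(0, 1) for _ in range(n_bits)]
    fx = fitness(x)
    counts = [1] * n_bits        # flip counts, started at 1 to avoid /0
    rewards = [0.0] * n_bits     # accumulated accepted-flip reward
    for t in range(2, budget + 2):
        # UCB score: mean reward plus exploration bonus.
        pos = max(range(n_bits),
                  key=lambda i: rewards[i] / counts[i]
                  + C * math.sqrt(math.log(t) / counts[i]))
        y = x[:]
        y[pos] ^= 1              # flip the chosen bit
        fy = fitness(y)
        counts[pos] += 1
        if fy >= fx:             # accept if better than or equal
            x, fx = y, fy
            rewards[pos] += 1.0
    return x, fx

# Example usage on noise-free OneMax (fitness = number of ones).
best, score = bandit_rmhc(sum, 32, budget=2000)
```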