An Experimental Study of Adaptive Control for Evolutionary Algorithms
The balance of exploration versus exploitation (EvE) is a key issue in
evolutionary computation. In this paper we investigate how an adaptive
controller designed to perform Operator Selection can be used to dynamically
manage the EvE balance required by the search, showing that the search
strategies determined by this control paradigm improve the quality of the
solutions found by the evolutionary algorithm.
Anterior Prefrontal Cortex Contributes to Action Selection through Tracking of Recent Reward Trends
The functions of prefrontal cortex remain enigmatic, especially for its anterior sectors, putatively ranging from planning to self-initiated behavior, social cognition, task switching, and memory. A predominant current theory regarding the most anterior sector, the frontopolar cortex (FPC), is that it is involved in exploring alternative courses of action, but the detailed causal mechanisms remain unknown. Here we investigated this issue using the lesion method, together with a novel model-based analysis. Eight patients with anterior prefrontal brain lesions including the FPC performed a four-armed bandit task known from neuroimaging studies to activate the FPC. Model-based analyses of learning demonstrated a selective deficit in the ability to extrapolate the most recent trend, despite an intact general ability to learn from past rewards. Whereas both brain-damaged and healthy controls used comparisons between the two most recent choice outcomes to infer trends that influenced their decision about the next choice, the group with anterior prefrontal lesions showed a complete absence of this component and instead based their choice entirely on the cumulative reward history. Given that the FPC is thought to be the most evolutionarily recent expansion of primate prefrontal cortex, we suggest that its function may reflect uniquely human adaptations to select and update models of reward contingency in dynamic environments.
Algorithm Portfolios for Noisy Optimization
Noisy optimization is the optimization of objective functions corrupted by
noise. A portfolio of solvers is a set of solvers equipped with an algorithm
selection tool for distributing the computational power among them. Portfolios
are widely and successfully used in combinatorial optimization. In this work,
we study portfolios of noisy optimization solvers. We obtain mathematically
proved performance (in the sense that the portfolio performs nearly as well as
the best of its solvers) by an ad hoc portfolio algorithm dedicated to noisy
optimization. A somewhat surprising result is that it is better to compare
solvers with some lag, i.e., propose the current recommendation of best solver
based on their performance earlier in the run. An additional finding is a
principled method for distributing the computational power among solvers in the
portfolio. Comment: in Annals of Mathematics and Artificial Intelligence, Springer
Verlag, 201
Reinforcement Learning for Mutation Operator Selection in Automated Program Repair
Automated program repair techniques aim to aid software developers with the
challenging task of fixing bugs. In heuristic-based program repair, a search
space of program variants is created by applying mutation operations on the
source code to find potential patches for bugs. Most commonly, every selection
of a mutation operator during search is performed uniformly at random. The
inefficiency of this critical step in the search creates many variants that do
not compile or break intended functionality, wasting considerable resources as
a result. In this paper, we address this issue and propose a reinforcement
learning-based approach to optimise the selection of mutation operators in
heuristic-based program repair. Our solution is agnostic to programming
language, granularity level, and search strategy, and can easily be
integrated into existing heuristic-based repair tools. We conduct extensive
experimentation on four operator selection techniques, two reward types, two
credit assignment strategies, two integration methods, and three sets of
mutation operators using 22,300 independent repair attempts. We evaluate our
approach on 353 real-world bugs from the Defects4J benchmark. Results show that
the epsilon-greedy multi-armed bandit algorithm with average credit assignment
is best for mutation operator selection. Our approach exhibits a 17.3%
improvement upon the baseline, by generating patches for 9 additional bugs for
a total of 61 patched bugs in the Defects4J benchmark.
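The operator-selection scheme named in the abstract above (an epsilon-greedy multi-armed bandit with average credit assignment) can be illustrated with a minimal sketch. This is not the authors' implementation: the class name, the operator names, the default `epsilon`, and the reward convention (for example, 1.0 when a variant compiles and passes tests) are all assumptions made for illustration.

```python
import random


class EpsilonGreedySelector:
    """Hypothetical epsilon-greedy bandit over mutation operators.

    Credit for each operator is the running average of the rewards it
    has received ("average credit assignment").
    """

    def __init__(self, operators, epsilon=0.2, seed=None):
        self.operators = list(operators)
        self.epsilon = epsilon  # assumed exploration rate, not from the paper
        self.rng = random.Random(seed)
        self.counts = {op: 0 for op in self.operators}
        self.avg_reward = {op: 0.0 for op in self.operators}

    def select(self):
        # With probability epsilon pick a uniformly random operator (explore);
        # otherwise pick the operator with the highest average reward (exploit).
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.operators)
        return max(self.operators, key=lambda op: self.avg_reward[op])

    def update(self, op, reward):
        # Incremental mean: new_avg = old_avg + (reward - old_avg) / n.
        self.counts[op] += 1
        self.avg_reward[op] += (reward - self.avg_reward[op]) / self.counts[op]
```

In a repair loop, `select()` would be called before each mutation and `update()` after the resulting variant has been evaluated; setting `epsilon` to 1.0 recovers the uniform random selection that the abstract describes as the common baseline.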
Hyperparameter Tuning in Bandit-Based Adaptive Operator Selection
EvoApplications 2012: EvoCOMNET, EvoCOMPLEX, EvoFIN, EvoGAMES, EvoHOT, EvoIASP, EvoNUM, EvoPAR, EvoRISK, EvoSTIM, and EvoSTOC, Málaga, Spain, April 11-13, 2012, Proceedings. We use bandit-based adaptive operator selection while autotuning parallel computer programs. The autotuning, which uses evolutionary algorithm-based stochastic sampling, takes place over an extended duration and occurs in situ as programs execute. The environment or context during tuning is largely static in one scenario and dynamic in another. We rely upon adaptive operator selection to dynamically generate worthy test configurations of the program. In this paper, we study how the choice of hyperparameters, which control the trade-off between exploration and exploitation, affects the effectiveness of adaptive operator selection, which in turn affects the performance of the autotuner. We show that while the optimal assignment of hyperparameters varies greatly between different benchmarks, for a given context there exists a single assignment of hyperparameters that performs well regardless of the program being tuned.
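The abstract above does not name the bandit rule or its hyperparameters. As a hedged illustration of how a single hyperparameter can shift the exploration/exploitation trade-off, the sketch below uses a UCB1-style score with a scaling constant; the UCB1 form, the function names, and the `exploration` parameter are assumptions, not details taken from the paper.

```python
import math


def ucb_scores(mean_rewards, counts, exploration=1.0):
    """UCB1-style score per operator (assumed rule, for illustration only).

    score = mean + exploration * sqrt(2 * ln(total) / n)
    Operators that were never tried get an infinite score so they are tried first.
    """
    total = sum(counts)
    scores = []
    for mean, n in zip(mean_rewards, counts):
        if n == 0:
            scores.append(math.inf)
        else:
            bonus = exploration * math.sqrt(2.0 * math.log(total) / n)
            scores.append(mean + bonus)
    return scores


def pick_operator(mean_rewards, counts, exploration=1.0):
    # Return the index of the operator with the highest score.
    scores = ucb_scores(mean_rewards, counts, exploration)
    return max(range(len(scores)), key=scores.__getitem__)
```

With `exploration=0.0` the rule is purely greedy on the observed means; larger values favour operators that have been tried less often, which is the kind of trade-off the hyperparameter study examines.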
A self-learning particle swarm optimizer for global optimization problems
Copyright @ 2011 IEEE. All Rights Reserved. This article was made available through the Brunel Open Access Publishing Fund. Particle swarm optimization (PSO) has been shown to be an effective tool for solving global optimization problems. So far, most PSO algorithms use a single learning pattern for all particles, which means that all particles in a swarm use the same strategy. This monotonic learning pattern may leave a particular particle without the intelligence needed to deal with different complex situations. This paper presents a novel algorithm, called self-learning particle swarm optimizer (SLPSO), for global optimization problems. In SLPSO, each particle has a set of four strategies to cope with different situations in the search space. The cooperation of the four strategies is implemented by an adaptive learning framework at the individual level, which enables a particle to choose the optimal strategy according to its own local fitness landscape. The experimental study on a set of 45 test functions and two real-world problems shows that SLPSO has superior performance in comparison with several other peer algorithms. This work was supported by the Engineering and Physical Sciences Research Council of U.K. under Grants EP/E060722/1 and EP/E060722/2.
Bandit-based Random Mutation Hill-Climbing
The Random Mutation Hill-Climbing algorithm is a direct search technique mostly used in discrete domains. It repeatedly selects a random neighbour of the best-so-far solution and accepts the neighbour if it is at least as good as that solution. In this work, we propose a novel method for selecting the neighbour solution using a set of independent multi-armed bandit-style selection units, which results in a bandit-based Random Mutation Hill-Climbing algorithm. The new algorithm significantly outperforms Random Mutation Hill-Climbing in both OneMax (in noise-free and noisy cases) and Royal Road problems (in the noise-free case). The algorithm shows particular promise for discrete optimisation problems where each fitness evaluation is expensive.
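The baseline described in the abstract above, plain Random Mutation Hill-Climbing, can be sketched on noise-free OneMax as follows. The bandit-based neighbour selection proposed in the paper is not reproduced here. The function names, the single-bit-flip neighbourhood, and the evaluation budget are assumptions made for illustration.

```python
import random


def one_max(bits):
    # OneMax fitness: the number of 1s in the bit string.
    return sum(bits)


def rmhc(fitness, n_bits, max_evals, seed=None):
    """Plain Random Mutation Hill-Climbing over bit strings (sketch).

    A neighbour is assumed to differ from the current best in exactly one
    uniformly chosen bit. max_evals counts neighbour evaluations only, not
    the evaluation of the initial solution.
    """
    rng = random.Random(seed)
    best = [rng.randint(0, 1) for _ in range(n_bits)]
    best_fit = fitness(best)
    for _ in range(max_evals):
        neighbour = best[:]
        i = rng.randrange(n_bits)
        neighbour[i] ^= 1  # flip one bit
        fit = fitness(neighbour)
        # Accept the neighbour if it is better than or equal to the best so far.
        if fit >= best_fit:
            best, best_fit = neighbour, fit
    return best, best_fit
```

A call such as `rmhc(one_max, 20, 500, seed=0)` returns the final bit string and its fitness. Accepting equal-fitness neighbours lets the search drift across plateaus, which matters on problems such as Royal Road.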