
    Learning Team Strategies: Soccer Case Studies

    We use simulated soccer to study multiagent learning. Each team's players (agents) share an action set and policy, but may behave differently due to position-dependent inputs. All agents making up a team are rewarded or punished collectively when goals are scored. We conduct simulations with varying team sizes and compare several learning algorithms: TD-Q learning with linear neural networks (TD-Q), Probabilistic Incremental Program Evolution (PIPE), and a PIPE version that learns by coevolution (CO-PIPE). TD-Q is based on learning evaluation functions (EFs) mapping input/action pairs to expected reward. PIPE and CO-PIPE search policy space directly. They use adaptive probability distributions to synthesize programs that calculate action probabilities from current inputs. Our results show that linear TD-Q encounters several difficulties in learning appropriate shared EFs. PIPE and CO-PIPE, however, do not depend on EFs and find good policies faster and more reliably. This suggests that in some multiagent learning scenarios direct search in policy space can offer advantages over EF-based approaches.
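
    As an illustration of the EF-based approach described above, here is a minimal sketch of a shared linear TD-Q evaluation function in Python. The class name, feature encoding and hyperparameters (alpha, gamma) are assumptions made for illustration only, not taken from the paper; all players of a team would share one such object and receive the same collective reward when a goal is scored.

    import numpy as np

    class LinearTDQ:
        """Sketch of a shared linear evaluation function (EF): one weight
        vector per action maps the current input features to an estimate
        of expected reward (assumed structure, not the paper's code)."""

        def __init__(self, n_inputs, n_actions, alpha=0.01, gamma=0.9):
            self.w = np.zeros((n_actions, n_inputs))  # linear EF weights
            self.alpha = alpha  # learning rate
            self.gamma = gamma  # discount factor

        def value(self, x, a):
            # EF(x, a): estimated expected reward of action a for input x
            return self.w[a] @ x

        def best_action(self, x):
            return int(np.argmax(self.w @ x))

        def update(self, x, a, reward, x_next, done):
            # TD-Q update on the shared weights; every agent in the team
            # applies this with the same collective reward signal.
            target = reward if done else reward + self.gamma * np.max(self.w @ x_next)
            td_error = target - self.value(x, a)
            self.w[a] += self.alpha * td_error * x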

    A linear estimation-of-distribution gp system

    We present N-gram GP, an estimation of distribution algorithm for the evolution of linear computer programs. The algorithm learns and samples a joint probability distribution of triplets of instructions (or 3-grams) while simultaneously learning and sampling a program length distribution. We have tested N-gram GP on symbolic regression problems where the target function is a polynomial of up to degree 12, and on lawn-mower problems with lawn sizes of up to 12 × 12. Results show that the algorithm is effective and scales better on these problems than either linear GP or simple stochastic hill-climbing.
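
    To make the description above concrete, the sketch below shows one way the learned model could be represented and sampled: 3-gram instruction counts plus an empirical program-length distribution. The instruction set, symbol names and uniform back-off rule are assumptions for illustration; the fitness evaluation and model-update schedule of the actual algorithm are omitted.

    import random
    from collections import defaultdict

    INSTRUCTIONS = ["ADD", "SUB", "MUL", "DIV", "DUP", "SWAP", "X", "CONST"]  # hypothetical set
    START = "<s>"  # padding symbol for the first two positions

    class NGramModel:
        """Sketch of the distribution N-gram GP maintains: counts of
        instruction triplets plus a program-length distribution."""

        def __init__(self):
            self.trigrams = defaultdict(lambda: defaultdict(int))
            self.lengths = defaultdict(int)

        def learn(self, program):
            # Update counts from one (presumably fit) linear program.
            self.lengths[len(program)] += 1
            padded = [START, START] + list(program)
            for i in range(2, len(padded)):
                self.trigrams[(padded[i - 2], padded[i - 1])][padded[i]] += 1

        def sample(self):
            # Draw a length, then instructions conditioned on the last two.
            lengths, weights = zip(*self.lengths.items())
            target_len = random.choices(lengths, weights)[0]
            program, prev = [], (START, START)
            for _ in range(target_len):
                counts = self.trigrams.get(prev)
                if counts:
                    instr = random.choices(list(counts), list(counts.values()))[0]
                else:
                    instr = random.choice(INSTRUCTIONS)  # back off to uniform
                program.append(instr)
                prev = (prev[1], instr)
            return program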

    Memetic algorithms: The polynomial local search complexity theory perspective

    In previous work (Krasnogor, http://www.cs.nott.ac.uk/~nxk/papers.html. In: Studies on the Theory and Design Space of Memetic Algorithms. Ph.D. thesis, University of the West of England, Bristol, UK, 2002; Krasnogor and Smith, IEEE Trans Evol Algorithms 9(6):474-488, 2005) we developed a syntax-only classification of evolutionary algorithms, in particular so-called memetic algorithms (MAs). When "syntactic sugar" is added to our model, we are able to investigate the polynomial local search (PLS) complexity of memetic algorithms. In this paper we show the PLS-completeness of whole classes of problems that arise when memetic algorithms are applied to the travelling salesman problem using a range of mutation, crossover and local search operators. Our PLS-completeness results shed light on the worst-case behaviour that can be expected of a memetic algorithm under these circumstances. Moreover, we point out that memetic algorithms for graph partitioning and maximum network flow (both with important practical applications) also give rise to PLS-complete problems. © 2007 Springer Science + Business Media B.V.
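
    For readers unfamiliar with the algorithms the results apply to, below is a minimal generic memetic algorithm for the travelling salesman problem combining crossover, mutation and 2-opt local search. The operators and parameters are common textbook choices picked for illustration, not the specific operator classes analysed in the paper; the 2-opt loop runs to a local optimum, which is the kind of object PLS complexity reasons about.

    import random

    def tour_length(tour, dist):
        return sum(dist[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))

    def two_opt(tour, dist):
        # 2-opt local search: reverse segments until no improving move exists,
        # i.e. until a local optimum of the 2-opt neighbourhood is reached.
        improved = True
        while improved:
            improved = False
            for i in range(1, len(tour) - 1):
                for j in range(i + 1, len(tour)):
                    new = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]
                    if tour_length(new, dist) < tour_length(tour, dist):
                        tour, improved = new, True
        return tour

    def order_crossover(p1, p2):
        # OX crossover: copy a slice from p1, fill the rest in p2's order.
        a, b = sorted(random.sample(range(len(p1)), 2))
        child = p1[a:b]
        child += [c for c in p2 if c not in child]
        return child

    def swap_mutation(tour):
        i, j = random.sample(range(len(tour)), 2)
        tour = tour[:]
        tour[i], tour[j] = tour[j], tour[i]
        return tour

    def memetic_tsp(dist, pop_size=20, generations=50):
        n = len(dist)
        pop = [two_opt(random.sample(range(n), n), dist) for _ in range(pop_size)]
        for _ in range(generations):
            p1, p2 = random.sample(pop, 2)
            child = two_opt(swap_mutation(order_crossover(p1, p2)), dist)
            worst = max(range(pop_size), key=lambda k: tour_length(pop[k], dist))
            if tour_length(child, dist) < tour_length(pop[worst], dist):
                pop[worst] = child  # steady-state replacement of the worst tour
        return min(pop, key=lambda t: tour_length(t, dist))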