
    EvoTanks: co-evolutionary development of game-playing agents

    This paper describes the EvoTanks research project, an ongoing attempt to develop strong AI players for a primitive 'Combat'-style video game using evolutionary computation with artificial neural networks. This is a small but challenging task, because an agent's actions must depend heavily on opponent behaviour. Previous work has shown that the agents can evolve high-performance behaviours against scripted opponents; however, these behaviours are specific to the opponent they were trained against. This paper presents results from applying co-evolution within a single population. The results show that the agents no longer become trapped in local maxima of the search space, and are capable of converging on high-fitness behaviours within their population without the use of scripted opponents.
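    As a rough illustration of the co-evolutionary setup the abstract describes, the sketch below evolves a population whose fitness comes only from round-robin games against peers. Everything here is assumed for illustration: the paper evolves neural-network controllers, whereas this sketch uses plain weight vectors and a hypothetical play_game(a, b) callback returning 1, 0, or -1 for a win, draw, or loss.

        import random

        def coevolve(play_game, pop_size=20, generations=100,
                     genome_len=64, mutation_rate=0.1):
            # Population of real-valued genomes (stand-ins for the paper's
            # neural-network weights).
            pop = [[random.gauss(0, 1) for _ in range(genome_len)]
                   for _ in range(pop_size)]
            for _ in range(generations):
                # Round-robin fitness: agents are scored only against each
                # other, with no scripted opponents involved.
                fitness = [0.0] * pop_size
                for i in range(pop_size):
                    for j in range(i + 1, pop_size):
                        result = play_game(pop[i], pop[j])  # 1, 0, or -1
                        fitness[i] += result
                        fitness[j] -= result
                # Truncation selection: keep the fitter half, refill with
                # mutated copies of the survivors.
                ranked = sorted(range(pop_size), key=lambda k: fitness[k],
                                reverse=True)
                survivors = [pop[k] for k in ranked[:pop_size // 2]]
                pop = survivors + [[g + random.gauss(0, mutation_rate)
                                    for g in parent] for parent in survivors]
            return pop

    Because every fitness score is relative to the current population, the evaluation landscape shifts each generation, which is the property the paper credits with avoiding the local maxima seen when training against fixed scripts.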

    GA-Gammon: A Backgammon Player Program Based on Evolutionary Algorithms


    Investigating learning rates for evolution and temporal difference learning

    Evidently, any learning algorithm can only learn from the information given to it. This paper presents a first attempt to place an upper bound on the information rates attainable with standard co-evolution and with temporal difference learning (TDL). The upper bound for TDL is shown to be much higher than that for co-evolution: under commonly used settings for learning to play Othello, for example, TDL's upper bound may be hundreds or even thousands of times higher. To test how well these bounds correlate with actual learning rates, a simple two-player game called Treasure Hunt is developed. While the upper bounds cannot be used to predict the number of games required to learn the optimal policy, they do correctly predict the rank order of the number of games required by each algorithm. © 2008 IEEE
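    The gap between the two bounds can be sketched with back-of-envelope arithmetic: co-evolution extracts at most the game outcome from each game played, while TDL receives a training signal at every move. All numbers below are illustrative assumptions, not figures from the paper.

        import math

        moves_per_game = 60          # assumed typical Othello game length
        outcome_bits = math.log2(3)  # win/draw/loss: ~1.58 bits per game
        bits_per_td_target = 2.0     # assumed resolution of each TD target

        coevolution_rate = outcome_bits                  # bits per game
        tdl_rate = moves_per_game * bits_per_td_target   # bits per game

        print(f"co-evolution: {coevolution_rate:.2f} bits/game")
        print(f"TDL:          {tdl_rate:.1f} bits/game")
        print(f"ratio:        {tdl_rate / coevolution_rate:.0f}x")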

    Temporal difference learning with interpolated table value functions

    This paper introduces a novel function approximation architecture especially well suited to temporal difference learning. The architecture is based on sets of interpolated table look-up functions. These offer rapid and stable learning, and are efficient when the number of inputs is small. An empirical investigation tests their performance on a supervised learning task, and on the mountain car problem, a standard reinforcement learning benchmark. In each case, the interpolated table functions offer competitive performance. © 2009 IEEE
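    A minimal sketch of what such an architecture might look like for a two-input task such as mountain car, assuming bilinear interpolation over a regular grid (the paper's exact scheme may differ): the value of a state is a weighted blend of the four surrounding table entries, and a TD-style update spreads the error back over those same entries.

        import numpy as np

        class InterpolatedTable:
            def __init__(self, lows, highs, bins=10):
                self.lows = np.asarray(lows, dtype=float)
                self.highs = np.asarray(highs, dtype=float)
                self.bins = bins
                self.table = np.zeros((bins, bins))

            def _corners(self, state):
                # Map the state to fractional grid coordinates in [0, bins-1).
                x = (np.asarray(state, dtype=float) - self.lows) \
                    / (self.highs - self.lows)
                x = np.clip(x * (self.bins - 1), 0.0, self.bins - 1 - 1e-9)
                i, j = int(x[0]), int(x[1])
                fx, fy = x[0] - i, x[1] - j
                # Bilinear weights over the four corners of the enclosing cell.
                return [(i, j, (1 - fx) * (1 - fy)), (i + 1, j, fx * (1 - fy)),
                        (i, j + 1, (1 - fx) * fy), (i + 1, j + 1, fx * fy)]

            def value(self, state):
                return sum(w * self.table[i, j]
                           for i, j, w in self._corners(state))

            def td_update(self, state, target, alpha=0.1):
                # Spread the TD error over the four corner entries, in
                # proportion to the weights used to read the value out.
                error = target - self.value(state)
                for i, j, w in self._corners(state):
                    self.table[i, j] += alpha * w * error

    For mountain car this might be instantiated as InterpolatedTable(lows=[-1.2, -0.07], highs=[0.6, 0.07]) over position and velocity (assumed ranges); because each update touches only four entries, learning stays fast and local, consistent with the stability the abstract claims.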

    Abalearn: a risk-sensitive approach to self-play learning in Abalone

    This paper presents Abalearn, a self-teaching Abalone program capable of automatically reaching an intermediate level of play without needing expert-labelled training examples, deep searches, or exposure to competent play. Our approach is based on a reinforcement learning algorithm that is risk-seeking, since defensive players in Abalone tend never to end a game. We show that it is this risk-sensitivity that enables successful self-play training. We also propose a set of features that seem relevant for achieving a good level of play. We evaluate our approach by using a fixed heuristic opponent as a benchmark, pitting our agents against human players online, and comparing samples of our agents at different stages of training.
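    The abstract does not give the update rule, but a risk-seeking TD update can be sketched in the style of Mihatsch and Neuneier's risk-sensitive reinforcement learning: the TD error is weighted asymmetrically so that positive surprises count for more, nudging the learner away from endlessly defensive play. The function name, kappa value, and tabular representation below are all assumptions for illustration, not Abalearn's actual rule.

        def risk_sensitive_td_update(values, state, next_state, reward,
                                     alpha=0.1, gamma=0.95, kappa=-0.5):
            # TD(0) error for a tabular value function held in a dict.
            delta = (reward + gamma * values.get(next_state, 0.0)
                     - values.get(state, 0.0))
            # Asymmetric weighting: with kappa < 0, positive surprises are
            # amplified and negative ones dampened (risk-seeking);
            # kappa > 0 would give a risk-averse learner instead.
            weight = (1 - kappa) if delta > 0 else (1 + kappa)
            values[state] = values.get(state, 0.0) + alpha * weight * delta
            return delta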

    Autonomous virulence adaptation improves coevolutionary optimization


    Learning to Play Othello with N-Tuple Systems

    This paper investigates the use of n-tuple systems as position value functions for the game of Othello. The architecture is described, and then evaluated for use with temporal difference learning. Performance is compared with previously developed weighted piece counters and multi-layer perceptrons. The n-tuple system is able to defeat the best-performing of these after just five hundred games of self-play learning. The conclusion is that n-tuple networks learn faster and better than the other, more conventional approaches.
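    A minimal sketch of an n-tuple value function of the kind the abstract describes, with assumed parameters: each tuple samples a fixed set of board squares, the contents of those squares (0 empty, 1 black, 2 white) form a base-3 index into that tuple's look-up table, and the position value is the sum of the indexed weights. A TD update then touches only one table entry per tuple, which is part of why such systems train quickly.

        import random

        class NTupleValue:
            def __init__(self, n_tuples=12, n=6, board_squares=64, seed=0):
                rng = random.Random(seed)
                # Each tuple is a fixed random selection of n board squares
                # (the paper may use structured patterns such as rows or
                # rectangles instead).
                self.tuples = [rng.sample(range(board_squares), n)
                               for _ in range(n_tuples)]
                # One weight table per tuple, indexed by the base-3 pattern.
                self.luts = [[0.0] * (3 ** n) for _ in range(n_tuples)]

            def _index(self, board, squares):
                idx = 0
                for sq in squares:
                    idx = idx * 3 + board[sq]  # board[sq] in {0, 1, 2}
                return idx

            def value(self, board):
                return sum(lut[self._index(board, squares)]
                           for lut, squares in zip(self.luts, self.tuples))

            def td_update(self, board, target, alpha=0.01):
                # The gradient w.r.t. each active table entry is 1, so the
                # TD error is applied directly to one entry per tuple.
                error = target - self.value(board)
                for lut, squares in zip(self.luts, self.tuples):
                    lut[self._index(board, squares)] += alpha * error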