Search CORE

8 research outputs found

Investigating evolutionary checkers by incorporating individual and social learning, N-tuple systems and a round robin tournament

Author: Al-Khateeb Belal
Publication date: 13/07/2011

In recent years, much research attention has been paid to evolving self-learning game players. Fogel's Blondie24 is just one demonstration of a real success in this field, and it has inspired many other scientists. In this thesis, artificial neural networks are employed to evolve game-playing strategies for the game of checkers by introducing a league structure into the learning phase of a system based on Blondie24. We believe that this helps eliminate some of the randomness in the evolution. The best player obtained is tested against an evolutionary checkers program based on Blondie24. The results obtained are promising. In addition, we introduce an individual and social learning mechanism into the learning phase of the evolutionary checkers system. The best player obtained is tested against an implementation of an evolutionary checkers program, and also against a player that utilises a round robin tournament. The results are promising. N-tuple systems are also investigated and are used as position value functions for the game of checkers. The n-tuple architecture utilises temporal difference learning. The best player obtained is compared with an implementation of an evolutionary checkers program based on Blondie24, and also against a Blondie24-inspired player that utilises a round robin tournament. The results are promising. We also address the question of whether piece difference and the look-ahead depth are important factors in the Blondie24 architecture. Our experiments show that piece difference and the look-ahead depth have a significant effect on learning abilities.

Nottingham eTheses

Nottingham ePrints

Temporal difference learning with interpolated table value functions

Author: Lucas Simon M
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 14/10/2009

This paper introduces a novel function approximation architecture especially well suited to temporal difference learning. The architecture is based on using sets of interpolated table look-up functions. These offer rapid and stable learning, and are efficient when the number of inputs is small. An empirical investigation is conducted to test their performance on a supervised learning task, and on the mountain car problem, a standard reinforcement learning benchmark. In each case, the interpolated table functions offer competitive performance. © 2009 IEEE

University of Essex Research Repository

Crossref

Investigating learning rates for evolution and temporal difference learning

Author: Lucas Simon M
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/12/2008

Evidently, any learning algorithm can only learn on the basis of the information given to it. This paper presents a first attempt to place an upper bound on the information rates attainable with standard coevolution and with temporal difference learning (TDL). The upper bound for TDL is shown to be much higher than for coevolution. Under commonly used settings for learning to play Othello, for example, TDL may have an upper bound that is hundreds or even thousands of times higher than that of coevolution. To test how well these bounds correlate with actual learning rates, a simple two-player game called Treasure Hunt is developed. While the upper bounds cannot be used to predict the number of games required to learn the optimal policy, they do correctly predict the rank order of the number of games required by each algorithm. © 2008 IEEE

University of Essex Research Repository

Crossref

Elicitation of strategies in four variants of a round-robin tournament: the case of Goofspiel

Author: Dror Moshe, Kendall G., Rapoport Amnon
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/09/2016

Goofspiel is a simple two-person zero-sum game for which there exist no known equilibrium strategies. To gain insight into what constitutes a winning strategy, we conducted a round-robin tournament in which participants were asked to provide computerized programs for playing the game with or without carryover. Each of these two variants was to be played under two quite different objective functions, namely, maximization of the cumulative number of points won across all opponents (as in Axelrod's tournament), and maximization of the probability of winning any given round. Our results show that there are, indeed, inherent differences in the results with respect to the complexity of the game and its objective function, and that winning strategies exhibit a level of sophistication, depth, and balance that are not captured by present models of adaptive learning.

Nottingham ePrints

Nottingham eTheses

Crossref

Repository@Nottingham

PSO-based coevolutionary Game Learning

Author: Franken Cornelis J.
Publication venue: 'University of Pretoria - Department of Philosophy'
Publication date: 07/12/2004

Games have been investigated as computationally complex problems since the inception of artificial intelligence in the 1950s. Originally, search-based techniques were applied to create a competent (and sometimes even expert) game player. The search-based techniques, such as game trees, made use of human-defined knowledge to evaluate the current game state and recommend the best move to make next. Recent research has shown that neural networks can be evolved as game state evaluators, thereby removing the human intelligence factor completely. This study builds on the initial research that made use of evolutionary programming to evolve neural networks in the game learning domain. Particle Swarm Optimisation (PSO) is applied inside a coevolutionary training environment to evolve the weights of the neural network. The training technique is applied to both the zero-sum and non-zero-sum game domains, with specific application to Tic-Tac-Toe, Checkers and the Iterated Prisoner's Dilemma (IPD). The influence of the various PSO parameters on playing performance is experimentally examined, and the overall performance of three different neighbourhood information sharing structures is compared. A new coevolutionary scoring scheme and particle dispersement operator are defined, inspired by Formula One Grand Prix racing. Finally, the PSO is applied in three novel ways to evolve strategies for the IPD, the first application of its kind in the PSO field. The PSO-based coevolutionary learning technique described and examined in this study shows promise in evolving intelligent evaluators for the aforementioned games, and further study will be conducted to analyse its scalability to larger search spaces and games of varying complexity. Dissertation (MSc)--University of Pretoria, 2005. Computer Science. Unrestricted.

UPSpace at the University of Pretoria

A learning framework for zero-knowledge game playing agents

Author: Duminy Willem Harklaas
Publication venue: 'University of Pretoria - Department of Philosophy'
Publication date: 17/10/2007

The subjects of perfect information games, machine learning and computational intelligence combine in an experiment that investigates a method to build the skill of a game-playing agent from zero game knowledge. The skill of a playing agent is determined by two aspects: the first is the quantity and quality of the knowledge it uses, and the second is its search capacity. This thesis introduces a novel representation language that combines symbols and numeric elements to capture game knowledge. Insofar as search is concerned, an extension to an existing knowledge-based search method is developed. Empirical tests show an improvement over alpha-beta, especially in learning conditions where the knowledge may be weak. Current machine learning techniques as applied to game agents are reviewed. From these techniques a learning framework is established. The data-mining algorithm, ID3, and the computational intelligence technique, Particle Swarm Optimisation (PSO), form the key learning components of this framework. The classification trees produced by ID3 are subjected to new post-pruning processes specifically defined for the mentioned representation language. Different combinations of these pruning processes are tested and a dominant combination is chosen for use in the learning framework. As an extension to PSO, tournaments are introduced as a relative fitness function. A variety of alternative tournament methods are described and some experiments are conducted to evaluate these. The final design decisions are incorporated into the learning framework configuration, and learning experiments are conducted on Checkers and some variations of Checkers. These experiments show that learning has occurred, but also highlight the need for further development and experimentation. Some ideas in this regard conclude the thesis. Dissertation (MSc)--University of Pretoria, 2007. Computer Science. MSc. Unrestricted.

UPSpace at the University of Pretoria