314 research outputs found
Learning to Play Othello with N-Tuple Systems
This paper investigates the use of n-tuple systems as position value functions for the game of Othello. The architecture is described, and then evaluated for use with temporal difference learning. Performance is compared with previously developed weighted piece counters and multi-layer perceptrons. The n-tuple system is able to defeat the best performing of these after just five hundred games of self-play learning. The conclusion is that n-tuple networks learn faster and better than the other, more conventional approaches.
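The n-tuple architecture the abstract describes can be sketched in a few lines: each tuple samples a fixed set of board squares, the sampled pattern indexes a weight table, and the position value is the sum of the indexed weights. The tuple shapes, learning rate, and training target below are illustrative stand-ins, not details from the paper.

```python
import random

# A minimal sketch of an n-tuple value function for an 8x8 board,
# assuming each square holds 0 (empty), 1 (own piece), or 2 (opponent).
class NTupleNetwork:
    def __init__(self, tuples):
        self.tuples = tuples                          # list of index lists
        # one weight table per tuple: 3^len(t) entries, initialised to zero
        self.tables = [[0.0] * (3 ** len(t)) for t in tuples]

    def _index(self, board, t):
        # encode the sampled squares as a base-3 number
        idx = 0
        for sq in t:
            idx = idx * 3 + board[sq]
        return idx

    def value(self, board):
        return sum(tab[self._index(board, t)]
                   for t, tab in zip(self.tuples, self.tables))

    def td_update(self, board, target, alpha=0.01):
        # move every active weight a step toward the TD target
        err = target - self.value(board)
        for t, tab in zip(self.tuples, self.tables):
            tab[self._index(board, t)] += alpha * err

random.seed(0)
net = NTupleNetwork([[0, 1, 2], [0, 8, 16]])          # two illustrative 3-tuples
board = [random.randrange(3) for _ in range(64)]
for _ in range(200):
    net.td_update(board, target=1.0)
print(round(net.value(board), 3))                     # approaches 1.0
```

Because each lookup touches only one table entry per tuple, evaluation and update are both O(number of tuples), which is what makes the architecture fast to train.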
Temporal difference learning with interpolated table value functions
This paper introduces a novel function approximation architecture especially well suited to temporal difference learning. The architecture is based on using sets of interpolated table look-up functions. These offer rapid and stable learning, and are efficient when the number of inputs is small. An empirical investigation is conducted to test their performance on a supervised learning task, and on the mountain car problem, a standard reinforcement learning benchmark. In each case, the interpolated table functions offer competitive performance. © 2009 IEEE
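The core idea can be illustrated with a 1-D version: values are stored at evenly spaced grid points and linearly interpolated between them, and a supervised update distributes the error over the two active grid points in proportion to their interpolation weights. The grid size, range, and target function below are illustrative, not taken from the paper.

```python
import math

# A sketch of a 1-D interpolated table function approximator.
class InterpolatedTable:
    def __init__(self, lo, hi, n):
        self.lo, self.hi, self.n = lo, hi, n
        self.w = [0.0] * n                        # values at the grid points
        self.step = (hi - lo) / (n - 1)

    def _locate(self, x):
        # clamp into range, then find the surrounding pair of grid points
        x = min(max(x, self.lo), self.hi)
        i = min(int((x - self.lo) / self.step), self.n - 2)
        frac = (x - self.lo) / self.step - i
        return i, frac

    def value(self, x):
        i, f = self._locate(x)
        return (1 - f) * self.w[i] + f * self.w[i + 1]

    def update(self, x, target, alpha=0.5):
        # split the error over the two active grid points
        i, f = self._locate(x)
        err = target - self.value(x)
        self.w[i] += alpha * (1 - f) * err
        self.w[i + 1] += alpha * f * err

tab = InterpolatedTable(0.0, math.pi, 32)
for _ in range(500):                              # supervised sweeps over sin(x)
    for k in range(100):
        x = k * math.pi / 99
        tab.update(x, math.sin(x))
print(round(tab.value(math.pi / 2), 3))           # close to 1.0
```

Since only two weights are touched per query, both lookup and update are constant time, which is the source of the rapid, stable learning the abstract claims.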
Investigating learning rates for evolution and temporal difference learning
Evidently, any learning algorithm can only learn on the basis of the information given to it. This paper presents a first attempt to place an upper bound on the information rates attainable with standard coevolution and with temporal difference learning (TDL). The upper bound for TDL is shown to be much higher than for coevolution. Under commonly used settings for learning to play Othello, for example, TDL may have an upper bound that is hundreds or even thousands of times higher than that of coevolution. To test how well these bounds correlate with actual learning rates, a simple two-player game called Treasure Hunt is developed. While the upper bounds cannot be used to predict the number of games required to learn the optimal policy, they do correctly predict the rank order of the number of games required by each algorithm. © 2008 IEEE
Generating pulse width modulation (PWM) for a three-phase inverter using a digital signal processor (DSP)
Recently, inverters have been widely used in industrial applications. However, a pulse width modulation (PWM) technique is required to control the output voltage and frequency of the inverter. In this thesis, unipolar sinusoidal pulse width modulation (SPWM) for a three-phase inverter is proposed using a digital signal processor (DSP). A simulation model is developed in MATLAB Simulink to specify the unipolar SPWM program. The program is then implemented on a TMS320f28335 digital signal processor (DSP). The results show that the output voltage of the three-phase inverter can be controlled.
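The unipolar SPWM scheme at the heart of this thesis can be sketched numerically for one phase leg pair: the sine reference and its negation are each compared against the same triangular carrier, giving a three-level switched output whose local average tracks the reference. The carrier frequency and modulation index below are illustrative values, not taken from the thesis or its TMS320f28335 implementation.

```python
import math

def triangle(t, f_c):
    # triangular carrier in [-1, 1] with frequency f_c
    phase = (t * f_c) % 1.0
    return 4 * phase - 1 if phase < 0.5 else 3 - 4 * phase

def spwm_output(t, f_ref=50.0, f_c=5000.0, m=0.8):
    # unipolar SPWM: compare +ref and -ref against the same carrier
    ref = m * math.sin(2 * math.pi * f_ref * t)
    a = 1 if ref > triangle(t, f_c) else 0        # leg A switch state
    b = 1 if -ref > triangle(t, f_c) else 0       # leg B switch state
    return a - b                                  # three-level output: -1, 0, +1

# average the switched output over the first quarter of the fundamental;
# it should match the average of the reference, m * 2/pi, over that span
samples, period = 100000, 1.0 / 50.0
avg = sum(spwm_output(k * period / 4 / samples) for k in range(samples)) / samples
print(round(avg, 3))          # close to 2 * 0.8 / pi, about 0.509
```

The three-level output is the defining feature of the unipolar scheme: the effective switching ripple sits at twice the carrier frequency, easing output filtering.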
Neuroevolution in Games: State of the Art and Open Challenges
This paper surveys research on applying neuroevolution (NE) to games. In
neuroevolution, artificial neural networks are trained through evolutionary
algorithms, taking inspiration from the way biological brains evolved. We
analyse the application of NE in games along five different axes, which are the
role NE is chosen to play in a game, the different types of neural networks
used, the way these networks are evolved, how the fitness is determined and
what type of input the network receives. The article also highlights important
open research challenges in the field.
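The basic loop common to the surveyed NE systems, evaluating candidate weight vectors and keeping variations that score better, can be shown with a toy example: a fixed-topology 2-2-1 tanh network whose weights are evolved by a (1+1) hill climber with random restarts, learning XOR. The topology, mutation scale, and restart counts are illustrative choices; the systems surveyed (e.g. NEAT) typically evolve network structure as well.

```python
import math, random

CASES = [((0, 0), -1), ((0, 1), 1), ((1, 0), 1), ((1, 1), -1)]  # XOR, targets +-1

def forward(w, x):
    # w packs 9 weights: two hidden units (2 inputs + bias each),
    # then the output unit (2 hidden inputs + bias)
    h0 = math.tanh(w[0] * x[0] + w[1] * x[1] + w[2])
    h1 = math.tanh(w[3] * x[0] + w[4] * x[1] + w[5])
    return math.tanh(w[6] * h0 + w[7] * h1 + w[8])

def fitness(w):
    # negated squared error over the four XOR cases (0 is perfect)
    return -sum((forward(w, x) - y) ** 2 for x, y in CASES)

def evolve(steps=3000, sigma=0.3):
    w = [random.uniform(-1, 1) for _ in range(9)]
    f = fitness(w)
    for _ in range(steps):
        child = [wi + random.gauss(0, sigma) for wi in w]
        fc = fitness(child)
        if fc > f:                    # keep the child only if it improves
            w, f = child, fc
    return f, w

random.seed(1)
best_f, best_w = max(evolve() for _ in range(8))
print(round(best_f, 2))               # near 0 once XOR has been learned
```

Note that fitness here only needs the network's behaviour, not gradients, which is why NE applies even when the game gives sparse or delayed feedback.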
Reinforcement learning and its application to Othello
In this article we describe reinforcement learning, a machine learning technique for
solving sequential decision problems. We describe how reinforcement learning can
be combined with function approximation to get approximate solutions for problems
with very large state spaces.
One such problem is the board game Othello, with a state space size of approximately 10^28. We apply reinforcement learning to this problem via a computer
program that learns a strategy (or policy) for Othello by playing against itself. The
reinforcement learning policy is evaluated against two standard strategies taken
from the literature with favorable results.
We contrast reinforcement learning with standard methods for solving sequential
decision problems and give some examples of applications of reinforcement learning
in operations research and management science from the literature.
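The self-play scheme the article applies to Othello can be shown on a much smaller sequential game. The sketch below runs tabular TD(0) self-play on Nim (take 1 or 2 stones; taking the last stone wins), where V[s] estimates the chance that the player about to move from a pile of s stones wins; the pile size, learning rate, and exploration rate are illustrative choices, not values from the article.

```python
import random

random.seed(0)
N, ALPHA, EPS = 12, 0.1, 0.1
V = [0.5] * (N + 1)                          # optimistic-neutral initial values

def choose(s):
    moves = [m for m in (1, 2) if m <= s]
    if random.random() < EPS:
        return random.choice(moves)          # occasional exploration
    # greedy: leave the opponent in the worst position (taking the
    # last stone, s - m == 0, is an immediate win, so its key is 0)
    return min(moves, key=lambda m: V[s - m] if s - m > 0 else 0.0)

for _ in range(20000):                       # self-play episodes
    s = N
    while s > 0:
        m = choose(s)
        if s - m == 0:                       # current mover wins
            V[s] += ALPHA * (1.0 - V[s])
            break
        # zero-sum game: our value is one minus the opponent's at s - m
        V[s] += ALPHA * ((1.0 - V[s - m]) - V[s])
        s -= m

print([round(v, 2) for v in V[1:4]])         # piles 1 and 2 win; pile 3 loses
```

For Othello the table is far too large, which is exactly why the article combines this update with function approximation instead of a per-state table.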
Evolutionary Algorithms for Reinforcement Learning
There are two distinct approaches to solving reinforcement learning problems,
namely, searching in value function space and searching in policy space.
Temporal difference methods and evolutionary algorithms are well-known examples
of these approaches. Kaelbling, Littman and Moore recently provided an
informative survey of temporal difference methods. This article focuses on the
application of evolutionary algorithms to the reinforcement learning problem,
emphasizing alternative policy representations, credit assignment methods, and
problem-specific genetic operators. Strengths and weaknesses of the
evolutionary approach to reinforcement learning are presented, along with a
survey of representative applications.
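The contrast the article draws, searching policy space directly rather than learning a value function, can be shown with a toy example: the genome is a lookup table of actions (0 = left, 1 = right) for a small corridor environment, fitness is how far along the corridor the policy gets, and a flip-an-action mutation serves as a problem-specific operator. The environment and operators are illustrative, not from the article.

```python
import random

N = 10                                      # corridor states 0..9, start at 0

def furthest(policy, max_steps=15):
    # fitness: the furthest state reached when following the policy
    s = reach = 0
    for _ in range(max_steps):
        s = max(0, s - 1) if policy[s] == 0 else min(N - 1, s + 1)
        reach = max(reach, s)
    return reach

def mutate(policy, rate=0.1):
    # problem-specific operator: flip each action with small probability
    return [1 - a if random.random() < rate else a for a in policy]

random.seed(0)
pop = [[random.randrange(2) for _ in range(N)] for _ in range(30)]
for _ in range(100):
    pop.sort(key=furthest, reverse=True)    # evaluate and rank
    elites = pop[:10]                       # truncation selection
    pop = elites + [mutate(random.choice(elites)) for _ in range(20)]

best = max(pop, key=furthest)
print(furthest(best))                       # 9 once the policy is all-right
```

Credit assignment here is at the level of whole policies (episodic return), with no per-step bootstrapping, which is the defining difference from temporal difference methods.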
Approximating n-player behavioural strategy Nash equilibria using coevolution
Coevolutionary algorithms are plagued with a set of problems related to intransitivity that make it questionable what the end product of a coevolutionary run can achieve. With the introduction of solution concepts into coevolution, part of the issue was alleviated; however, efficiently representing and achieving game-theoretic solution concepts is still not a trivial task. In this paper we propose a coevolutionary algorithm that approximates behavioural strategy Nash equilibria in n-player zero-sum games by exploiting the minimax solution concept. In order to support our case we provide a set of experiments in games of both known and unknown equilibria. In the case of known equilibria, we can confirm our algorithm converges to the known solution, while in the case of unknown equilibria we can see a steady progress towards Nash. Copyright 2011 ACM
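The equilibrium target the paper aims for can be illustrated with fictitious play, a much simpler iterative scheme (not the paper's algorithm) whose empirical frequencies are also guaranteed to converge to a Nash equilibrium in two-player zero-sum games: each player repeatedly best-responds to the opponent's empirical action mixture. Rock-paper-scissors is used below; its unique equilibrium is the uniform mixture (1/3, 1/3, 1/3).

```python
# PAYOFF[a][b]: row player's payoff; 0 = rock, 1 = paper, 2 = scissors
PAYOFF = [[0, -1, 1],
          [1, 0, -1],
          [-1, 1, 0]]

counts = [[1, 0, 0], [0, 1, 0]]   # each player starts on an arbitrary pure action

def best_response(ev):
    return max(range(3), key=lambda a: ev[a])

for _ in range(200000):
    # row best-responds to the column player's empirical mixture
    ev_row = [sum(PAYOFF[a][b] * counts[1][b] for b in range(3)) for a in range(3)]
    counts[0][best_response(ev_row)] += 1
    # column best-responds; the column payoff is the negation of the row payoff
    ev_col = [-sum(PAYOFF[a][b] * counts[0][a] for a in range(3)) for b in range(3)]
    counts[1][best_response(ev_col)] += 1

freqs = [c / sum(counts[0]) for c in counts[0]]
print([round(f, 2) for f in freqs])        # drifts toward [0.33, 0.33, 0.33]
```

The day-to-day play cycles through the pure actions; it is only the long-run frequencies that converge, which mirrors the intransitivity problems in coevolution that the paper sets out to tame.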
A Multi-Agent Approach to the Game of Go Using Genetic Algorithms
This is the published version. Copyright De Gruyter. Many researchers have written or attempted to write programs that play the ancient Chinese board game called Go. Although some programs play the game quite well compared with beginners, few play extremely well, and none of the best programs rely on soft computing artificial intelligence techniques like genetic algorithms or neural networks. This paper explores the advantages and possibilities of using genetic algorithms to evolve a multi-agent Go player. We show that although individual agents may play poorly, collectively the agents working together play the game significantly better.
- …