314 research outputs found
Learning to Play Othello with N-Tuple Systems
This paper investigates the use of n-tuple systems as position value functions for the game of Othello. The architecture is described, and then evaluated for use with temporal difference learning. Performance is compared with previously developed weighted piece counters and multi-layer perceptrons. The n-tuple system is able to defeat the best performing of these after just five hundred games of self-play learning. The conclusion is that n-tuple networks learn faster and better than the other, more conventional approaches.
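The n-tuple architecture the abstract describes can be sketched in a few lines: each tuple samples a fixed set of board squares, the sampled pattern indexes a weight table, and the position value is the sum of the indexed weights. The tuple shapes, learning rate, and training target below are illustrative stand-ins, not details from the paper.

```python
import random

# A minimal sketch of an n-tuple value function for an 8x8 board,
# assuming each square holds 0 (empty), 1 (own piece), or 2 (opponent).
class NTupleNetwork:
    def __init__(self, tuples):
        self.tuples = tuples                          # list of index lists
        # one weight table per tuple: 3^len(t) entries, initialised to zero
        self.tables = [[0.0] * (3 ** len(t)) for t in tuples]

    def _index(self, board, t):
        # encode the sampled squares as a base-3 number
        idx = 0
        for sq in t:
            idx = idx * 3 + board[sq]
        return idx

    def value(self, board):
        return sum(tab[self._index(board, t)]
                   for t, tab in zip(self.tuples, self.tables))

    def td_update(self, board, target, alpha=0.01):
        # move every active weight a step toward the TD target
        err = target - self.value(board)
        for t, tab in zip(self.tuples, self.tables):
            tab[self._index(board, t)] += alpha * err

random.seed(0)
net = NTupleNetwork([[0, 1, 2], [0, 8, 16]])          # two illustrative 3-tuples
board = [random.randrange(3) for _ in range(64)]
for _ in range(200):
    net.td_update(board, target=1.0)
print(round(net.value(board), 3))                     # approaches 1.0
```

Because each lookup touches only one table entry per tuple, evaluation and update are both O(number of tuples), which is what makes the architecture fast to train.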
Temporal difference learning with interpolated table value functions
This paper introduces a novel function approximation architecture especially well suited to temporal difference learning. The architecture is based on using sets of interpolated table look-up functions. These offer rapid and stable learning, and are efficient when the number of inputs is small. An empirical investigation is conducted to test their performance on a supervised learning task, and on the mountain car problem, a standard reinforcement learning benchmark. In each case, the interpolated table functions offer competitive performance. © 2009 IEEE
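The core idea can be illustrated with a 1-D version: values are stored at evenly spaced grid points and linearly interpolated between them, and a supervised update distributes the error over the two active grid points in proportion to their interpolation weights. The grid size, range, and target function below are illustrative, not taken from the paper.

```python
import math

# A sketch of a 1-D interpolated table function approximator.
class InterpolatedTable:
    def __init__(self, lo, hi, n):
        self.lo, self.hi, self.n = lo, hi, n
        self.w = [0.0] * n                        # values at the grid points
        self.step = (hi - lo) / (n - 1)

    def _locate(self, x):
        # clamp into range, then find the surrounding pair of grid points
        x = min(max(x, self.lo), self.hi)
        i = min(int((x - self.lo) / self.step), self.n - 2)
        frac = (x - self.lo) / self.step - i
        return i, frac

    def value(self, x):
        i, f = self._locate(x)
        return (1 - f) * self.w[i] + f * self.w[i + 1]

    def update(self, x, target, alpha=0.5):
        # split the error over the two active grid points
        i, f = self._locate(x)
        err = target - self.value(x)
        self.w[i] += alpha * (1 - f) * err
        self.w[i + 1] += alpha * f * err

tab = InterpolatedTable(0.0, math.pi, 32)
for _ in range(500):                              # supervised sweeps over sin(x)
    for k in range(100):
        x = k * math.pi / 99
        tab.update(x, math.sin(x))
print(round(tab.value(math.pi / 2), 3))           # close to 1.0
```

Since only two weights are touched per query, both lookup and update are constant time, which is the source of the rapid, stable learning the abstract claims.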
Investigating learning rates for evolution and temporal difference learning
Evidently, any learning algorithm can only learn on the basis of the information given to it. This paper presents a first attempt to place an upper bound on the information rates attainable with standard coevolution and with temporal difference learning (TDL). The upper bound for TDL is shown to be much higher than for coevolution. Under commonly used settings for learning to play Othello, for example, TDL may have an upper bound that is hundreds or even thousands of times higher than that of coevolution. To test how well these bounds correlate with actual learning rates, a simple two-player game called Treasure Hunt is developed. While the upper bounds cannot be used to predict the number of games required to learn the optimal policy, they do correctly predict the rank order of the number of games required by each algorithm. © 2008 IEEE
Generating pulse width modulation (PWM) for a three-phase inverter using a digital signal processor (DSP)
Recently, inverters have been widely used in industrial applications. However, a pulse width modulation (PWM) technique is required to control the output voltage and frequency of the inverter. In this thesis, unipolar sinusoidal pulse width modulation (SPWM) for a three-phase inverter is proposed using a digital signal processor (DSP). A simulation model is developed in MATLAB Simulink to specify the unipolar SPWM program. The program is then implemented on a TMS320f28335 digital signal processor (DSP). The results show that the output voltage of the three-phase inverter can be controlled.
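The unipolar SPWM scheme at the heart of this thesis can be sketched numerically for one phase leg pair: the sine reference and its negation are each compared against the same triangular carrier, giving a three-level switched output whose local average tracks the reference. The carrier frequency and modulation index below are illustrative values, not taken from the thesis or its TMS320f28335 implementation.

```python
import math

def triangle(t, f_c):
    # triangular carrier in [-1, 1] with frequency f_c
    phase = (t * f_c) % 1.0
    return 4 * phase - 1 if phase < 0.5 else 3 - 4 * phase

def spwm_output(t, f_ref=50.0, f_c=5000.0, m=0.8):
    # unipolar SPWM: compare +ref and -ref against the same carrier
    ref = m * math.sin(2 * math.pi * f_ref * t)
    a = 1 if ref > triangle(t, f_c) else 0        # leg A switch state
    b = 1 if -ref > triangle(t, f_c) else 0       # leg B switch state
    return a - b                                  # three-level output: -1, 0, +1

# average the switched output over the first quarter of the fundamental;
# it should match the average of the reference, m * 2/pi, over that span
samples, period = 100000, 1.0 / 50.0
avg = sum(spwm_output(k * period / 4 / samples) for k in range(samples)) / samples
print(round(avg, 3))          # close to 2 * 0.8 / pi, about 0.509
```

The three-level output is the defining feature of the unipolar scheme: the effective switching ripple sits at twice the carrier frequency, easing output filtering.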
Neuroevolution in Games: State of the Art and Open Challenges
This paper surveys research on applying neuroevolution (NE) to games. In
neuroevolution, artificial neural networks are trained through evolutionary
algorithms, taking inspiration from the way biological brains evolved. We
analyse the application of NE in games along five different axes, which are the
role NE is chosen to play in a game, the different types of neural networks
used, the way these networks are evolved, how the fitness is determined and
what type of input the network receives. The article also highlights important
open research challenges in the field.
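The basic loop common to the surveyed NE systems, evaluating candidate weight vectors and keeping variations that score better, can be shown with a toy example: a fixed-topology 2-2-1 tanh network whose weights are evolved by a (1+1) hill climber with random restarts, learning XOR. The topology, mutation scale, and restart counts are illustrative choices; the systems surveyed (e.g. NEAT) typically evolve network structure as well.

```python
import math, random

CASES = [((0, 0), -1), ((0, 1), 1), ((1, 0), 1), ((1, 1), -1)]  # XOR, targets +-1

def forward(w, x):
    # w packs 9 weights: two hidden units (2 inputs + bias each),
    # then the output unit (2 hidden inputs + bias)
    h0 = math.tanh(w[0] * x[0] + w[1] * x[1] + w[2])
    h1 = math.tanh(w[3] * x[0] + w[4] * x[1] + w[5])
    return math.tanh(w[6] * h0 + w[7] * h1 + w[8])

def fitness(w):
    # negated squared error over the four XOR cases (0 is perfect)
    return -sum((forward(w, x) - y) ** 2 for x, y in CASES)

def evolve(steps=3000, sigma=0.3):
    w = [random.uniform(-1, 1) for _ in range(9)]
    f = fitness(w)
    for _ in range(steps):
        child = [wi + random.gauss(0, sigma) for wi in w]
        fc = fitness(child)
        if fc > f:                    # keep the child only if it improves
            w, f = child, fc
    return f, w

random.seed(1)
best_f, best_w = max(evolve() for _ in range(8))
print(round(best_f, 2))               # near 0 once XOR has been learned
```

Note that fitness here only needs the network's behaviour, not gradients, which is why NE applies even when the game gives sparse or delayed feedback.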
Reinforcement learning and its application to Othello
In this article we describe reinforcement learning, a machine learning technique for
solving sequential decision problems. We describe how reinforcement learning can
be combined with function approximation to get approximate solutions for problems
with very large state spaces.
One such problem is the board game Othello, with a state space size of approximately 10^28. We apply reinforcement learning to this problem via a computer
program that learns a strategy (or policy) for Othello by playing against itself. The
reinforcement learning policy is evaluated against two standard strategies taken
from the literature with favorable results.
We contrast reinforcement learning with standard methods for solving sequential
decision problems and give some examples of applications of reinforcement learning
in operations research and management science from the literature.
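The self-play scheme the article applies to Othello can be shown on a much smaller sequential game. The sketch below runs tabular TD(0) self-play on Nim (take 1 or 2 stones; taking the last stone wins), where V[s] estimates the chance that the player about to move from a pile of s stones wins; the pile size, learning rate, and exploration rate are illustrative choices, not values from the article.

```python
import random

random.seed(0)
N, ALPHA, EPS = 12, 0.1, 0.1
V = [0.5] * (N + 1)                          # optimistic-neutral initial values

def choose(s):
    moves = [m for m in (1, 2) if m <= s]
    if random.random() < EPS:
        return random.choice(moves)          # occasional exploration
    # greedy: leave the opponent in the worst position (taking the
    # last stone, s - m == 0, is an immediate win, so its key is 0)
    return min(moves, key=lambda m: V[s - m] if s - m > 0 else 0.0)

for _ in range(20000):                       # self-play episodes
    s = N
    while s > 0:
        m = choose(s)
        if s - m == 0:                       # current mover wins
            V[s] += ALPHA * (1.0 - V[s])
            break
        # zero-sum game: our value is one minus the opponent's at s - m
        V[s] += ALPHA * ((1.0 - V[s - m]) - V[s])
        s -= m

print([round(v, 2) for v in V[1:4]])         # piles 1 and 2 win; pile 3 loses
```

For Othello the table is far too large, which is exactly why the article combines this update with function approximation instead of a per-state table.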
Evolutionary Algorithms for Reinforcement Learning
There are two distinct approaches to solving reinforcement learning problems,
namely, searching in value function space and searching in policy space.
Temporal difference methods and evolutionary algorithms are well-known examples
of these approaches. Kaelbling, Littman and Moore recently provided an
informative survey of temporal difference methods. This article focuses on the
application of evolutionary algorithms to the reinforcement learning problem,
emphasizing alternative policy representations, credit assignment methods, and
problem-specific genetic operators. Strengths and weaknesses of the
evolutionary approach to reinforcement learning are presented, along with a
survey of representative applications.
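The contrast the article draws, searching policy space directly rather than learning a value function, can be shown with a toy example: the genome is a lookup table of actions (0 = left, 1 = right) for a small corridor environment, fitness is how far along the corridor the policy gets, and a flip-an-action mutation serves as a problem-specific operator. The environment and operators are illustrative, not from the article.

```python
import random

N = 10                                      # corridor states 0..9, start at 0

def furthest(policy, max_steps=15):
    # fitness: the furthest state reached when following the policy
    s = reach = 0
    for _ in range(max_steps):
        s = max(0, s - 1) if policy[s] == 0 else min(N - 1, s + 1)
        reach = max(reach, s)
    return reach

def mutate(policy, rate=0.1):
    # problem-specific operator: flip each action with small probability
    return [1 - a if random.random() < rate else a for a in policy]

random.seed(0)
pop = [[random.randrange(2) for _ in range(N)] for _ in range(30)]
for _ in range(100):
    pop.sort(key=furthest, reverse=True)    # evaluate and rank
    elites = pop[:10]                       # truncation selection
    pop = elites + [mutate(random.choice(elites)) for _ in range(20)]

best = max(pop, key=furthest)
print(furthest(best))                       # 9 once the policy is all-right
```

Credit assignment here is at the level of whole policies (episodic return), with no per-step bootstrapping, which is the defining difference from temporal difference methods.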
Approximating n-player behavioural strategy Nash equilibria using coevolution
Coevolutionary algorithms are plagued with a set of problems related to intransitivity that make it questionable what the end product of a coevolutionary run can achieve. With the introduction of solution concepts into coevolution, part of the issue was alleviated; however, efficiently representing and achieving game-theoretic solution concepts is still not a trivial task. In this paper we propose a coevolutionary algorithm that approximates behavioural strategy Nash equilibria in n-player zero-sum games by exploiting the minimax solution concept. In order to support our case we provide a set of experiments in games of both known and unknown equilibria. In the case of known equilibria, we can confirm our algorithm converges to the known solution, while in the case of unknown equilibria we can see a steady progress towards Nash. Copyright 2011 ACM
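The equilibrium target the paper aims for can be illustrated with fictitious play, a much simpler iterative scheme (not the paper's algorithm) whose empirical frequencies are also guaranteed to converge to a Nash equilibrium in two-player zero-sum games: each player repeatedly best-responds to the opponent's empirical action mixture. Rock-paper-scissors is used below; its unique equilibrium is the uniform mixture (1/3, 1/3, 1/3).

```python
# PAYOFF[a][b]: row player's payoff; 0 = rock, 1 = paper, 2 = scissors
PAYOFF = [[0, -1, 1],
          [1, 0, -1],
          [-1, 1, 0]]

counts = [[1, 0, 0], [0, 1, 0]]   # each player starts on an arbitrary pure action

def best_response(ev):
    return max(range(3), key=lambda a: ev[a])

for _ in range(200000):
    # row best-responds to the column player's empirical mixture
    ev_row = [sum(PAYOFF[a][b] * counts[1][b] for b in range(3)) for a in range(3)]
    counts[0][best_response(ev_row)] += 1
    # column best-responds; the column payoff is the negation of the row payoff
    ev_col = [-sum(PAYOFF[a][b] * counts[0][a] for a in range(3)) for b in range(3)]
    counts[1][best_response(ev_col)] += 1

freqs = [c / sum(counts[0]) for c in counts[0]]
print([round(f, 2) for f in freqs])        # drifts toward [0.33, 0.33, 0.33]
```

The day-to-day play cycles through the pure actions; it is only the long-run frequencies that converge, which mirrors the intransitivity problems in coevolution that the paper sets out to tame.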
A Multi-Agent Approach to the Game of Go Using Genetic Algorithms
This is the published version. Copyright De Gruyter. Many researchers have written or attempted to write programs that play the ancient Chinese board game called Go. Although some programs play the game quite well compared with beginners, few play extremely well, and none of the best programs rely on soft computing artificial intelligence techniques like genetic algorithms or neural networks. This paper explores the advantages and possibilities of using genetic algorithms to evolve a multi-agent Go player. We show that although individual agents may play poorly, collectively the agents working together play the game significantly better.
- …