151 research outputs found
Evolutionary Algorithms for Reinforcement Learning
There are two distinct approaches to solving reinforcement learning problems,
namely, searching in value function space and searching in policy space.
Temporal difference methods and evolutionary algorithms are well-known examples
of these approaches. Kaelbling, Littman and Moore recently provided an
informative survey of temporal difference methods. This article focuses on the
application of evolutionary algorithms to the reinforcement learning problem,
emphasizing alternative policy representations, credit assignment methods, and
problem-specific genetic operators. Strengths and weaknesses of the
evolutionary approach to reinforcement learning are presented, along with a
survey of representative applications
Neuroevolution in Games: State of the Art and Open Challenges
This paper surveys research on applying neuroevolution (NE) to games. In
neuroevolution, artificial neural networks are trained through evolutionary
algorithms, taking inspiration from the way biological brains evolved. We
analyse the application of NE in games along five different axes, which are the
role NE is chosen to play in a game, the different types of neural networks
used, the way these networks are evolved, how the fitness is determined and
what type of input the network receives. The article also highlights important
open research challenges in the field.Comment: - Added more references - Corrected typos - Added an overview table
(Table 1
An intelligent Othello player combining machine learning and game specific heuristics
Artificial intelligence applications in board games have been around as early as the 1950\u27s, and computer programs have been developed for games including Checkers, Chess, and Go with varying results. Although general game-tree search algorithms have been designed to work on games meeting certain requirements (e.g. zero-sum, two-player, perfect or imperfect information, etc.), the best results, however, come from combining these with specific knowledge of game strategies. In this MS thesis, we present an intelligent Othello game player that combines game-specific heuristics with machine learning techniques in move selection. Five game specific heuristics, namely corner detection, killer move detection, blocking, blacklisting, and pattern recognition have been proposed. Some of these heuristics can be generalized to fit other games by removing the Othello specific components and replacing them with specific knowledge of the target game. For machine learning techniques, the normal Minimax algorithm along with a custom variation is used as a base. Genetic algorithms and neural networks are applied to learn the static evaluation function. The five game specific techniques (or a subset of) are to be executed first and if no move is found, Minimax game tree search is performed. All techniques and several subsets of them have been tested against three deterministic agents, one non-deterministic agent, and three human players of varying skill levels. The results show that the combined Othello player performs better in general. We present the study results on the basis of four main metrics: performance (percentage of games won), speed, predictability of opponent, and usage situation
Online Adaptable Learning Rates for the Game Connect-4
Abstract-Learning board games by self-play has a long tradition in computational intelligence for games. Based on Tesauro's seminal success with TD-Gammon in 1994, many successful agents use temporal difference learning today. But in order to be successful with temporal difference learning on game tasks, often a careful selection of features and a large number of training games is necessary. Even for board games of moderate complexity like Connect-4, we found in previous work that a very rich initial feature set and several millions of game plays are required. In this work we investigate different approaches of online-adaptable learning rates like Incremental Delta Bar Delta (IDBD) or Temporal Coherence Learning (TCL) whether they have the potential to speed up learning for such a complex task. We propose a new variant of TCL with geometric step size changes. We compare those algorithms with several other state-of-the-art learning rate adaptation algorithms and perform a case study on the sensitivity with respect to their meta parameters. We show that in this set of learning algorithms those with geometric step size changes outperform those other algorithms with constant step size changes. Algorithms with nonlinear output functions are slightly better than linear ones. Algorithms with geometric step size changes learn faster by a factor of 4 as compared to previously published results on the task Connect-4
Do Artificial Reinforcement-Learning Agents Matter Morally?
Artificial reinforcement learning (RL) is a widely used technique in
artificial intelligence that provides a general method for training agents to
perform a wide variety of behaviours. RL as used in computer science has
striking parallels to reward and punishment learning in animal and human
brains. I argue that present-day artificial RL agents have a very small but
nonzero degree of ethical importance. This is particularly plausible for views
according to which sentience comes in degrees based on the abilities and
complexities of minds, but even binary views on consciousness should assign
nonzero probability to RL programs having morally relevant experiences. While
RL programs are not a top ethical priority today, they may become more
significant in the coming decades as RL is increasingly applied to industry,
robotics, video games, and other areas. I encourage scientists, philosophers,
and citizens to begin a conversation about our ethical duties to reduce the
harm that we inflict on powerless, voiceless RL agents.Comment: 37 page
USING COEVOLUTION IN COMPLEX DOMAINS
Genetic Algorithms is a computational model inspired by Darwin's theory of evolution. It has a broad range of applications from function optimization to solving robotic control problems. Coevolution is an extension of Genetic Algorithms in which more than one population is evolved at the same time. Coevolution can be done in two ways: cooperatively, in which populations jointly try to solve an evolutionary problem, or competitively. Coevolution has been shown to be useful in solving many problems, yet its application in complex domains still needs to be demonstrated.Robotic soccer is a complex domain that has a dynamic and noisy environment. Many Reinforcement Learning techniques have been applied to the robotic soccer domain, since it is a great test bed for many machine learning methods. However, the success of Reinforcement Learning methods has been limited due to the huge state space of the domain. Evolutionary Algorithms have also been used to tackle this domain; nevertheless, their application has been limited to a small subset of the domain, and no attempt has been shown to be successful in acting on solving the whole problem.This thesis will try to answer the question of whether coevolution can be applied successfully to complex domains. Three techniques are introduced to tackle the robotic soccer problem. First, an incremental learning algorithm is used to achieve a desirable performance of some soccer tasks. Second, a hierarchical coevolution paradigm is introduced to allow coevolution to scale up in solving the problem. Third, an orchestration mechanism is utilized to manage the learning processes
Recommended from our members
Bayesian opponent modeling in adversarial game environments.
This thesis investigates the use of Bayesian analysis upon an opponent¿s behaviour in order to determine the desired goals or strategy used by a given adversary. A terrain analysis approach utilising the A* algorithm is investigated, where a probability distribution between discrete behaviours of an opponent relative to a set of possible goals is generated. The Bayesian analysis of agent behaviour accurately determines the intended goal of an opponent agent, even when the opponent¿s actions are altered randomly. The environment of Poker is introduced and abstracted for ease of analysis. Bayes¿ theorem is used to generate an effective opponent model, categorizing behaviour according to its similarity with known styles of opponent. The accuracy of Bayes¿ rule yields a notable improvement in the performance of an agent once an opponent¿s style is understood. A hybrid of the Bayesian style predictor and a neuroevolutionary approach is shown to lead to effective dynamic play, in comparison to agents that do not use an opponent model. The use of recurrence in evolved networks is also shown to improve the performance and generalizability of an agent in a multiplayer environment. These strategies are then employed in the full-scale environment of Texas Hold¿em, where a betting round-based approach proves useful in determining and counteracting an opponent¿s play. It is shown that the use of opponent models, with the adaptive benefits of neuroevolution aid the performance of an agent, even when the behaviour of an opponent does not necessarily fit within the strict definitions of opponent ¿style¿.Engineering and Physical Sciences Research Council (EPSRC
- …