
    Investigating learning rates for evolution and temporal difference learning

    Evidently, any learning algorithm can only learn on the basis of the information given to it. This paper presents a first attempt to place an upper bound on the information rates attainable with standard coevolution and with temporal difference learning (TDL). The upper bound for TDL is shown to be much higher than for coevolution. Under commonly used settings for learning to play Othello, for example, TDL may have an upper bound hundreds or even thousands of times higher than that of coevolution. To test how well these bounds correlate with actual learning rates, a simple two-player game called Treasure Hunt is developed. While the upper bounds cannot be used to predict the number of games required to learn the optimal policy, they do correctly predict the rank order of the number of games required by each algorithm. © 2008 IEEE
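    To make the scale of that gap concrete, here is a back-of-envelope sketch, not taken from the paper, of how one might compare feedback information per game: coevolution observes only the game result, while TDL receives a value-update signal after every move. The move count and the number of distinguishable value levels below are illustrative assumptions.

```python
# Illustrative back-of-envelope comparison of feedback information per game.
# All figures (moves per game, distinguishable value levels) are assumptions,
# not the paper's derivation.
import math

def coevolution_bits_per_game(outcomes=3):
    """Coevolution only observes the game result: win / draw / loss."""
    return math.log2(outcomes)

def tdl_bits_per_game(moves_per_game=60, value_levels=16):
    """TDL receives a value-update signal after every move; assume each signal
    can distinguish `value_levels` values (a deliberately rough assumption)."""
    return moves_per_game * math.log2(value_levels)

if __name__ == "__main__":
    co = coevolution_bits_per_game()
    td = tdl_bits_per_game()
    print(f"coevolution: ~{co:.1f} bits/game")
    print(f"TDL:         ~{td:.1f} bits/game")
    print(f"ratio:       ~{td / co:.0f}x")  # on the order of hundreds
```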

    Temporal difference learning with interpolated table value functions

    This paper introduces a novel function approximation architecture especially well suited to temporal difference learning. The architecture is based on sets of interpolated table look-up functions. These offer rapid and stable learning, and are efficient when the number of inputs is small. An empirical investigation tests their performance on a supervised learning task and on the mountain car problem, a standard reinforcement learning benchmark. In each case, the interpolated table functions offer competitive performance. © 2009 IEEE
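    As an illustration of this kind of architecture, the following is a minimal sketch of a one-dimensional interpolated table value function with a TD-style update. The grid resolution, state bounds, and learning rate are illustrative assumptions rather than the paper's settings.

```python
# Minimal sketch of an interpolated table value function with a TD(0)-style
# update. Parameters below are illustrative assumptions.
import numpy as np

class InterpolatedTable:
    def __init__(self, low, high, n_cells=16):
        self.low, self.high = low, high
        self.table = np.zeros(n_cells)   # values stored at grid nodes
        self.n = n_cells

    def _locate(self, x):
        # Map x to a fractional index in [0, n-1].
        t = (x - self.low) / (self.high - self.low) * (self.n - 1)
        t = min(max(t, 0.0), self.n - 1 - 1e-9)
        i = int(t)
        return i, t - i                  # left node index and interpolation weight

    def value(self, x):
        i, w = self._locate(x)
        return (1 - w) * self.table[i] + w * self.table[i + 1]

    def td_update(self, x, target, alpha=0.1):
        # Distribute the TD error to the two surrounding nodes,
        # weighted by their interpolation coefficients.
        i, w = self._locate(x)
        error = target - self.value(x)
        self.table[i] += alpha * (1 - w) * error
        self.table[i + 1] += alpha * w * error

# Example: learn V(x) = sin(x) from noisy targets (a supervised-style test).
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    v = InterpolatedTable(low=0.0, high=np.pi)
    for _ in range(5000):
        x = rng.uniform(0.0, np.pi)
        v.td_update(x, np.sin(x) + rng.normal(scale=0.05))
    print(round(v.value(np.pi / 2), 2))  # should come out close to 1.0
```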

    The Layered Learning Method and its Application to Generation of Evaluation Functions for the Game of Checkers

    Abstract. In this paper we describe and analyze a Computational Intelligence (CI)-based approach to creating evaluation functions for two-player mind games (i.e. classical turn-based board games that require mental skills, such as chess, checkers, Go, Othello, etc.). The method allows gradual, step-by-step training, starting with end-game positions and gradually moving towards the root of the game tree. In each phase a new training set is generated based on the results of previous training stages, and any supervised learning method can be used for the actual development of the evaluation function. We validate the usefulness of the approach by employing it to develop heuristics for the game of checkers. Since in previous experiments we applied it to training evaluation functions encoded as linear combinations of game state statistics, this time we concentrate on the development of artificial neural network (ANN)-based heuristics. Games provide cheap, reproducible environments suitable for testing new search algorithms, pattern-based evaluation methods, or learning concepts; interest in the field dates back to the seminal papers devoted to programming chess [1-3] and checkers. Most examples of application of CI methods to mind game playing make use of reinforcement learning methods, neural-network-based approaches, evolutionary methods, or hybrid neuro-genetic solutions. The main focus of this paper is on testing the efficacy of what we call Layered Learning, a generally applicable approach to building the evaluation function for two-player games (here, checkers) which can be implemented either in the evolutionary mode or as gradient backpropagation-type neural network training.
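    The phase-by-phase scheme can be sketched roughly as follows. The helper names (sample_positions_at_depth, exact_endgame_value, search_and_evaluate, fit_supervised) are hypothetical placeholders for the game-specific components the method leaves open, and the toy "positions" are just numbers; in the paper these would be checkers positions and an ANN or linear evaluation function.

```python
# Hedged sketch of layered (phase-by-phase) training with toy stand-ins for
# the game-specific pieces. Helper names are placeholders, not the authors' API.
import random

def sample_positions_at_depth(depth, n):
    # Placeholder: in checkers this would be positions a given number of plies
    # from the end of the game. Here a "position" is just a number in [-1, 1].
    return [random.uniform(-1, 1) for _ in range(n)]

def exact_endgame_value(pos):
    # Placeholder for an exact win/draw/loss label near the end of the game.
    return 1.0 if pos > 0 else -1.0

def search_and_evaluate(pos, model):
    # Placeholder for a shallow search whose leaves are scored by the
    # previous phase's evaluation function.
    return model(pos)

def fit_supervised(positions, labels):
    # Placeholder supervised learner (the paper uses ANNs or linear models);
    # here a one-parameter linear least-squares fit through the origin.
    num = sum(p * l for p, l in zip(positions, labels))
    den = sum(p * p for p in positions) or 1.0
    w = num / den
    return lambda pos: w * pos

def layered_learning(n_phases=4, positions_per_phase=1000):
    model = None
    for phase in range(n_phases):
        # Early phases train near the end of the game; later phases move
        # step by step towards the root of the game tree.
        depth = n_phases - phase
        positions = sample_positions_at_depth(depth, positions_per_phase)
        if model is None:
            labels = [exact_endgame_value(p) for p in positions]
        else:
            labels = [search_and_evaluate(p, model) for p in positions]
        model = fit_supervised(positions, labels)
    return model

if __name__ == "__main__":
    evaluate = layered_learning()
    print(evaluate(0.5), evaluate(-0.5))
```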

    A Survey of Monte Carlo Tree Search Methods

    Monte Carlo tree search (MCTS) is a recently proposed search method that combines the precision of tree search with the generality of random sampling. It has received considerable interest due to its spectacular success in the difficult problem of computer Go, but has also proved beneficial in a range of other domains. This paper is a survey of the literature to date, intended to provide a snapshot of the state of the art after the first five years of MCTS research. We outline the core algorithm's derivation, impart some structure on the many variations and enhancements that have been proposed, and summarize the results from the key game and non-game domains to which MCTS methods have been applied. A number of open research questions indicate that the field is ripe for future work.
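    For readers new to the core algorithm, the following is a compact UCT-style sketch of the four MCTS phases (selection, expansion, simulation, backpropagation), applied to a toy game of Nim rather than to Go or any domain from the survey. The game, constants, and structure are illustrative choices, not taken from the survey itself.

```python
# Compact UCT-style MCTS sketch on a toy Nim game (take 1-3 stones; the player
# taking the last stone wins). An illustrative example, not the survey's code.
import math
import random

MOVES = (1, 2, 3)

class Node:
    def __init__(self, stones, player, parent=None, move=None):
        self.stones, self.player = stones, player    # player to move here
        self.parent, self.move = parent, move
        self.children = []
        self.untried = [m for m in MOVES if m <= stones]
        self.visits, self.wins = 0, 0.0              # wins from the parent player's view

    def uct_child(self, c=1.4):
        return max(self.children,
                   key=lambda n: n.wins / n.visits +
                                 c * math.sqrt(math.log(self.visits) / n.visits))

def rollout(stones, player):
    # Play random moves until the pile is empty; return the winner.
    while stones > 0:
        stones -= random.choice([m for m in MOVES if m <= stones])
        if stones == 0:
            return player
        player = 1 - player
    return player

def mcts(root_stones, root_player, iterations=2000):
    root = Node(root_stones, root_player)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend while fully expanded and non-terminal.
        while not node.untried and node.children:
            node = node.uct_child()
        # 2. Expansion: add one untried child, if any.
        if node.untried:
            m = node.untried.pop()
            child = Node(node.stones - m, 1 - node.player, parent=node, move=m)
            node.children.append(child)
            node = child
        # 3. Simulation: random playout (terminal nodes need no playout).
        if node.stones == 0:
            winner = node.parent.player if node.parent else node.player
        else:
            winner = rollout(node.stones, node.player)
        # 4. Backpropagation: credit each node from the viewpoint of the
        #    player who made the move leading to it.
        while node is not None:
            node.visits += 1
            if node.parent is not None and winner == node.parent.player:
                node.wins += 1
            node = node.parent
    return max(root.children, key=lambda n: n.visits).move

if __name__ == "__main__":
    # With enough iterations this should usually print 2 (10 % 4), the optimal move.
    print(mcts(root_stones=10, root_player=0))
```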

    Quoridor agent using Monte Carlo Tree Search

    This thesis presents a preliminary study of Monte Carlo Tree Search (MCTS) applied to the board game Quoridor. The resulting system is shown to perform well against existing methods, defeating a set of player agents drawn from an existing digital implementation.