
    Learning to Play Board Games using Temporal Difference Methods

    A promising approach to learning to play board games is to use reinforcement learning algorithms that can learn a game-position evaluation function. In this paper we examine and compare three different methods for generating training games: (1) learning by self-play, (2) learning by playing against an expert program, and (3) learning from viewing experts play against each other. Although the third possibility generates high-quality games from the start, compared to the initially random games generated by self-play, its drawback is that the learning program is never allowed to test the moves it prefers. We compared these three methods using temporal difference methods to learn the game of backgammon. For particular games such as draughts and chess, learning from a large database of games played by human experts has the additional advantage that no expensive look-ahead planning is necessary for move selection during the generation of (useful) training games. Experimental results in this paper show how useful this method is for learning to play chess and draughts.
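
    The core idea described above is to learn a position evaluation function from the sequence of positions in a game, whichever way the training games are generated. As a minimal illustrative sketch (not the authors' implementation, which this listing does not describe in detail), the Python code below applies TD(0) updates to a simple linear evaluation function over position features; the learning rate, the absence of discounting, and the toy feature vectors are all assumptions made for illustration, and work in this line typically uses TD(lambda) with a neural-network evaluator instead.

    # Illustrative sketch only: TD(0) updates for a linear game-position
    # evaluation function. All constants and the toy "features" are assumed.
    import numpy as np

    ALPHA = 0.1   # learning rate (assumed value)
    GAMMA = 1.0   # no discounting within a game (assumed)

    def evaluate(weights, features):
        """Linear position evaluation: V(s) = w . phi(s)."""
        return float(np.dot(weights, features))

    def td0_update_game(weights, game_features, final_reward):
        """Apply TD(0) updates over the positions of one game.

        game_features: list of feature vectors phi(s_0), ..., phi(s_T)
        final_reward:  game outcome (e.g. +1 win, -1 loss), received at the end
        """
        for t in range(len(game_features) - 1):
            v_t = evaluate(weights, game_features[t])
            v_next = evaluate(weights, game_features[t + 1])
            td_error = GAMMA * v_next - v_t          # no intermediate reward
            weights += ALPHA * td_error * game_features[t]
        # Terminal update: move the last prediction toward the actual outcome.
        v_last = evaluate(weights, game_features[-1])
        weights += ALPHA * (final_reward - v_last) * game_features[-1]
        return weights

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        n_features = 8
        weights = np.zeros(n_features)
        # A toy "game": random feature vectors standing in for board positions,
        # ending in a win (+1). Real features would encode the board state.
        game = [rng.normal(size=n_features) for _ in range(20)]
        weights = td0_update_game(weights, game, final_reward=+1.0)
        print("learned weights:", weights)

    The same update rule applies regardless of whether the position sequences come from self-play, from games against an expert program, or from a database of expert games; only the source of the training positions changes.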
