Search CORE

12,837 research outputs found

Adaptive Critic Design in Learning to Play Game of Go

Author: Prokhorov Danil V.
Wunsch Donald C.
Zaman R.
Publication venue: Scholars\u27 Mine
Publication date: 01/01/1997
Field of study

This paper examines the performance of an HDP-type adaptive critic design (ACD) of the game Go. The game Go is an ideal problem domain for exploring machine learning; it has simple rules but requires complex strategies to play well. All current commercial Go programs are knowledge based implementations; they utilize input feature and pattern matching along with minimax type search techniques. But the extremely high branching factor puts a limit on their capabilities, and they are very weak compared to the relative strengths of other game programs like chess. In this paper, the Go-playing ACD consists of a critic network and an action network. The HDP type critic network learns to predict the cumulative utility function of the current board position from training games, and, the action network chooses a next move which maximizes critics next step cost-to-go. After about 6000 different training games against a public domain program, WALLY, the network (playing WHITE) began to win in some of the games and showed slow but steady improvements on test game

Missouri University of Science and Technology (Missouri S&T): Scholars' Mine

Deep learning for video game playing

Author: Bontrager Philip
Justesen Niels
Risi Sebastian
Togelius Julian
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2019
Field of study

In this article, we review recent Deep Learning advances in the context of how they have been applied to play different types of video games such as first-person shooters, arcade games, and real-time strategy games. We analyze the unique requirements that different game genres pose to a deep learning system and highlight important open challenges in the context of applying these machine learning methods to video games, such as general game playing, dealing with extremely large decision spaces and sparse rewards

arXiv.org e-Print Archive

The IT University of Copenhagen's Repository

Recommended from our members

Towards Informed Exploration for Deep Reinforcement Learning

Author: Tang Haoran
Publication venue: eScholarship, University of California
Publication date: 01/01/2019
Field of study

In this thesis, we discuss various techniques for improving exploration for deep reinforcement learning. We begin with a brief review of reinforcement learning (RL) and the fundamental v.s. exploitation trade-off. Then we review how deep RL has improved upon classical and summarize six categories of the latest exploration methods for deep RL, in the order increasing usage of prior information. We then explore representative works in three categories discuss their strengths and weaknesses. The first category, represented by Soft Q-learning, uses regularization to encourage exploration. The second category, represented by count-based via hashing, maps states to hash codes for counting and assigns higher exploration to less-encountered states. The third category utilizes hierarchy and is represented by modular architecture for RL agents to play StarCraft II. Finally, we conclude that exploration by prior knowledge is a promising research direction and suggest topics of potentially impact

eScholarship - University of California

Reinforcement Learning: A Survey

Author: Kaelbling L. P.
Littman M. L.
Moore A. W.
Publication venue
Publication date: 01/01/1996
Field of study

This paper surveys the field of reinforcement learning from a computer-science perspective. It is written to be accessible to researchers familiar with machine learning. Both the historical basis of the field and a broad selection of current work are summarized. Reinforcement learning is the problem faced by an agent that learns behavior through trial-and-error interactions with a dynamic environment. The work described here has a resemblance to work in psychology, but differs considerably in the details and in the use of the word ``reinforcement.'' The paper discusses central issues of reinforcement learning, including trading off exploration and exploitation, establishing the foundations of the field via Markov decision theory, learning from delayed reinforcement, constructing empirical models to accelerate learning, making use of generalization and hierarchy, and coping with hidden state. It concludes with a survey of some implemented systems and an assessment of the practical utility of current methods for reinforcement learning.Comment: See http://www.jair.org/ for any accompanying file

arXiv.org e-Print Archive

CiteSeerX