    Reinforcement Learning: A Survey

    This paper surveys the field of reinforcement learning from a computer-science perspective. It is written to be accessible to researchers familiar with machine learning. Both the historical basis of the field and a broad selection of current work are summarized. Reinforcement learning is the problem faced by an agent that learns behavior through trial-and-error interactions with a dynamic environment. The work described here has a resemblance to work in psychology, but differs considerably in the details and in the use of the word "reinforcement." The paper discusses central issues of reinforcement learning, including trading off exploration and exploitation, establishing the foundations of the field via Markov decision theory, learning from delayed reinforcement, constructing empirical models to accelerate learning, making use of generalization and hierarchy, and coping with hidden state. It concludes with a survey of some implemented systems and an assessment of the practical utility of current methods for reinforcement learning.
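
    As a concrete illustration of two of the central issues the survey names, the sketch below implements tabular Q-learning with epsilon-greedy exploration on a toy chain environment. The environment, hyperparameters and variable names are illustrative assumptions, not taken from the paper; the temporal-difference update shows how reward that arrives only at the terminal state (delayed reinforcement) is propagated backwards over repeated episodes.

    import random

    # Toy chain MDP (assumed for illustration): states 0..4, terminal at 4.
    N_STATES, N_ACTIONS = 5, 2            # actions: 0 = left, 1 = right
    ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

    Q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]

    def step(state, action):
        """Move one cell; only reaching the last state yields any reward."""
        nxt = max(0, min(N_STATES - 1, state + (1 if action == 1 else -1)))
        done = nxt == N_STATES - 1
        return nxt, (1.0 if done else 0.0), done

    for episode in range(500):
        s, done = 0, False
        while not done:
            # Exploration/exploitation trade-off: random action with
            # probability EPSILON, otherwise the current greedy action.
            if random.random() < EPSILON:
                a = random.randrange(N_ACTIONS)
            else:
                a = max(range(N_ACTIONS), key=lambda x: Q[s][x])
            s2, r, done = step(s, a)
            # Temporal-difference update: the terminal reward is gradually
            # propagated back along the chain (delayed reinforcement).
            target = r if done else r + GAMMA * max(Q[s2])
            Q[s][a] += ALPHA * (target - Q[s][a])
            s = s2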

    Agents for educational games and simulations

    This book consists mainly of revised papers that were presented at the Agents for Educational Games and Simulations (AEGS) workshop held on May 2, 2011, as part of the Autonomous Agents and MultiAgent Systems (AAMAS) conference in Taipei, Taiwan. The 12 full papers presented were carefully reviewed and selected from various submissions. The papers are organized in topical sections on middleware applications, dialogues and learning, adaptation and convergence, and agent applications.

    Critters in the Classroom: A 3D Computer-Game-Like Tool for Teaching Programming to Computer Animation Students

    The brewing crisis threatening computer science education is a well-documented fact. To counter this and to increase enrolment and retention in computer-science-related degrees, it has been suggested to make programming "more fun" and to offer "multidisciplinary and cross-disciplinary programs" [Carter 2006]. The Computer Visualisation and Animation undergraduate degree at the National Centre for Computer Animation (Bournemouth University) is such a programme. Computer programming forms an integral part of the curriculum of this technical arts degree, and as educators we constantly face the challenge of encouraging our students to engage with the subject. We address this with our C-Sheep system, a reimagining of the "Karel the Robot" teaching tool [Pattis 1981] using the modern 3D computer game graphics that today's students are familiar with. This provides a game-like setting for writing computer programs, using a task-specific set of instructions which allows users to take control of virtual entities acting within a micro world, effectively providing a graphical representation of the algorithms used. Whereas two decades ago students would be intrigued by a 2D top-down representation of the micro world, the lack of the visual gimmickry found in modern computer games now makes it extremely difficult to maintain the interest of students from today's "Plug&Play generation". It is therefore especially important to aim for a 3D game-like representation which is "attractive and highly motivating to today's generation of media-conscious students" [Moskal et al. 2004]. Our system uses a modern, platform-independent games engine, capable of presenting a visually rich virtual environment using a state-of-the-art rendering engine of a type usually found in entertainment systems. Our aim is to entice students to spend more time programming by providing them with an enjoyable experience. This paper discusses the 3D computer game technology employed in our system and presents examples of how it can be exploited to provide engaging exercises and a rewarding learning experience for our students.
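
    The abstract does not list the actual C-Sheep instruction set, so the sketch below is a hypothetical Karel-style micro world that conveys the idea: a task-specific set of commands steers a virtual entity around a grid, turning a student's program into visible behaviour. All class and method names (Sheep, move, turn_left, wall_ahead) are illustrative assumptions, not the system's API.

    # Hypothetical micro-world API, loosely in the spirit of Karel the Robot.
    class Sheep:
        DIRS = [(0, 1), (1, 0), (0, -1), (-1, 0)]    # north, east, south, west

        def __init__(self, width=10, height=10):
            self.width, self.height = width, height
            self.x, self.y, self.facing = 0, 0, 1    # start at origin, facing east

        def _ahead(self):
            dx, dy = self.DIRS[self.facing]
            return self.x + dx, self.y + dy

        def wall_ahead(self):
            nx, ny = self._ahead()
            return not (0 <= nx < self.width and 0 <= ny < self.height)

        def move(self):
            # Step one cell forward unless a wall blocks the way.
            if not self.wall_ahead():
                self.x, self.y = self._ahead()

        def turn_left(self):
            self.facing = (self.facing - 1) % 4

    # A student "program": walk the full perimeter of the micro world.
    sheep = Sheep()
    for _ in range(4):
        while not sheep.wall_ahead():
            sheep.move()
        sheep.turn_left()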

    The Importance of Clipping in Neurocontrol by Direct Gradient Descent on the Cost-to-Go Function and in Adaptive Dynamic Programming

    In adaptive dynamic programming, neurocontrol and reinforcement learning, the objective is for an agent to learn to choose actions so as to minimise a total cost function. In this paper we show that when discretized time is used to model the motion of the agent, it can be very important to apply "clipping" to the motion of the agent in the final time step of the trajectory. By clipping we mean that the final time step of the trajectory is truncated so that the agent stops exactly at the first terminal state reached, and no distance further. We demonstrate that when clipping is omitted, learning performance can fail to reach the optimum, and that when clipping is done properly, learning performance can improve significantly. The clipping problem we describe affects algorithms which use explicit derivatives of the model functions of the environment to calculate a learning gradient. These include Backpropagation Through Time for Control and methods based on Dual Heuristic Dynamic Programming. However, the clipping problem does not significantly affect methods based on Heuristic Dynamic Programming, Temporal Differences or Policy Gradient Learning algorithms. Similarly, the clipping problem does not affect fixed-length finite-horizon problems.
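
    A minimal one-dimensional sketch of the clipping idea, under assumed dynamics: an agent integrates towards a terminal boundary in fixed discrete time steps, and the cost is simply the distance travelled. The function and cost model below are illustrative, not the paper's; the point is that truncating the final step makes the trajectory (and hence any model-based gradient of its cost) stop exactly at the terminal state rather than overshoot it.

    X_TERMINAL = 1.0    # assumed terminal boundary of a 1-D world

    def rollout(v, dt=0.3, clip=True):
        """Integrate a constant-velocity trajectory; cost = distance travelled."""
        x, cost = 0.0, 0.0
        while x < X_TERMINAL:
            step = v * dt
            if clip and x + step > X_TERMINAL:
                # Truncate the final time step so the trajectory stops
                # exactly at the first terminal state reached, no further.
                step = X_TERMINAL - x
            x += step
            cost += step
        return cost

    print(rollout(v=1.0, clip=True))     # 1.0 -> cost of the true trajectory
    print(rollout(v=1.0, clip=False))    # 1.2 -> overshoot inflates the cost
                                         #        and biases its model gradients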