Reinforcement Learning: A Survey
This paper surveys the field of reinforcement learning from a
computer-science perspective. It is written to be accessible to researchers
familiar with machine learning. Both the historical basis of the field and a
broad selection of current work are summarized. Reinforcement learning is the
problem faced by an agent that learns behavior through trial-and-error
interactions with a dynamic environment. The work described here has a
resemblance to work in psychology, but differs considerably in the details and
in the use of the word "reinforcement." The paper discusses central issues of
reinforcement learning, including trading off exploration and exploitation,
establishing the foundations of the field via Markov decision theory, learning
from delayed reinforcement, constructing empirical models to accelerate
learning, making use of generalization and hierarchy, and coping with hidden
state. It concludes with a survey of some implemented systems and an assessment
of the practical utility of current methods for reinforcement learning.
Comment: See http://www.jair.org/ for any accompanying file
Agents for educational games and simulations
This book consists mainly of revised papers that were presented at the Agents for Educational Games and Simulation (AEGS) workshop held on May 2, 2011, as part of the Autonomous Agents and MultiAgent Systems (AAMAS) conference in Taipei, Taiwan. The 12 full papers presented were carefully reviewed and selected from various submissions. The papers are organized in topical sections on middleware applications, dialogues and learning, adaption and convergence, and agent applications.
Critters in the Classroom: A 3D Computer-Game-Like Tool for Teaching Programming to Computer Animation Students
The brewing crisis threatening computer science education is well documented. To counter this and to increase enrolment and retention in computer-science-related degrees, it has been suggested to make programming "more fun" and to offer "multidisciplinary and cross-disciplinary programs" [Carter 2006]. The Computer Visualisation and Animation undergraduate degree at the National Centre for Computer Animation (Bournemouth University) is such a programme. Computer programming forms an integral part of the curriculum of this technical arts degree, and as educators we constantly face the challenge of having to encourage our students to engage with the subject.
We intend to address this with our C-Sheep system, a reimagination of the "Karel the Robot" teaching tool [Pattis 1981], using modern 3D computer game graphics that today's students are familiar with. This provides a game-like setting for writing computer programs, using a task-specific set of instructions which allow users to take control of virtual entities acting within a micro world, effectively providing a graphical representation of the algorithms used. Whereas two decades ago, students would be intrigued by a 2D top-down representation of the micro world, the lack of the visual gimmickry found in modern computer games for representing the virtual world now makes it extremely difficult to maintain the interest of students from today's "Plug&Play generation". It is therefore especially important to aim for a 3D game-like representation which is "attractive and highly motivating to today's generation of media-conscious students" [Moskal et al. 2004].
Our system uses a modern, platform-independent games engine, capable of presenting a visually rich virtual environment using a state-of-the-art rendering engine of a type usually found in entertainment systems. Our aim is to entice students to spend more time programming by providing them with an enjoyable experience.
This paper provides a discussion of the 3D computer game technology employed in our system and presents examples of how this can be exploited to provide engaging exercises that create a rewarding learning experience for our students.
The Importance of Clipping in Neurocontrol by Direct Gradient Descent on the Cost-to-Go Function and in Adaptive Dynamic Programming
In adaptive dynamic programming, neurocontrol and reinforcement learning, the
objective is for an agent to learn to choose actions so as to minimise a total
cost function. In this paper we show that when discretized time is used to
model the motion of the agent, it can be very important to do "clipping" on the
motion of the agent in the final time step of the trajectory. By clipping we
mean that the final time step of the trajectory is to be truncated such that
the agent stops exactly at the first terminal state reached, and no distance
further. We demonstrate that when clipping is omitted, learning performance can
fail to reach the optimum; and when clipping is done properly, learning
performance can improve significantly.
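
As a rough illustration of the truncation described above (not the paper's own implementation), the following Python sketch clips the final time step of a one-dimensional trajectory. The setup is an assumption made only for this example: the agent moves at constant positive velocity, the terminal region is x >= boundary, and a constant cost rate is charged per unit time. The names clipped_step and rollout_cost are hypothetical. The sketch shows only the truncation itself, not its effect on learning gradients.

```python
from dataclasses import dataclass


@dataclass
class StepResult:
    position: float  # position after the (possibly clipped) step
    fraction: float  # fraction of the time step actually used
    terminal: bool   # True if the terminal boundary was reached in this step


def clipped_step(x: float, velocity: float, dt: float, boundary: float) -> StepResult:
    """Advance one time step, stopping exactly at the terminal boundary.

    Assumed setup (hypothetical): terminal region is x >= boundary, and the
    agent starts below it with positive velocity.
    """
    x_next = x + velocity * dt
    if x < boundary <= x_next:
        # Only part of this step is needed to reach the boundary.
        fraction = (boundary - x) / (x_next - x)
        return StepResult(boundary, fraction, True)
    return StepResult(x_next, 1.0, False)


def rollout_cost(x0, velocity, dt, boundary, cost_rate=1.0, clip=True, max_steps=1000):
    """Accumulate cost along a trajectory, with or without clipping."""
    x, total = x0, 0.0
    for _ in range(max_steps):
        step = clipped_step(x, velocity, dt, boundary)
        used = step.fraction if clip else 1.0
        total += cost_rate * dt * used
        # Without clipping the agent overshoots the boundary on the last step.
        x = step.position if clip else x + velocity * dt
        if step.terminal:
            break
    return x, total


if __name__ == "__main__":
    print(rollout_cost(0.0, 1.0, 0.4, 1.0, clip=True))
    print(rollout_cost(0.0, 1.0, 0.4, 1.0, clip=False))
```

With clipping, the accumulated cost matches the time actually needed to reach the boundary; without it, the whole final step is charged and the agent ends past the terminal state.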
The clipping problem we describe affects algorithms which use explicit
derivatives of the model functions of the environment to calculate a learning
gradient. These include Backpropagation Through Time for Control, and methods
based on Dual Heuristic Dynamic Programming. However, the clipping problem does
not significantly affect methods based on Heuristic Dynamic Programming,
Temporal Differences or Policy Gradient Learning algorithms. Similarly, the
clipping problem does not affect fixed-length finite-horizon problems
- …