Stable linear approximations to dynamic programming for stochastic control problems with local transitions
Benjamin Van Roy and John N. Tsitsiklis. Caption title. Includes bibliographical references (leaf [7]). Supported by NSF grant ECS 9216531, EPRI grant 8030-10, and the ARO.
Deterministic continuation of stochastic metastable equilibria via Lyapunov equations and ellipsoids
Numerical continuation methods for deterministic dynamical systems have been
one of the most successful tools in applied dynamical systems theory.
Continuation techniques have been employed in all branches of the natural
sciences as well as in engineering to analyze ordinary, partial and delay
differential equations. Here we show that the deterministic continuation
algorithm for equilibrium points can be extended to track information about
metastable equilibrium points of stochastic differential equations (SDEs). We
stress that we do not develop a new technical tool but that we combine results
and methods from probability theory, dynamical systems, numerical analysis,
optimization and control theory into an algorithm that augments classical
equilibrium continuation methods. In particular, we use ellipsoids defining
regions of high concentration of sample paths. It is shown that these
ellipsoids and the distances between them can be efficiently calculated using
iterative methods that take advantage of the numerical continuation framework.
We apply our method to a bistable neural competition model and a classical
predator-prey system. Furthermore, we show how global assumptions on the flow
can be incorporated - if they are available - by relating numerical
continuation, Kramers' formula and Rayleigh iteration.
Comment: 29 pages, 7 figures [Fig. 7 reduced in quality due to arXiv size restrictions]; v2 - added Section 9 on Kramers' formula, additional computations, corrected typos, improved explanation
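The ellipsoids of high sample-path concentration mentioned in this abstract can be illustrated with a minimal sketch: linearizing an SDE dX = f(X) dt + B dW at a stable equilibrium x* with Jacobian A = Df(x*), the stationary covariance S of the linearized process solves the Lyapunov equation A S + S A^T + B B^T = 0, and S defines the ellipsoid axes. The matrices A and B below are made-up assumptions, not taken from the paper.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Assumed Jacobian of the drift at a stable equilibrium
# (both eigenvalues in the open left half-plane).
A = np.array([[-1.0, 0.5],
              [0.0, -2.0]])
# Assumed diffusion (noise) matrix.
B = np.array([[0.1, 0.0],
              [0.0, 0.2]])

# solve_continuous_lyapunov solves A X + X A^T = Q,
# so pass Q = -B B^T to get A S + S A^T + B B^T = 0.
S = solve_continuous_lyapunov(A, -B @ B.T)

# Sample paths of the linearized SDE concentrate in the ellipsoid
#   E(c) = { x : (x - x*)^T S^{-1} (x - x*) <= c };
# its principal axes come from the eigendecomposition of S.
evals, evecs = np.linalg.eigh(S)
print(S)
print(evals)  # semi-axis lengths of E(1) are sqrt(evals)
```

In a continuation framework one would update S along the equilibrium branch, warm-starting the Lyapunov solve from the previous parameter value.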
The non-locality of Markov chain approximations to two-dimensional diffusions
In this short paper, we consider discrete-time Markov chains on lattices as
approximations to continuous-time diffusion processes. The approximations can
be interpreted as finite difference schemes for the generator of the process.
We derive conditions on the diffusion coefficients which permit transition
probabilities to match locally first and second moments. We derive a novel
formula which expresses how the matching becomes more difficult for larger
(absolute) correlations and strongly anisotropic processes, such that
instantaneous moves to more distant neighbours on the lattice have to be
allowed. Roughly speaking, for non-zero correlations, the distance covered in
one timestep is proportional to the ratio of volatilities in the two
directions. We discuss the implications to Markov decision processes and the
convergence analysis of approximations to Hamilton-Jacobi-Bellman equations in
the Barles-Souganidis framework.
Comment: Corrected two errata from the previous and journal version: the definition of R in (5) and the summations in (7).
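The local moment-matching obstruction described in this abstract can be demonstrated with a small sketch (my own illustration, not the paper's scheme): a standard lattice scheme matches the first two moments of a driftless 2-D diffusion by splitting mass between axis and diagonal neighbours, and the axis probabilities turn negative once the correlation is too large relative to the anisotropy.

```python
def scheme_probs(s1, s2, rho, h, dt):
    """Per-site transition probabilities of a lattice scheme matching
    the first two local moments of a driftless 2-D diffusion with
    volatilities s1, s2 and correlation rho (illustrative sketch)."""
    a11, a22 = s1 ** 2, s2 ** 2
    a12 = rho * s1 * s2
    r = dt / (2 * h ** 2)
    p_axis1 = r * (a11 - abs(a12))  # moves to (+-h, 0)
    p_axis2 = r * (a22 - abs(a12))  # moves to (0, +-h)
    p_diag = r * abs(a12)           # moves along the correlation diagonal
    p_stay = 1 - 2 * (p_axis1 + p_axis2 + p_diag)
    return p_axis1, p_axis2, p_diag, p_stay

# Mild correlation, isotropic volatilities: all probabilities valid.
ok = scheme_probs(1.0, 1.0, 0.5, h=0.1, dt=0.001)
# Strong correlation with anisotropic volatilities: the first axis
# probability becomes negative once |rho| > min(s1/s2, s2/s1), so a
# monotone scheme must use more distant lattice neighbours.
bad = scheme_probs(1.0, 2.0, 0.9, h=0.1, dt=0.001)
print(ok)
print(bad)
```

This negativity is exactly the non-locality phenomenon: to keep probabilities nonnegative, instantaneous moves to neighbours further out on the lattice have to be allowed.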
Reinforcement Learning: A Survey
This paper surveys the field of reinforcement learning from a
computer-science perspective. It is written to be accessible to researchers
familiar with machine learning. Both the historical basis of the field and a
broad selection of current work are summarized. Reinforcement learning is the
problem faced by an agent that learns behavior through trial-and-error
interactions with a dynamic environment. The work described here has a
resemblance to work in psychology, but differs considerably in the details and
in the use of the word ``reinforcement.'' The paper discusses central issues of
reinforcement learning, including trading off exploration and exploitation,
establishing the foundations of the field via Markov decision theory, learning
from delayed reinforcement, constructing empirical models to accelerate
learning, making use of generalization and hierarchy, and coping with hidden
state. It concludes with a survey of some implemented systems and an assessment
of the practical utility of current methods for reinforcement learning.
Comment: See http://www.jair.org/ for any accompanying file
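The exploration/exploitation trade-off and learning from delayed reinforcement discussed in this survey can be sketched with a minimal tabular Q-learning loop. The two-state MDP below is a made-up toy example, not one from the paper.

```python
import random

def q_learning(n_steps=2000, alpha=0.1, gamma=0.9, eps=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration on a toy
    two-state MDP (illustrative sketch)."""
    rng = random.Random(seed)
    n_states, n_actions = 2, 2

    # Toy deterministic MDP: action 1 in state 0 moves to state 1;
    # action 1 in state 1 yields reward 1 and resets to state 0;
    # action 0 always stays put with reward 0 (delayed reinforcement:
    # reaching the reward requires two well-chosen actions in a row).
    def step(s, a):
        if s == 0:
            return (1, 0.0) if a == 1 else (0, 0.0)
        return (0, 1.0) if a == 1 else (1, 0.0)

    Q = [[0.0] * n_actions for _ in range(n_states)]
    s = 0
    for _ in range(n_steps):
        # epsilon-greedy: explore with probability eps, else exploit.
        if rng.random() < eps:
            a = rng.randrange(n_actions)
        else:
            a = max(range(n_actions), key=lambda x: Q[s][x])
        s2, r = step(s, a)
        # one-step temporal-difference update
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2
    return Q

Q = q_learning()
print(Q)  # greedy policy should take action 1 in both states
```

With eps = 0 the agent never discovers the rewarding action and the Q-table stays at zero, which is the exploration/exploitation trade-off in its simplest form.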
- …