8,313 research outputs found

    Stable linear approximations to dynamic programming for stochastic control problems with local transitions

    Caption title. Includes bibliographical references (leaf [7]). Supported by the NSF (ECS 9216531), EPRI (8030-10), and the ARO. Benjamin Van Roy and John N. Tsitsiklis.

    Deterministic continuation of stochastic metastable equilibria via Lyapunov equations and ellipsoids

    Numerical continuation methods for deterministic dynamical systems have been one of the most successful tools in applied dynamical systems theory. Continuation techniques have been employed in all branches of the natural sciences as well as in engineering to analyze ordinary, partial and delay differential equations. Here we show that the deterministic continuation algorithm for equilibrium points can be extended to track information about metastable equilibrium points of stochastic differential equations (SDEs). We stress that we do not develop a new technical tool but that we combine results and methods from probability theory, dynamical systems, numerical analysis, optimization and control theory into an algorithm that augments classical equilibrium continuation methods. In particular, we use ellipsoids defining regions of high concentration of sample paths. It is shown that these ellipsoids and the distances between them can be efficiently calculated using iterative methods that take advantage of the numerical continuation framework. We apply our method to a bistable neural competition model and a classical predator-prey system. Furthermore, we show how global assumptions on the flow can be incorporated - if they are available - by relating numerical continuation, Kramers' formula and Rayleigh iteration.
    Comment: 29 pages, 7 figures [Fig.7 reduced in quality due to arXiv size restrictions]; v2 - added Section 9 on Kramers' formula, additional computations, corrected typos, improved explanation
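    Near a stable equilibrium, the concentration ellipsoid described in this abstract is determined by the covariance of the linearized SDE, which solves a Lyapunov equation. The sketch below shows only that single step, assuming additive noise, a hypothetical pitchfork-type drift invented for illustration, and SciPy's solve_continuous_lyapunov; the paper's actual algorithm embeds such a computation in a continuation loop, which is not reproduced here.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def jacobian(x, mu=1.0):
    # Jacobian of the illustrative drift f(x) = (mu*x1 - x1**3, -x2).
    return np.array([[mu - 3.0 * x[0] ** 2, 0.0],
                     [0.0, -1.0]])

def covariance_ellipsoid(x_star, sigma, F, mu=1.0):
    """Covariance C of fluctuations around a metastable equilibrium x_star.

    C solves the Lyapunov equation  A C + C A^T = -sigma^2 F F^T,
    where A is the drift Jacobian at x_star (assumed stable).
    The level set {x : (x - x_star)^T C^{-1} (x - x_star) <= r^2}
    is the concentration ellipsoid tracked along a continuation branch.
    """
    A = jacobian(x_star, mu)
    return solve_continuous_lyapunov(A, -sigma ** 2 * F @ F.T)

if __name__ == "__main__":
    x_star = np.array([1.0, 0.0])   # equilibrium of the toy drift for mu = 1
    F = np.eye(2)                   # additive noise, illustrative choice
    print(covariance_ellipsoid(x_star, sigma=0.1, F=F))
```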

    The non-locality of Markov chain approximations to two-dimensional diffusions

    In this short paper, we consider discrete-time Markov chains on lattices as approximations to continuous-time diffusion processes. The approximations can be interpreted as finite difference schemes for the generator of the process. We derive conditions on the diffusion coefficients which permit transition probabilities to match locally first and second moments. We derive a novel formula which expresses how the matching becomes more difficult for larger (absolute) correlations and strongly anisotropic processes, such that instantaneous moves to more distant neighbours on the lattice have to be allowed. Roughly speaking, for non-zero correlations, the distance covered in one timestep is proportional to the ratio of volatilities in the two directions. We discuss the implications for Markov decision processes and the convergence analysis of approximations to Hamilton-Jacobi-Bellman equations in the Barles-Souganidis framework.
    Comment: Corrected two errata from previous and journal version: definition of R in (5) and summations in (7)
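    As a concrete illustration of the local moment-matching constraint discussed above, the sketch below uses a standard Kushner-Dupuis-style construction (not code from the paper) for a driftless correlated 2D diffusion. It returns nearest-neighbour transition probabilities when non-negative probabilities matching the first two moments exist, and None otherwise, in which case moves to more distant neighbours would be needed. All volatilities, correlations, and grid parameters are illustrative.

```python
import numpy as np

def nearest_neighbour_probs(sigma1, sigma2, rho, h1, h2, dt):
    """Moment-matching probabilities on a lattice with spacings h1, h2.

    Moves considered (for rho >= 0): +/-(h1, 0), +/-(0, h2), +/-(h1, h2), stay.
    Matching the covariance a12 = rho*sigma1*sigma2 forces the diagonal move;
    the remaining axis probabilities can then become negative for large |rho|
    or strongly anisotropic volatilities.
    """
    a11, a22 = sigma1 ** 2, sigma2 ** 2
    a12 = abs(rho) * sigma1 * sigma2
    p_diag = 0.5 * dt * a12 / (h1 * h2)               # +/-(h1, h2)
    p1 = 0.5 * dt * (a11 / h1 ** 2 - a12 / (h1 * h2))  # +/-(h1, 0)
    p2 = 0.5 * dt * (a22 / h2 ** 2 - a12 / (h1 * h2))  # +/-(0, h2)
    p_stay = 1.0 - 2.0 * (p1 + p2 + p_diag)
    probs = np.array([p1, p2, p_diag, p_stay])
    return probs if np.all(probs >= 0.0) else None

# Strong correlation with anisotropic volatilities fails on a square grid...
print(nearest_neighbour_probs(1.0, 0.2, 0.9, h1=0.01, h2=0.01, dt=1e-5))
# ...but succeeds when the grid aspect ratio h2/h1 matches sigma2/sigma1.
print(nearest_neighbour_probs(1.0, 0.2, 0.9, h1=0.01, h2=0.002, dt=1e-5))
```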

    Reinforcement Learning: A Survey

    This paper surveys the field of reinforcement learning from a computer-science perspective. It is written to be accessible to researchers familiar with machine learning. Both the historical basis of the field and a broad selection of current work are summarized. Reinforcement learning is the problem faced by an agent that learns behavior through trial-and-error interactions with a dynamic environment. The work described here has a resemblance to work in psychology, but differs considerably in the details and in the use of the word "reinforcement." The paper discusses central issues of reinforcement learning, including trading off exploration and exploitation, establishing the foundations of the field via Markov decision theory, learning from delayed reinforcement, constructing empirical models to accelerate learning, making use of generalization and hierarchy, and coping with hidden state. It concludes with a survey of some implemented systems and an assessment of the practical utility of current methods for reinforcement learning.
    Comment: See http://www.jair.org/ for any accompanying file
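    Two of the central issues named in this abstract, trading off exploration against exploitation and learning from delayed reinforcement, are both visible in tabular Q-learning with an epsilon-greedy policy. The sketch below is a generic textbook illustration rather than code from the survey; the toy chain environment and all hyperparameter values are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 5-state chain: action 1 moves right, action 0 moves left;
# only reaching the last state pays off (delayed reinforcement).
N_STATES, N_ACTIONS = 5, 2

def step(state, action):
    next_state = min(state + 1, N_STATES - 1) if action == 1 else max(state - 1, 0)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward, next_state == N_STATES - 1

Q = np.zeros((N_STATES, N_ACTIONS))
alpha, gamma, epsilon = 0.1, 0.95, 0.1   # illustrative hyperparameters

for episode in range(500):
    state, done = 0, False
    while not done:
        # Epsilon-greedy: explore with probability epsilon, otherwise exploit.
        if rng.random() < epsilon:
            action = int(rng.integers(N_ACTIONS))
        else:
            action = int(np.argmax(Q[state]))
        next_state, reward, done = step(state, action)
        # Temporal-difference update propagates the delayed reward backwards.
        target = reward + gamma * (0.0 if done else np.max(Q[next_state]))
        Q[state, action] += alpha * (target - Q[state, action])
        state = next_state

print(Q)   # greedy policy: argmax over actions in each state ("move right")
```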