Stochastic recursive inclusion in two timescales with an application to the Lagrangian dual problem
In this paper we present a framework to analyze the asymptotic behavior of
two timescale stochastic approximation algorithms including those with
set-valued mean fields. This paper builds on the works of Borkar and Perkins &
Leslie. The framework presented herein is more general than the
synchronous two timescale framework of Perkins & Leslie; however, the
assumptions involved are easily verifiable. As an application, we use this
framework to analyze the two timescale stochastic approximation algorithm
corresponding to the Lagrangian dual problem in optimization theory.
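As a rough illustration of the kind of coupled iteration the abstract refers to, the following is a minimal sketch of a two timescale stochastic approximation scheme for a Lagrangian dual problem. It is not the algorithm or the analysis of the paper: the toy problem (minimize x^2 subject to 1 - x <= 0), the Gaussian noise, the step-size exponents, and the function name `two_timescale_dual` are all assumptions chosen for illustration, and the mean fields here are single-valued rather than set-valued. The primal variable is updated on the faster timescale and the multiplier on the slower one, with the ratio of slow to fast step sizes tending to zero.

```python
import random


def two_timescale_dual(num_steps=200_000, noise_std=0.1, seed=0):
    """Toy two timescale iteration for: minimize x^2 subject to 1 - x <= 0.

    Lagrangian: L(x, lam) = x^2 + lam * (1 - x), with saddle point x = 1, lam = 2.
    All modelling choices here are illustrative assumptions.
    """
    rng = random.Random(seed)
    x, lam = 0.0, 0.0
    for n in range(1, num_steps + 1):
        a_n = 0.5 / n ** 0.6   # faster step size (primal variable)
        b_n = 1.0 / n ** 0.9   # slower step size (multiplier); b_n / a_n -> 0

        # Fast recursion: noisy gradient descent on L(., lam) in x.
        grad_x = 2.0 * x - lam + rng.gauss(0.0, noise_std)
        x_next = x - a_n * grad_x

        # Slow recursion: noisy ascent on the constraint value, projected to lam >= 0.
        constraint = (1.0 - x) + rng.gauss(0.0, noise_std)
        lam = max(0.0, lam + b_n * constraint)

        x = x_next
    return x, lam


if __name__ == "__main__":
    x, lam = two_timescale_dual()
    print(f"x = {x:.3f}, lambda = {lam:.3f}")
```

Because the multiplier moves slowly, the primal iterate sees it as nearly constant and tracks the minimizer of the Lagrangian for the current multiplier, while the multiplier sees the primal iterate as already equilibrated. In this toy problem that minimizer is unique; when it is not, the slow recursion is naturally described by a set-valued map, which is the setting the abstract addresses.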
Two Timescale Stochastic Approximation with Controlled Markov noise and Off-policy temporal difference learning
We present for the first time an asymptotic convergence analysis of two
time-scale stochastic approximation driven by 'controlled' Markov noise. In
particular, both the faster and slower recursions have non-additive controlled
Markov noise components in addition to martingale difference noise. We analyze
the asymptotic behavior of our framework by relating it to limiting
differential inclusions in both time-scales that are defined in terms of the
ergodic occupation measures associated with the controlled Markov processes.
Finally, we present a solution to the off-policy convergence problem for
temporal difference learning with linear function approximation, using our
results.
Comment: 23 pages (relaxed some important assumptions from the previous
version), accepted in Mathematics of Operations Research in Feb, 201