Robust policy iteration for continuous-time stochastic control problem with unknown dynamics
In this article, we study a continuous-time stochastic control
problem using reinforcement learning (RL) techniques. The problem can be
viewed as a stochastic linear-quadratic two-person zero-sum differential game
(LQZSG). First, we propose an RL algorithm that iteratively solves the
stochastic game algebraic Riccati equation from collected state and control
data when all information about the system dynamics is unknown. In addition,
the algorithm needs to collect data only once during the iteration process.
Then, we discuss the robustness and convergence of the inner and outer loops
of the policy iteration algorithm, respectively, and show that when the error
of each iteration is within a certain range, the algorithm converges to a small
neighborhood of the saddle point of the stochastic LQZSG problem. Finally, we
apply the proposed RL algorithm to two simulation examples to verify its
effectiveness.
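
To make the inner-loop/outer-loop structure concrete, the following is a minimal sketch of a nested policy iteration for a zero-sum LQ game. It is not the algorithm of the article: it assumes the system matrices are known (the article works from data alone) and it drops the stochastic terms, so it solves the deterministic game algebraic Riccati equation A'P + PA + Q - P B R^{-1} B' P + gamma^{-2} P D D' P = 0. The matrices A, B, D, Q, R, the attenuation level gamma, the initial gain K0, and the function names are all illustrative assumptions.

```python
import numpy as np


def solve_lyapunov(Ac, M):
    """Solve Ac.T @ P + P @ Ac + M = 0 for P by vectorization."""
    n = Ac.shape[0]
    I = np.eye(n)
    # Column-major vec: vec(Ac.T P) = (I kron Ac.T) vec(P),
    #                   vec(P Ac)   = (Ac.T kron I) vec(P).
    lhs = np.kron(I, Ac.T) + np.kron(Ac.T, I)
    p = np.linalg.solve(lhs, -M.reshape(-1, order="F"))
    P = p.reshape(n, n, order="F")
    return 0.5 * (P + P.T)  # symmetrize against round-off


def nested_policy_iteration(A, B, D, Q, R, gamma, K0,
                            outer_iters=30, inner_iters=100, tol=1e-12):
    """Model-based nested policy iteration (deterministic simplification).

    Outer loop: improve the controller gain K (u = -K x).
    Inner loop: for fixed K, improve the disturbance gain L (w = L x).
    K0 is assumed to be stabilizing with sufficient attenuation.
    """
    n = A.shape[0]
    K = K0.copy()
    L = np.zeros((D.shape[1], n))
    P = np.zeros((n, n))
    for _ in range(outer_iters):
        L = np.zeros((D.shape[1], n))  # restart the inner loop
        for _ in range(inner_iters):
            Ac = A - B @ K + D @ L
            M = Q + K.T @ R @ K - gamma**2 * (L.T @ L)
            P = solve_lyapunov(Ac, M)         # policy evaluation
            L_next = D.T @ P / gamma**2       # disturbance improvement
            done = np.linalg.norm(L_next - L) < tol
            L = L_next
            if done:
                break
        K_next = np.linalg.solve(R, B.T @ P)  # controller improvement
        done = np.linalg.norm(K_next - K) < tol
        K = K_next
        if done:
            break
    return P, K, L


# Illustrative (hypothetical) problem data; A is stable, so K0 = 0 is admissible.
A = np.array([[0.0, 1.0], [-1.0, -2.0]])
B = np.array([[0.0], [1.0]])
D = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.eye(1)
gamma = 5.0
K0 = np.zeros((1, 2))

P, K, L = nested_policy_iteration(A, B, D, Q, R, gamma, K0)

# Residual of the deterministic game algebraic Riccati equation.
residual = (A.T @ P + P @ A + Q
            - P @ B @ np.linalg.solve(R, B.T) @ P
            + (P @ D @ D.T @ P) / gamma**2)
print("GARE residual norm:", np.linalg.norm(residual))
```

In this sketch each policy-evaluation step is a Lyapunov equation solved with known matrices. A data-driven method of the kind the abstract describes would replace that step with an estimate built from measured state and control trajectories, which is where the per-iteration errors considered in the robustness analysis would enter.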