    Model-Free Temporal Difference Learning for Non-Zero-Sum Games

    In this paper, we consider the two-player non-zero-sum game problem for continuous-time linear dynamic systems. It is shown that the non-zero-sum game problem reduces to solving a set of coupled algebraic Riccati equations, which are nonlinear algebraic matrix equations. Compared with the single algebraic Riccati equation that arises for a linear dynamic system with only one player, the coupled algebraic Riccati equations of the multi-player non-zero-sum game are more difficult to solve directly. First, the policy iteration algorithm is introduced to find the Nash equilibrium of the non-zero-sum game, which is a necessary and sufficient condition for solving the coupled algebraic Riccati equations. However, the policy iteration algorithm is offline and requires complete knowledge of the system dynamics. To overcome these issues, a novel online iterative algorithm, named the integral temporal difference learning algorithm, is developed. Moreover, an equivalent compact form of the integral temporal difference learning algorithm is also presented. It is shown that the integral temporal difference learning algorithm can be implemented online and requires only partial knowledge of the system dynamics. In addition, the closed-loop stability under the integral temporal difference learning algorithm is analyzed at each iteration step. Finally, a simulation study shows the effectiveness of the presented algorithm.
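    The offline baseline described above can be sketched as model-based policy iteration (Lyapunov iterations) for a two-player linear-quadratic game: alternate policy evaluation, which solves one Lyapunov equation per player, with policy improvement, which updates each player's feedback gain. The matrices below are an illustrative toy problem, not data from the paper, and cross-weighting terms between the players' control penalties are assumed to be zero for simplicity.

    ```python
    import numpy as np
    from scipy.linalg import solve_continuous_lyapunov

    # Hypothetical two-player LQ game (illustrative values, not from the paper):
    # dynamics x_dot = A x + B1 u1 + B2 u2, player i minimizes the integral of
    # x' Qi x + ui' Rii ui (cross penalties R12, R21 assumed zero).
    A = np.array([[-1.0, 0.5],
                  [0.0, -2.0]])
    B1 = np.array([[1.0], [0.0]])
    B2 = np.array([[0.0], [1.0]])
    Q1, Q2 = np.eye(2), np.eye(2)
    R11 = np.array([[1.0]])
    R22 = np.array([[1.0]])

    # Start from stabilizing gains; K1 = K2 = 0 works here since A is Hurwitz.
    K1 = np.zeros((1, 2))
    K2 = np.zeros((1, 2))
    for _ in range(50):
        Ac = A - B1 @ K1 - B2 @ K2  # closed-loop matrix under current policies
        # Policy evaluation: solve Ac' Pi + Pi Ac + Qi + Ki' Rii Ki = 0.
        P1 = solve_continuous_lyapunov(Ac.T, -(Q1 + K1.T @ R11 @ K1))
        P2 = solve_continuous_lyapunov(Ac.T, -(Q2 + K2.T @ R22 @ K2))
        # Policy improvement: Ki = Rii^{-1} Bi' Pi.
        K1 = np.linalg.solve(R11, B1.T @ P1)
        K2 = np.linalg.solve(R22, B2.T @ P2)

    # At a Nash equilibrium, the coupled-Riccati residuals vanish.
    Ac = A - B1 @ K1 - B2 @ K2
    res1 = Ac.T @ P1 + P1 @ Ac + Q1 + K1.T @ R11 @ K1
    res2 = Ac.T @ P2 + P2 @ Ac + Q2 + K2.T @ R22 @ K2
    ```

    Note that each policy-evaluation step uses A, B1, and B2 explicitly, which is exactly the complete-knowledge requirement the paper's integral temporal difference learning algorithm is designed to relax by replacing the Lyapunov solves with least-squares fits to measured trajectory data.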