Reinforcement Learning Control of Robotic Knee with Human in the Loop by Flexible Policy Iteration
We are motivated by the real challenges presented in a human-robot system to
develop new designs that are data-efficient and that offer system-level
performance guarantees such as stability and optimality. Existing
approximate/adaptive dynamic programming (ADP) results that analyze system
performance theoretically do not readily provide practically useful learning
control algorithms for this problem, and reinforcement learning (RL) algorithms
that address data efficiency usually lack performance
guarantees for the controlled system. This study fills these important voids by
introducing innovative features to the policy iteration algorithm. We introduce
flexible policy iteration (FPI), which can flexibly and organically integrate
experience replay and supplemental values from prior experience into the RL
controller. We establish system-level performance properties, including convergence of the
approximate value function, (sub)optimality of the solution, and stability of
the system. We demonstrate the effectiveness of FPI via realistic
simulations of the human-robot system. The problem addressed in
this study may be difficult to solve with design methods based on classical
control theory, as it is nearly impossible to obtain a customized mathematical
model of a human-robot system either online or offline. The results we have
obtained also indicate the great potential of RL control for solving realistic
and challenging problems with high-dimensional control inputs.
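
To make the idea of policy iteration over replayed experience concrete, the following is a minimal sketch and not the FPI algorithm itself. It assumes a hypothetical scalar linear plant with quadratic cost (the constants A_SYS, B_SYS, Q_COST, R_COST and all function names are illustrative and do not come from the paper), evaluates each policy by a batch least-squares fit of a quadratic Q-function to a fixed buffer of stored transitions, and improves the policy greedily. The learner only reads the stored transitions; it never uses the plant constants directly. The supplemental values from prior experience mentioned in the abstract are not modeled here.

```python
import math
import random

# Hypothetical scalar plant x' = A_SYS*x + B_SYS*u with cost Q_COST*x^2 + R_COST*u^2.
# These constants are assumptions for illustration only; the learner below
# never reads them, it only sees recorded transitions.
A_SYS, B_SYS, Q_COST, R_COST = 1.1, 1.0, 1.0, 1.0


def collect_transitions(n, seed=0):
    """Record n transitions (x, u, cost, x_next) from random states and inputs."""
    rng = random.Random(seed)
    buffer = []
    for _ in range(n):
        x, u = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
        cost = Q_COST * x * x + R_COST * u * u
        buffer.append((x, u, cost, A_SYS * x + B_SYS * u))
    return buffer


def features(x, u):
    """Quadratic basis so that Q(x, u) = w0*x^2 + w1*(2*x*u) + w2*u^2."""
    return [x * x, 2.0 * x * u, u * u]


def solve_linear(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = s / M[i][i]
    return z


def evaluate_policy(buffer, k):
    """Least-squares fit of the Bellman equation Q(x,u) - Q(x', -k*x') = cost
    for the policy u = -k*x, reusing every transition in the buffer."""
    A = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for x, u, cost, x_next in buffer:
        f_now = features(x, u)
        f_next = features(x_next, -k * x_next)
        d = [p - q for p, q in zip(f_now, f_next)]
        for i in range(3):
            b[i] += d[i] * cost
            for j in range(3):
                A[i][j] += d[i] * d[j]
    return solve_linear(A, b)


def policy_iteration(buffer, k0, iterations=10):
    """Alternate policy evaluation and greedy improvement on the same buffer.
    k0 must stabilize the plant for the evaluation step to be meaningful."""
    k = k0
    for _ in range(iterations):
        w = evaluate_policy(buffer, k)
        k = w[1] / w[2]  # minimizing Q over u gives u = -(w1/w2)*x
    return k
```

For example, policy_iteration(collect_transitions(60), k0=0.5) replays the same 60 stored transitions in every iteration. Because the data in this toy setting are noise-free, the resulting gain should agree closely with the gain obtained from the scalar Riccati equation for the assumed plant.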