Robust Reinforcement Learning: A Case Study in Linear Quadratic Regulation
This paper studies the robustness aspect of reinforcement learning algorithms
in the presence of errors. Specifically, we revisit the benchmark problem of
discrete-time linear quadratic regulation (LQR) and study the long-standing
open question: Under what conditions is the policy iteration method robustly
stable for dynamical systems with unbounded, continuous state and action
spaces? Using advanced stability results in control theory, it is shown that
policy iteration for LQR is inherently robust to small errors and enjoys local
input-to-state stability: whenever the error in each iteration is bounded and
small, the solutions of the policy iteration algorithm are also bounded, and,
moreover, enter and stay in a small neighborhood of the optimal LQR solution.
As an application, a novel off-policy optimistic least-squares policy iteration
for the LQR problem is proposed, when the system dynamics are subject to
additive stochastic disturbances. The proposed new results in robust
reinforcement learning are validated by a numerical example.
Comment: arXiv admin note: text overlap with arXiv:2005.0952
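The policy iteration scheme the abstract refers to can be sketched for a small discrete-time LQR problem. This is a minimal illustration, not the paper's algorithm: it uses the standard model-based evaluate/improve loop, and the system matrices, the initial gain `K0`, the function names, and the `error_scale` perturbation (a bounded random error added to the gain in each iteration) are all assumptions chosen for the example.

```python
import numpy as np

def evaluate_policy(A, B, Q, R, K):
    """Solve P = Q + K'RK + (A-BK)' P (A-BK) for a stabilizing gain K."""
    n = A.shape[0]
    A_cl = A - B @ K
    M = Q + K.T @ R @ K
    # Row-major vec identity: vec(A_cl' P A_cl) = kron(A_cl', A_cl') vec(P).
    lhs = np.eye(n * n) - np.kron(A_cl.T, A_cl.T)
    P = np.linalg.solve(lhs, M.reshape(-1)).reshape(n, n)
    return 0.5 * (P + P.T)  # symmetrize against round-off

def improve_policy(A, B, R, P):
    """Greedy gain for the cost matrix P: K = (R + B'PB)^{-1} B'PA."""
    return np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

def policy_iteration(A, B, Q, R, K0, iterations=30, error_scale=0.0, seed=0):
    """Policy iteration with an optional bounded error injected into each gain update."""
    rng = np.random.default_rng(seed)
    K = K0.copy()
    for _ in range(iterations):
        P = evaluate_policy(A, B, Q, R, K)
        K = improve_policy(A, B, R, P)
        # Hypothetical error model: bounded uniform perturbation of the gain.
        K = K + error_scale * rng.uniform(-1.0, 1.0, size=K.shape)
    return P, K

def riccati_residual(A, B, Q, R, P):
    """Residual of the discrete-time algebraic Riccati equation at P."""
    K = improve_policy(A, B, R, P)
    return P - (Q + A.T @ P @ (A - B @ K))

# Assumed example system: a discretized double integrator.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = np.array([[1.0]])
K0 = np.array([[1.0, 2.0]])  # stabilizing: A - B K0 has both eigenvalues at 0.9

P_exact, K_exact = policy_iteration(A, B, Q, R, K0)
P_noisy, K_noisy = policy_iteration(A, B, Q, R, K0, error_scale=1e-3)
```

Comparing `P_noisy` with `P_exact` gives a rough feel for the claim that small, bounded per-iteration errors keep the iterates in a small neighborhood of the optimal solution. It is only a numerical illustration on one system, not evidence for the general result.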
Real-Time Motion Planning of Legged Robots: A Model Predictive Control Approach
We introduce a real-time, constrained, nonlinear Model Predictive Control (MPC)
method for the motion planning of legged robots. The proposed approach uses a
constrained optimal control algorithm known as SLQ. We improve the efficiency of
this algorithm by introducing a multi-processing scheme for estimating the value
function in its backward pass. This pass has often been calculated as a single
process. This parallel SLQ algorithm can optimize longer time horizons without
a proportional increase in its computation time. Thus, our MPC algorithm can
generate optimized trajectories for the next few phases of the motion within
only a few milliseconds. This outperforms the state of the art by at least one
order of magnitude. The performance of the approach is validated on a quadruped
robot for generating dynamic gaits such as trotting.
Comment: 8 page
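The receding-horizon structure behind an MPC scheme of this kind can be sketched on a toy problem. This is a deliberately simplified stand-in, not the paper's method: it replaces SLQ with a plain finite-horizon Riccati recursion on an assumed linear model, has no constraints, and runs the backward pass in a single process. The function names, the system matrices, and the horizon length are hypothetical choices for illustration.

```python
import numpy as np

def backward_pass(A, B, Q, R, horizon):
    """Finite-horizon Riccati recursion; returns feedback gains ordered from t=0."""
    P = Q.copy()  # terminal cost, assumed equal to Q
    gains = []
    for _ in range(horizon):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)  # value function one step earlier
        gains.append(K)
    gains.reverse()  # the recursion runs backward in time
    return gains

def run_mpc(A, B, Q, R, x0, horizon=20, steps=200):
    """Re-solve the horizon problem at every step and apply only the first input."""
    x = x0.copy()
    trajectory = [x.copy()]
    for _ in range(steps):
        K_first = backward_pass(A, B, Q, R, horizon)[0]
        u = -K_first @ x
        x = A @ x + B @ u
        trajectory.append(x.copy())
    return np.array(trajectory)

# Assumed toy model: a discretized double integrator, not a legged robot.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = np.array([[1.0]])

trajectory = run_mpc(A, B, Q, R, x0=np.array([1.0, 0.0]))
```

In this sketch the backward pass is the part repeated at every control step, which is why its cost dominates the loop and why speeding it up, as the abstract describes for SLQ, would allow longer horizons within the same time budget.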