Robust satisfaction of temporal logic specifications via reinforcement learning

Aksaray, Derya; Belta, Calin; Jones, Austin; Kong, Zhaodan; Schwager, Mac

Authors: Derya Aksaray, Calin Belta, Austin Jones, Zhaodan Kong, Mac Schwager
Publication date: 1 January 2015
Abstract

We consider the problem of steering a system with unknown, stochastic dynamics to satisfy a rich, temporally layered task given as a signal temporal logic formula. We represent the system as a finite-memory Markov decision process whose transition probabilities are unknown and whose states are built from a partition of the state space. We present provably convergent reinforcement learning algorithms to maximize the probability of satisfying a given specification and to maximize the average expected robustness, i.e., a measure of how strongly the formula is satisfied. Robustness allows us to quantify progress towards satisfying a given specification. We demonstrate via a pair of robot navigation simulation case studies that, due to the quantification of progress towards satisfaction, reinforcement learning with robustness maximization performs better than probability maximization in terms of both probability of satisfaction and expected robustness when the number of training examples is low.
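
To make the notion of robustness concrete, the following minimal sketch uses the standard quantitative semantics of signal temporal logic on a one-dimensional, discrete-time signal: a predicate's robustness is its signed margin, "eventually" takes the maximum over time, "always" takes the minimum, and conjunction takes the minimum of its operands. The function names, the example formula, and the trajectories are illustrative assumptions and are not taken from the paper; the paper's learning algorithms are not reproduced here.

```python
from typing import Sequence


def rob_eventually_above(signal: Sequence[float], threshold: float) -> float:
    """Robustness of 'eventually (x > threshold)': best margin over time."""
    return max(x - threshold for x in signal)


def rob_always_below(signal: Sequence[float], threshold: float) -> float:
    """Robustness of 'always (x < threshold)': worst margin over time."""
    return min(threshold - x for x in signal)


def rob_and(r1: float, r2: float) -> float:
    """Robustness of a conjunction: the weaker of the two operands."""
    return min(r1, r2)


# Hypothetical trajectories (assumed data, for illustration only).
good = [0.0, 1.0, 2.5, 3.0, 2.0]
bad = [0.0, 1.0, 1.5]

# Example formula: eventually (x > 2) AND always (x < 5).
rob_good = rob_and(rob_eventually_above(good, 2.0), rob_always_below(good, 5.0))
rob_bad = rob_and(rob_eventually_above(bad, 2.0), rob_always_below(bad, 5.0))

print(rob_good)  # positive: the formula is satisfied with some margin
print(rob_bad)   # negative: the formula is violated
```

A positive value indicates satisfaction and a negative value indicates violation, while the magnitude indicates by how much. This graded signal is what lets a learner distinguish a near miss from a trajectory that is far from satisfying the specification.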

Available Versions

Boston University Institutional Repository (OpenBU)

oai:open.bu.edu:2144/29609

Last time updated on 09/07/2019