Technical Report: Distribution Temporal Logic: Combining Correctness with Quality of Estimation
We present a new temporal logic called Distribution Temporal Logic (DTL)
defined over predicates of belief states and hidden states of partially
observable systems. DTL can express properties involving uncertainty and
likelihood that cannot be described by existing logics. A co-safe formulation
of DTL is defined and algorithmic procedures are given for monitoring
executions of a partially observable Markov decision process with respect to
such formulae. A simulation case study of a rescue robotics application
outlines our approach.
Comment: Expanded version of "Distribution Temporal Logic: Combining Correctness with Quality of Estimation", to appear in IEEE CDC 201
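To make the monitoring procedure concrete, here is a minimal Python sketch of checking one DTL-style belief predicate along a POMDP execution. It is not the authors' implementation: the transition model T, observation model O, the 0.9 threshold, and the "victim located" region are illustrative assumptions.

    import numpy as np

    def bayes_update(belief, T, O, action, obs):
        """One-step belief update: predict with transition kernel T[action],
        then correct with the observation likelihoods O[:, obs]."""
        predicted = T[action].T @ belief        # prediction step
        unnorm = O[:, obs] * predicted          # measurement correction
        return unnorm / unnorm.sum()

    def monitor(beliefs, predicate):
        """Co-safe monitoring: return the first time the belief predicate
        holds (an 'eventually' property is then satisfied for good)."""
        for t, b in enumerate(beliefs):
            if predicate(b):
                return t                        # formula satisfied at time t
        return None                             # not satisfied yet

    # Hypothetical predicate: belief mass on hidden states {2, 3} (say, a
    # "victim located" region) exceeds 0.9.
    predicate = lambda b: b[2] + b[3] > 0.9

    # Toy run: 4 hidden states, 2 observations, a single 'stay' action.
    T = {0: np.eye(4)}
    O = np.array([[0.9, 0.1], [0.9, 0.1], [0.2, 0.8], [0.2, 0.8]])
    b = np.full(4, 0.25)
    beliefs = [b := bayes_update(b, T, O, 0, 1) for _ in range(6)]
    print(monitor(beliefs, predicate))          # -> 1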
Technical Report: A Receding Horizon Algorithm for Informative Path Planning with Temporal Logic Constraints
This technical report is an extended version of the paper 'A Receding Horizon
Algorithm for Informative Path Planning with Temporal Logic Constraints'
accepted to the 2013 IEEE International Conference on Robotics and Automation
(ICRA). This paper considers the problem of finding the most informative path
for a sensing robot under temporal logic constraints, a richer set of
constraints than have previously been considered in information gathering. An
algorithm for informative path planning is presented that leverages tools from
information theory and formal control synthesis, and is proven to give a path
that satisfies the given temporal logic constraints. The algorithm uses a
receding horizon approach in order to provide a reactive, on-line solution
while mitigating computational complexity. Statistics compiled from multiple
simulation studies indicate that this algorithm performs better than a baseline
exhaustive search approach.
Comment: Extended version of a paper accepted to the 2013 IEEE International Conference on Robotics and Automation (ICRA)
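A rough sketch of the receding-horizon idea, under stated assumptions rather than as the paper's algorithm: the 'feasible' oracle stands in for the formal-synthesis check (e.g., remaining within an accepting region of a product automaton), and 'info_gain' for the information-theoretic objective such as mutual information.

    from itertools import product

    def receding_horizon_step(state, actions, step, feasible, info_gain, H=3):
        """Score every feasible horizon-H action sequence and return only its
        first action; re-planning at each step yields the receding-horizon loop."""
        best_plan, best_gain = None, float("-inf")
        for plan in product(actions, repeat=H):   # all horizon-H action sequences
            s, path, ok = state, [], True
            for a in plan:
                s = step(s, a)
                path.append(s)
                if not feasible(path):            # prune prefixes violating the TL constraint
                    ok = False
                    break
            if ok:
                gain = info_gain(path)            # e.g. mutual information of measurements
                if gain > best_gain:
                    best_gain, best_plan = gain, plan
        return best_plan[0] if best_plan else None  # execute first action, then re-plan

Re-planning after each executed action is what keeps the solution reactive while capping the search to horizon H.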
Distributed Conjugate Gradient Method via Conjugate Direction Tracking
We present a distributed conjugate gradient method for distributed
optimization problems, where each agent computes an optimal solution of the
problem locally without any central computation or coordination, while
communicating with its immediate, one-hop neighbors over a communication
network. Each agent updates its local problem variable using an estimate of the
average conjugate direction across the network, computed via a dynamic
consensus approach. Our algorithm enables the agents to use uncoordinated
step-sizes. We prove convergence of the local variable of each agent to the
optimal solution of the aggregate optimization problem, without requiring
decreasing step-sizes. In addition, we demonstrate the efficacy of our
algorithm on distributed state estimation problems and their robust
counterparts, comparing its performance with that of existing distributed
first-order optimization methods.
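The following Python sketch illustrates the general shape of such a method under simplifying assumptions; it is not the authors' exact update. Each agent holds a local quadratic cost, forms a Fletcher-Reeves-style conjugate direction, and tracks the network-average direction with a dynamic-consensus step using a doubly stochastic mixing matrix W. A common step-size alpha is used here for brevity, whereas the paper allows uncoordinated step-sizes.

    import numpy as np

    def distributed_cg_sketch(A, b, W, steps=200, alpha=0.05):
        """A[i], b[i]: agent i's local quadratic cost 0.5*x'A[i]x - b[i]'x;
        W: doubly stochastic mixing matrix of the communication graph."""
        n, dim = len(A), b[0].shape[0]
        x = np.zeros((n, dim))                    # one row per agent
        g = np.stack([A[i] @ x[i] - b[i] for i in range(n)])  # local gradients
        d = -g.copy()                             # local conjugate directions
        y = d.copy()                              # tracker of the average direction
        for _ in range(steps):
            x = W @ x + alpha * y                 # mix with neighbors, then step
            g_new = np.stack([A[i] @ x[i] - b[i] for i in range(n)])
            beta = (g_new * g_new).sum(1) / np.maximum((g * g).sum(1), 1e-12)
            d_new = -g_new + beta[:, None] * d    # Fletcher-Reeves-style update
            y = W @ y + (d_new - d)               # dynamic consensus on the average direction
            g, d = g_new, d_new
        return x

    # Toy run: 3 agents, 2-D variable; the aggregate minimizer is [1, 1].
    A = [np.eye(2) * (i + 1) for i in range(3)]
    b = [np.ones(2) * (i + 1) for i in range(3)]
    W = np.array([[.5, .25, .25], [.25, .5, .25], [.25, .25, .5]])
    print(distributed_cg_sketch(A, b, W))         # rows approach [1., 1.]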
Robust Satisfaction of Temporal Logic Specifications via Reinforcement Learning
We consider the problem of steering a system with unknown, stochastic
dynamics to satisfy a rich, temporally layered task given as a signal temporal
logic formula. We represent the system as a Markov decision process in which
the states are built from a partition of the state space and the transition
probabilities are unknown. We present provably convergent reinforcement
learning algorithms to maximize the probability of satisfying a given formula
and to maximize the average expected robustness, i.e., a measure of how
strongly the formula is satisfied. We demonstrate via a pair of robot
navigation simulation case studies that reinforcement learning with robustness
maximization performs better than probability maximization in terms of both
probability of satisfaction and expected robustness.
Comment: 8 pages, 4 figures
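As a toy illustration of robustness maximization, and not the paper's provably convergent algorithms, the Python sketch below trains a time-indexed tabular policy on a hypothetical noisy grid world using a Monte-Carlo-style backup of the trajectory-level robustness of "eventually reach the goal cell". The grid size, noise level, formula, and learning rate are all assumptions.

    import numpy as np
    rng = np.random.default_rng(0)

    N, H = 8, 20                                  # grid size, episode horizon
    ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]
    goal = (7, 7)
    Q = np.zeros((N, N, H, len(ACTIONS)))         # time-indexed tabular Q

    def step(s, a):
        """Noisy grid move: intended action with prob 0.8, random otherwise."""
        dx, dy = ACTIONS[a] if rng.random() > 0.2 else ACTIONS[rng.integers(4)]
        return (min(max(s[0] + dx, 0), N - 1), min(max(s[1] + dy, 0), N - 1))

    def robustness(traj):
        """rho of 'eventually at goal': best negated distance along the run."""
        return max(1.0 - (abs(x - goal[0]) + abs(y - goal[1])) for x, y in traj)

    for episode in range(5000):
        s, traj, visited = (0, 0), [(0, 0)], []
        for t in range(H):                        # epsilon-greedy rollout
            a = rng.integers(4) if rng.random() < 0.1 else int(Q[s[0], s[1], t].argmax())
            visited.append((s, t, a))
            s = step(s, a)
            traj.append(s)
        rho = robustness(traj)                    # trajectory-level return
        for (s0, t, a) in visited:                # Monte-Carlo-style backup of rho
            Q[s0[0], s0[1], t, a] += 0.1 * (rho - Q[s0[0], s0[1], t, a])
    print("robustness of final training episode:", rho)

Because the return is the robustness of the whole trajectory rather than a per-step reward, the backup here is Monte-Carlo rather than a bootstrapped Q-learning target; handling this mismatch rigorously is exactly what the papers in this listing address.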
Robust satisfaction of temporal logic specifications via reinforcement learning
We consider the problem of steering a system with unknown, stochastic dynamics to satisfy a rich, temporally layered task given as a signal temporal logic formula. We represent the system as a finite-memory Markov decision process whose transition probabilities are unknown and whose states are built from a partition of the state space. We present provably convergent reinforcement learning algorithms to maximize the probability of satisfying a given specification and to maximize the average expected robustness, i.e., a measure of how strongly the formula is satisfied. Robustness allows us to quantify progress towards satisfying a given specification. We demonstrate via a pair of robot navigation simulation case studies that, due to this quantification of progress, reinforcement learning with robustness maximization performs better than probability maximization in terms of both probability of satisfaction and expected robustness, even with a small number of training examples.
Q-learning for robust satisfaction of signal temporal logic specifications
This paper addresses the problem of learning optimal policies for satisfying signal temporal logic (STL) specifications by agents with unknown stochastic dynamics. The system is modeled as a Markov decision process in which the states represent partitions of a continuous space and the transition probabilities are unknown. We formulate two synthesis problems in which the desired STL specification is enforced by maximizing, respectively, the probability of satisfaction and the expected robustness degree, that is, a measure quantifying the quality of satisfaction. We show that Q-learning is not directly applicable to these problems because, under the quantitative semantics of STL, the probability of satisfaction and the expected robustness degree are not in the standard objective form of Q-learning. To resolve this issue, we propose an approximation of the STL synthesis problems that can be solved via Q-learning, and we derive performance bounds for the policies obtained by the approximate approach. The performance of the proposed method is demonstrated via simulations.
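The robustness degree referenced throughout these abstracts follows the standard quantitative semantics of STL, which the small evaluator below sketches; it is textbook min/max semantics, not code from the paper, and the example formula is an illustrative assumption.

    def rho(phi, x, t=0):
        """Robustness of formula phi over discrete-time signal x at time t.
        Positive robustness means x satisfies phi; magnitude measures margin."""
        kind = phi[0]
        if kind == "pred":                # ("pred", f): satisfied iff f(x_t) >= 0
            return phi[1](x[t])
        if kind == "not":
            return -rho(phi[1], x, t)
        if kind == "and":                 # conjunction takes the pointwise min
            return min(rho(phi[1], x, t), rho(phi[2], x, t))
        if kind == "eventually":          # ("eventually", a, b, sub): max over window
            a, b, sub = phi[1], phi[2], phi[3]
            return max(rho(sub, x, k) for k in range(t + a, min(t + b, len(x) - 1) + 1))
        if kind == "always":              # ("always", a, b, sub): min over window
            a, b, sub = phi[1], phi[2], phi[3]
            return min(rho(sub, x, k) for k in range(t + a, min(t + b, len(x) - 1) + 1))
        raise ValueError(kind)

    # Example: "eventually within 10 steps, x > 0.5" on a ramp signal.
    phi = ("eventually", 0, 10, ("pred", lambda v: v - 0.5))
    print(rho(phi, [0.1 * k for k in range(12)]))   # 0.5: satisfied with margin 0.5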