Search CORE

13,157 research outputs found

A Framework for Worst-Case and Stochastic Safety Verification Using Barrier Certificates

Author: Jadbabaie Ali
Pappas George J.
Prajna Stephen
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2007
Field of study

This paper presents a methodology for safety verification of continuous and hybrid systems in the worst-case and stochastic settings. In the worst-case setting, a function of state termed barrier certificate is used to certify that all trajectories of the system starting from a given initial set do not enter an unsafe region. No explicit computation of reachable sets is required in the construction of barrier certificates, which makes it possible to handle nonlinearity, uncertainty, and constraints directly within this framework. In the stochastic setting, our method computes an upper bound on the probability that a trajectory of the system reaches the unsafe set, a bound whose validity is proven by the existence of a barrier certificate. For polynomial systems, barrier certificates can be constructed using convex optimization, and hence the method is computationally tractable. Some examples are provided to illustrate the use of the method

CiteSeerX

Caltech Authors

ScholarlyCommons@Penn

Value Iteration for Long-run Average Reward in Markov Decision Processes

Author: A Komuravelli
A McIver
AF Veinott
AK McIver
C Baier
C Courcoubetis
J Filar
K Chatterjee
K Chatterjee
K Chatterjee
K Chatterjee
M Duflot
M Kwiatkowska
M Kwiatkowska
M Kwiatkowska
ML Puterman
O Michael
RA Howard
S Giro
S Haddad
T Brázdil
T Brázdil
T Brázdil
Publication venue
Publication date: 31/08/2017
Field of study

Markov decision processes (MDPs) are standard models for probabilistic systems with non-deterministic behaviours. Long-run average rewards provide a mathematically elegant formalism for expressing long term performance. Value iteration (VI) is one of the simplest and most efficient algorithmic approaches to MDPs with other properties, such as reachability objectives. Unfortunately, a naive extension of VI does not work for MDPs with long-run average rewards, as there is no known stopping criterion. In this work our contributions are threefold. (1) We refute a conjecture related to stopping criteria for MDPs with long-run average rewards. (2) We present two practical algorithms for MDPs with long-run average rewards based on VI. First, we show that a combination of applying VI locally for each maximal end-component (MEC) and VI for reachability objectives can provide approximation guarantees. Second, extending the above approach with a simulation-guided on-demand variant of VI, we present an anytime algorithm that is able to deal with very large models. (3) Finally, we present experimental results showing that our methods significantly outperform the standard approaches on several benchmarks

arXiv.org e-Print Archive

Crossref