Scalable Verification of Markov Decision Processes

Author: Legay Axel
Sedwards Sean
Traonouez Louis-Marie
Publication date: 02/09/2014

Markov decision processes (MDPs) are useful for modelling concurrent process optimisation problems, but verifying them with numerical methods is often intractable. Existing approximative approaches do not scale well and are limited to memoryless schedulers. Here we present the basis of scalable verification for MDPs, using an O(1) memory representation of history-dependent schedulers. We thus facilitate scalable learning techniques and the use of massively parallel verification. Comment: V4: FMDS version, 12 pages, 4 figures
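A minimal sketch of how a history-dependent scheduler might be stored in constant memory, assuming a hash-based encoding: the scheduler is named by a single integer, and each action is derived from a fixed-size running hash of the trace. The function names, the use of SHA-256, and the `step(state, action, rng)` model interface are illustrative assumptions, not details taken from the abstract.

```python
import hashlib
import random


def extend_trace_hash(trace_hash: int, state: int) -> int:
    """Fold the newly visited state into a fixed-size running hash of the trace."""
    data = f"{trace_hash}:{state}".encode()
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


def choose_action(scheduler_id: int, trace_hash: int, num_actions: int) -> int:
    """Pick an action deterministically from (scheduler id, hash of the history)."""
    data = f"{scheduler_id}:{trace_hash}".encode()
    seed = int.from_bytes(hashlib.sha256(data).digest()[:8], "big")
    return random.Random(seed).randrange(num_actions)


def simulate(scheduler_id, step, initial_state, num_actions, horizon, sim_seed):
    """Run one trace under the scheduler named by scheduler_id.

    `step(state, action, rng)` is a hypothetical model interface that
    returns the next state; only two integers (the scheduler id and the
    running trace hash) are needed to represent the scheduler.
    """
    rng = random.Random(sim_seed)  # resolves the probabilistic choices
    state, trace_hash, actions = initial_state, 0, []
    for _ in range(horizon):
        trace_hash = extend_trace_hash(trace_hash, state)
        action = choose_action(scheduler_id, trace_hash, num_actions)
        actions.append(action)
        state = step(state, action, rng)
    return state, actions


def toy_step(state: int, action: int, rng: random.Random) -> int:
    """Toy 5-state model used only for illustration."""
    return (state + action + rng.randrange(2)) % 5


final_state, chosen = simulate(7, toy_step, 0, 3, 10, sim_seed=1)
```

Because the chosen action depends only on the scheduler id and the trace hash, independent simulations can reuse the same scheduler without sharing any state, which is what makes this kind of encoding attractive for parallel runs.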

arXiv.org e-Print Archive

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

HAL-Rennes 1

Probabilistic Guarantees for Safe Deep Reinforcement Learning

Author: E Ohn-Bar
EM Hahn
G Katz
J Garcia
J Kemeny
M Kattenbelt
M Kwiatkowska
M Lahijania
MC Machado
R Ehlers
S Junges
SEZ Soudjani
T Brázdil
V Mnih
X Huang
Publication date: 29/06/2020

Deep reinforcement learning has been successfully applied to many control tasks, but the application of such agents in safety-critical scenarios has been limited due to safety concerns. Rigorous testing of these controllers is challenging, particularly when they operate in probabilistic environments due to, for example, hardware faults or noisy sensors. We propose MOSAIC, an algorithm for measuring the safety of deep reinforcement learning agents in stochastic settings. Our approach is based on the iterative construction of a formal abstraction of a controller's execution in an environment, and leverages probabilistic model checking of Markov decision processes to produce probabilistic guarantees on safe behaviour over a finite time horizon. It produces bounds on the probability of safe operation of the controller for different initial configurations and identifies regions where correct behaviour can be guaranteed. We implement and evaluate our approach on agents trained for several benchmark control problems.
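The finite-horizon guarantee described above can be illustrated with a small sketch, assuming the abstraction has already been built as an explicit MDP. It computes, for each state, the minimum and maximum probability of avoiding unsafe states for a given number of steps by standard bounded value iteration. The dictionary layout and the toy model are assumptions for illustration and are not MOSAIC's actual data structures.

```python
from typing import Dict, List, Set, Tuple

# mdp[state][action] = list of (probability, next_state) pairs
MDP = Dict[str, Dict[str, List[Tuple[float, str]]]]


def safety_bounds(mdp: MDP, unsafe: Set[str], horizon: int) -> Dict[str, Tuple[float, float]]:
    """Return (lower, upper) bounds on the probability of staying safe.

    The lower bound takes the worst action in every state and the upper
    bound the best one, so the true value under any resolution of the
    nondeterminism lies between them.
    """
    lo = {s: 0.0 if s in unsafe else 1.0 for s in mdp}
    hi = dict(lo)
    for _ in range(horizon):
        new_lo, new_hi = {}, {}
        for s, actions in mdp.items():
            if s in unsafe:
                # Once unsafe, the run counts as a violation.
                new_lo[s], new_hi[s] = 0.0, 0.0
                continue
            new_lo[s] = min(sum(p * lo[t] for p, t in succ) for succ in actions.values())
            new_hi[s] = max(sum(p * hi[t] for p, t in succ) for succ in actions.values())
        lo, hi = new_lo, new_hi
    return {s: (lo[s], hi[s]) for s in mdp}


# Toy abstraction: action "a" risks failure, action "b" is always safe.
toy_mdp: MDP = {
    "ok": {"a": [(0.9, "ok"), (0.1, "fail")], "b": [(1.0, "ok")]},
    "fail": {"stay": [(1.0, "fail")]},
}
bounds = safety_bounds(toy_mdp, unsafe={"fail"}, horizon=2)
```

For the toy model with a horizon of two steps, the lower bound for `"ok"` is 0.9 × 0.9 = 0.81 (always taking the risky action) and the upper bound is 1.0.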

arXiv.org e-Print Archive

Crossref

University of Birmingham Research Portal

Should We Learn Probabilistic Models for Model Checking? A New Approach and An Empirical Study

Author: A Bauer
A Bianco
A Itai
A Mizera
C Baier
C de la Higuera
C Kermorvant
C Rohr
D Angluin
D Ron
D Tabakov
EM Clarke
F He
G Norman
HL Younes
HLS Younes
I Shmulevich
JH Holland
K Havelund
K Sen
L Helmink
M Kwiatkowska
MK Reiter
RC Carrasco
T Brázdil
T Herman
Y Chen
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

Many automated system analysis techniques (e.g., model checking, model-based testing) rely on first obtaining a model of the system under analysis. System modeling is often done manually, which is widely considered a hindrance to adopting model-based system analysis and development techniques. To overcome this problem, researchers have proposed to automatically "learn" models from sample system executions and have shown that the learned models can sometimes be useful. Many questions remain, however. For instance, how much should we generalize from the observed samples, and how fast does learning converge? Would the analysis result based on the learned model be more accurate than the estimate we could have obtained by sampling many system executions within the same amount of time? In this work, we investigate existing algorithms for learning probabilistic models for model checking, propose an evolution-based approach for better controlling the degree of generalization, and conduct an empirical study to answer these questions. One of our findings is that the effectiveness of learning may sometimes be limited. Comment: 15 pages, plus 2 reference pages, accepted by FASE 2017 in ETAPS

arXiv.org e-Print Archive

Crossref

Institutional Knowledge at Singapore Management University

Open Repository and Bibliography - Luxembourg

Learning Markov Decision Processes for Model Checking

Author: Chen Yingke
Jaeger Manfred
Larsen Kim G.
Mao Hua
Nielsen Brian
Nielsen Thomas D.
Publication venue: 'Open Publishing Association'
Publication date: 01/01/2012

Constructing an accurate system model for formal model verification can be both resource-demanding and time-consuming. To alleviate this shortcoming, algorithms have been proposed for automatically learning system models based on observed system behaviors. In this paper we extend the algorithm for learning probabilistic automata to reactive systems, where the observed system behavior is in the form of alternating sequences of inputs and outputs. We propose an algorithm for automatically learning a deterministic labeled Markov decision process model from the observed behavior of a reactive system. The proposed learning algorithm is adapted from algorithms for learning deterministic probabilistic finite automata, and extended to include both probabilistic and nondeterministic transitions. The algorithm is empirically analyzed and evaluated by learning system models of slot machines. The evaluation is performed by analyzing the probabilistic linear temporal logic properties of the system as well as by analyzing the schedulers, in particular the optimal schedulers, induced by the learned models. Comment: In Proceedings QFM 2012, arXiv:1212.345

arXiv.org e-Print Archive

Directory of Open Access Journals

VBN

Stochastic Shortest Path with Energy Constraints in POMDPs

Author: Brázdil Tomáš
Chatterjee Krishnendu
Chmelík Martin
Gupta Anchit
Novotný Petr
Publication date: 01/01/2016

We consider partially observable Markov decision processes (POMDPs) with a set of target states and positive integer costs associated with every transition. The traditional optimization objective (stochastic shortest path) asks to minimize the expected total cost until the target set is reached. We extend the traditional framework of POMDPs to model energy consumption, which represents a hard constraint. The energy levels may increase and decrease with transitions, and the hard constraint requires that the energy level remain positive in all steps until the target is reached. First, we present a novel algorithm for solving POMDPs with energy levels, building on existing POMDP solvers and using RTDP as its main method. Our second contribution is related to policy representation. For larger POMDP instances the policies computed by existing solvers are too large to be understandable. We present an automated procedure, based on machine learning techniques, that extracts the important decisions of the policy, allowing us to compute succinct human-readable policies. Finally, we show experimentally that our algorithm performs well and computes succinct policies on a number of POMDP instances from the literature that were naturally enhanced with energy levels. Comment: Technical report accompanying a paper published in proceedings of AAMAS 201
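One way to write the objective described above, as a sketch only: the symbols are assumptions introduced here for illustration, namely a policy $\sigma$, transition cost $c$, energy change $\delta$, initial energy $e_0$, and $T$ the first step at which the target set is reached.

```latex
\min_{\sigma} \; \mathbb{E}^{\sigma}\!\left[ \sum_{i=0}^{T-1} c(s_i, a_i, s_{i+1}) \right]
\quad \text{subject to} \quad
e_0 + \sum_{j=0}^{i-1} \delta(s_j, a_j, s_{j+1}) > 0
\quad \text{for all } 0 \le i \le T .
```

The expectation is the usual stochastic shortest path objective, while the constraint is hard in the sense that it must hold at every step of a run rather than only on average.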

arXiv.org e-Print Archive

IST Austria: PubRep (Institute of Science and Technology)

Smart Sampling for Lightweight Verification of Markov Decision Processes

Author: D'Argenio Pedro
Legay Axel
Sedwards Sean
Traonouez Louis-Marie
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2015

Markov decision processes (MDPs) are useful for modelling optimisation problems in concurrent systems. Verifying MDPs with efficient Monte Carlo techniques requires that their nondeterminism be resolved by a scheduler. Recent work has introduced the elements of lightweight techniques to sample directly from scheduler space, but finding optimal schedulers by simple sampling may be inefficient. Here we describe "smart" sampling algorithms that can make substantial improvements in performance. Comment: IEEE conference style, 11 pages, 5 algorithms, 11 figures, 1 table
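The abstract does not spell out the algorithms, so the following is only a sketch of one plausible reading of "smart" sampling: a successive-halving loop that spends a fixed simulation budget per round, keeps the better-scoring half of the candidate schedulers, and repeats until one remains. The function names and the `run_once(scheduler_id, rng)` interface are hypothetical.

```python
import random
from typing import Callable, List


def smart_sample(candidates: List[int],
                 run_once: Callable[[int, random.Random], bool],
                 budget_per_round: int,
                 rng: random.Random) -> int:
    """Return the candidate scheduler that survives successive halving.

    `run_once(scheduler_id, rng)` is a hypothetical callback that runs one
    simulation under the given scheduler and reports whether the property
    held. `candidates` must be non-empty.
    """
    pool = list(candidates)
    while len(pool) > 1:
        # Split the round's budget evenly over the surviving candidates.
        sims_each = max(1, budget_per_round // len(pool))
        scores = {
            c: sum(run_once(c, rng) for _ in range(sims_each)) / sims_each
            for c in pool
        }
        # Keep the better half; fewer survivors means more simulations each.
        pool.sort(key=scores.get, reverse=True)
        pool = pool[: max(1, len(pool) // 2)]
    return pool[0]


# Toy stand-in for simulation: each scheduler has a fixed success probability.
toy_probs = {0: 0.2, 1: 0.5, 2: 0.9, 3: 0.4}


def toy_run(scheduler_id: int, rng: random.Random) -> bool:
    return rng.random() < toy_probs[scheduler_id]


best = smart_sample(list(toy_probs), toy_run, budget_per_round=400, rng=random.Random(0))
```

Compared with giving every sampled scheduler the same number of simulations, this concentrates effort on the candidates that look most promising, which is the inefficiency of simple sampling that the abstract points to.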

arXiv.org e-Print Archive

HAL-CentraleSupelec

CONICET Digital

INRIA a CCSD electronic archive server

Repositorio Digital de la Universidad Nacional de Córdoba

HAL-Rennes 1