Automated Experiment Design for Data-Efficient Verification of Parametric Markov Decision Processes

Author: Abate Alessandro
Haesaert Sofie
Polgreen Elizabeth
Wijesuriya Viraj
Publication date: 01/01/2017

We present a new method for statistical verification of quantitative properties over a partially unknown system with actions, utilising a parameterised model (in this work, a parametric Markov decision process) and data collected from experiments performed on the underlying system. We obtain the confidence that the underlying system satisfies a given property, and show that the method uses data efficiently and is thus robust to the amount of data available. These characteristics are achieved by, firstly, exploiting parameter synthesis to establish a feasible set of parameters for which the underlying system will satisfy the property; secondly, actively synthesising experiments to increase the amount of information in the collected data that is relevant to the property; and finally, propagating this information over the model parameters, obtaining a confidence that reflects our belief whether or not the system parameters lie in the feasible set, thereby solving the verification problem.

Comment: QEST 2017, 18 pages, 7 figures

arXiv.org e-Print Archive

Repository TU/e
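
The final step of the abstract above (turning collected data into a confidence that the parameters lie in the feasible set) can be illustrated with a minimal sketch. It assumes a single unknown transition probability, a uniform Beta(1, 1) prior, and a feasible set that is an interval [lo, hi]; none of these choices, nor the function names, come from the paper, and the paper's own procedure may differ.

```python
from math import comb


def beta_cdf(x: float, a: int, b: int) -> float:
    """CDF of Beta(a, b) at x for positive integers a, b.

    Uses the identity I_x(a, b) = P(Binomial(a + b - 1, x) >= a).
    """
    n = a + b - 1
    return sum(comb(n, j) * x**j * (1 - x) ** (n - j) for j in range(a, n + 1))


def confidence(successes: int, failures: int, lo: float, hi: float) -> float:
    """Posterior probability that the unknown parameter lies in [lo, hi].

    Assumption: uniform Beta(1, 1) prior, Bernoulli observations of one
    transition (hypothetical setup, chosen only for illustration).
    """
    a, b = 1 + successes, 1 + failures
    return beta_cdf(hi, a, b) - beta_cdf(lo, a, b)


if __name__ == "__main__":
    # Same observed frequency, ten times the data: the confidence sharpens.
    print(confidence(8, 2, 0.6, 1.0))
    print(confidence(80, 20, 0.6, 1.0))
```

With no data the confidence equals the prior mass of the interval, and it moves towards 0 or 1 as observations accumulate.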

Certified Reinforcement Learning with Logic Guidance

Author: Abate Alessandro
Hasanbeig Mohammadhosein
Kroening Daniel
Publication date: 10/02/2020

This paper proposes the first model-free Reinforcement Learning (RL) framework to synthesise policies for unknown, continuous-state Markov Decision Processes (MDPs), such that a given linear temporal property is satisfied. We convert the given property into a Limit Deterministic Büchi Automaton (LDBA), namely a finite-state machine expressing the property. Exploiting the structure of the LDBA, we shape a synchronous reward function on-the-fly, so that an RL algorithm can synthesise a policy resulting in traces that probabilistically satisfy the linear temporal property. This probability (certificate) is also calculated in parallel with policy learning when the state space of the MDP is finite: as such, the RL algorithm produces a policy that is certified with respect to the property. Under the assumption of a finite state space, theoretical guarantees are provided on the convergence of the RL algorithm to an optimal policy, maximising the above probability. We also show that our method produces "best available" control policies when the logical property cannot be satisfied. In the general case of a continuous state space, we propose a neural network architecture for RL and we empirically show that the algorithm finds satisfying policies, if such policies exist. The performance of the proposed framework is evaluated via a set of numerical examples and benchmarks, where we observe an improvement of one order of magnitude in the number of iterations required for policy synthesis, compared to existing approaches whenever available.

Comment: This article draws from arXiv:1801.08099, arXiv:1809.0782

arXiv.org e-Print Archive
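
The idea of shaping a reward from an automaton that runs in lockstep with the MDP can be sketched on a toy problem. This is a deliberate simplification, not the paper's construction: it uses a small deterministic automaton for "eventually a, and afterwards eventually b" rather than an LDBA with Büchi acceptance, a hypothetical five-cell corridor, and plain tabular Q-learning with uniform exploration. All names are illustrative.

```python
import random

# Hypothetical labelled corridor of 5 cells: cell 4 carries label "a", cell 0 label "b".
LABELS = {4: "a", 0: "b"}
ACTIONS = (-1, +1)  # move left / move right


def automaton_step(q, label):
    """Automaton for 'eventually a, then eventually b'.

    States: 0 = waiting for a, 1 = waiting for b, 2 = accepting.
    """
    if q == 0 and label == "a":
        return 1
    if q == 1 and label == "b":
        return 2
    return q


def env_step(cell, action):
    """Deterministic move, clipped to the corridor [0, 4]."""
    return min(4, max(0, cell + action))


def train(episodes=2000, horizon=50, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning over product states (cell, automaton state)."""
    rng = random.Random(seed)
    Q = {(c, q, a): 0.0 for c in range(5) for q in range(3) for a in ACTIONS}
    for _ in range(episodes):
        cell, q = 2, 0
        for _ in range(horizon):
            a = rng.choice(ACTIONS)  # uniform exploration; Q-learning is off-policy
            nxt = env_step(cell, a)
            nq = automaton_step(q, LABELS.get(nxt))
            done = nq == 2
            # Reward is issued by the automaton, not by the environment.
            reward = 1.0 if done else 0.0
            best_next = max(Q[(nxt, nq, b)] for b in ACTIONS)
            target = reward if done else gamma * best_next
            Q[(cell, q, a)] += alpha * (target - Q[(cell, q, a)])
            if done:
                break
            cell, q = nxt, nq
    return Q


def greedy(Q, cell, q):
    """Greedy action in a product state."""
    return max(ACTIONS, key=lambda a: Q[(cell, q, a)])
```

The learned policy depends on the automaton state as well as the cell: before "a" has been seen the agent should head right, and afterwards it should head left, which a policy over cells alone could not express.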

Toward Specification-Guided Active Mars Exploration for Cooperative Robot Teams

Author: Agha-Mohammadi Ali-Akbar
Ames Aaron D.
Haesaert Sofie
Murray Richard M.
Nilsson Petter
Otsu Kyohei
Thakker Rohan
Vasile Cristian-Ioan
Publication venue: 'Robotics: Science and Systems Foundation'
Publication date: 01/06/2018

As a step towards achieving autonomy in space exploration missions, we consider a cooperative robotics system consisting of a copter and a rover. The goal of the copter is to explore an unknown environment so as to maximize knowledge about a science mission expressed in linear temporal logic that is to be executed by the rover. We model environmental uncertainty as a belief space Markov decision process and formulate the problem as a two-step stochastic dynamic program that we solve in a way that leverages the decomposed nature of the overall system. We demonstrate in simulations that the robot team makes intelligent decisions in the face of uncertainty.

Caltech Authors

Verification of Uncertain POMDPs Using Barrier Certificates

Author: Ahmadi Mohamadreza
Cubuktepe Murat
Jansen Nils
Topcu Ufuk
Publication date: 10/07/2018

We consider a class of partially observable Markov decision processes (POMDPs) with uncertain transition and/or observation probabilities. The uncertainty takes the form of probability intervals. Such uncertain POMDPs can be used, for example, to model autonomous agents with sensors of limited accuracy, or agents undergoing a sudden component failure or structural damage [1]. Given an uncertain POMDP representation of the autonomous agent, our goal is to propose a method for checking whether the system achieves optimal performance without violating a safety requirement (e.g. fuel level, velocity, etc.). To this end, we cast the POMDP problem into a switched system scenario. We then take advantage of this switched system characterization and propose a method based on barrier certificates for optimality and/or safety verification. We further show that the verification task can be carried out computationally by sum-of-squares programming. We illustrate the efficacy of our method by applying it to a Mars rover exploration example.

Comment: 8 pages, 4 figures

arXiv.org e-Print Archive

Computing Nash Equilibrium in Wireless Ad Hoc Networks: A Simulation-Based Approach

Author: A. B. Mackenzie
Abdorasoul Ghasemi
Alexandre David
Axel Legay
Bernd Finkbeiner
Ehud Kalai
Eitan Altman
Franck Cassez
Gerd Behrmann
Hakan L. S. Younes
IEEE Computer Society
Irfan Zakiuddin
J. Neel
Johannes Dams
Johannes Reich
Kim G. Larsen
Levente Buttyan
Marius Mikučionis
Mark Felegyhazi
Norman Abramson
P Nuggehalli
Parosh Abdulla
Patricia Bouyer
Peter Bulychev
Rajeev Alur
V. Srivastava
ZigBee Alliance
Publication venue: 'Open Publishing Association'
Publication date: 01/02/2012

This paper studies the problem of computing Nash equilibrium in wireless networks modeled by Weighted Timed Automata. This formalism comes with a logic that can be used to describe complex features such as timed energy constraints. Our contribution is a method for solving this problem using Statistical Model Checking. The method has been implemented in the UPPAAL model checker and has been applied to the analysis of Aloha CSMA/CD and IEEE 802.15.4 CSMA/CA protocols.

Comment: In Proceedings IWIGP 2012, arXiv:1202.422

arXiv.org e-Print Archive

CiteSeerX

Directory of Open Access Journals

VBN
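
The statistical model checking step named in the last abstract can be sketched as Monte Carlo estimation with a Chernoff-Hoeffding sample bound. This shows only the generic estimation idea, not the Nash equilibrium computation and not UPPAAL's interface. The model is a hypothetical two-node slotted-access toy: each node transmits in a slot with probability p, and the property is "some slot among the first k has exactly one transmitter". All names and parameters are assumptions made for illustration.

```python
import math
import random


def sample_size(eps: float, delta: float) -> int:
    """Runs needed so that P(|estimate - p| > eps) <= delta (Hoeffding bound)."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * eps * eps))


def run_once(rng: random.Random, p: float, k: int) -> bool:
    """One simulated run: does a collision-free transmission occur within k slots?"""
    for _ in range(k):
        transmitters = (rng.random() < p) + (rng.random() < p)
        if transmitters == 1:
            return True
    return False


def estimate(p: float, k: int, eps: float, delta: float, seed: int = 0) -> float:
    """Estimate the probability of the property from independent runs."""
    rng = random.Random(seed)
    n = sample_size(eps, delta)
    return sum(run_once(rng, p, k) for _ in range(n)) / n


def exact(p: float, k: int) -> float:
    """Closed form for the toy model, used only to sanity-check the estimate."""
    return 1.0 - (1.0 - 2.0 * p * (1.0 - p)) ** k


if __name__ == "__main__":
    print(sample_size(0.05, 0.01), estimate(0.5, 3, 0.05, 0.01), exact(0.5, 3))
```

The number of runs depends only on the precision eps and the confidence 1 - delta, not on the size of the model, which is what makes the approach attractive for systems too large to analyse exactly.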