Probabilistic Plan Synthesis for Coupled Multi-Agent Systems
This paper presents a fully automated procedure for controller synthesis for
multi-agent systems under the presence of uncertainties. We model the motion of
each of the agents in the environment as a Markov Decision Process (MDP)
and we assign to each agent one individual high-level formula given in
Probabilistic Computational Tree Logic (PCTL). Each agent may need to
collaborate with other agents in order to achieve a task. The collaboration is
imposed by sharing actions between the agents. We aim to design local control
policies such that each agent satisfies its individual PCTL formula. The
proposed algorithm builds on clustering the agents, MDP products construction
and controller policies design. We show that our approach has better
computational complexity than the centralized case, which traditionally suffers
from very high computational demands.
Comment: IFAC WC 2017, Toulouse, France
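As a rough illustration of the single-agent building block behind PCTL synthesis on an MDP, the sketch below computes the maximum probability of eventually reaching a target set by value iteration. It is not the paper's clustering or product construction; the dictionary encoding, the function name `max_reach_probability`, and the toy model are assumptions made for this example only.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical encoding: state -> action -> list of (successor, probability).
MDP = Dict[str, Dict[str, List[Tuple[str, float]]]]


def max_reach_probability(
    mdp: MDP, targets: Set[str], tol: float = 1e-10, max_iter: int = 10_000
) -> Dict[str, float]:
    """Value iteration for the maximum probability of eventually reaching `targets`."""
    # Start from 0 everywhere except the targets, so iteration converges
    # from below to the least fixed point (the reachability probability).
    values = {s: (1.0 if s in targets else 0.0) for s in mdp}
    for _ in range(max_iter):
        delta = 0.0
        for state, actions in mdp.items():
            if state in targets or not actions:
                continue  # targets stay at 1, states without actions stay at 0
            best = max(
                sum(prob * values[nxt] for nxt, prob in successors)
                for successors in actions.values()
            )
            delta = max(delta, abs(best - values[state]))
            values[state] = best
        if delta < tol:
            break
    return values


# Toy model (illustrative only): "safe" retries forever, "risky" may hit a trap.
TOY_MDP: MDP = {
    "s0": {
        "safe": [("goal", 0.5), ("s0", 0.5)],
        "risky": [("goal", 0.8), ("trap", 0.2)],
    },
    "goal": {},
    "trap": {},
}

values = max_reach_probability(TOY_MDP, {"goal"})
# A PCTL-style query such as "reach goal with probability at least 0.9"
# then reduces to comparing the computed value with the threshold.
satisfies = values["s0"] >= 0.9
```

In a multi-agent setting the same computation would run on a product of the agents' MDPs, which is where the state-space blow-up mentioned in the abstract comes from.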
Strategy Synthesis for Autonomous Agents Using PRISM
We present probabilistic models for autonomous agent search-and-retrieve missions, derived from Simulink models for an Unmanned Aerial Vehicle (UAV), and show how probabilistic model checking and the probabilistic model checker PRISM can be used for optimal controller generation. We introduce a sequence of scenarios relevant to UAVs and other autonomous agents such as underwater and ground vehicles. For each scenario we demonstrate how it can be modelled using the PRISM language, give model checking statistics and present the synthesised optimal controllers. We conclude with a discussion of the limitations of using probabilistic model checking and PRISM in this context and of the steps that can be taken to overcome them. In addition, we consider how the controllers can be returned to the UAV and adapted for use on larger search areas.
Barrier Functions for Multiagent-POMDPs with DTL Specifications
Multi-agent partially observable Markov decision processes (MPOMDPs) provide a framework to represent heterogeneous autonomous agents subject to uncertainty and partial observation. In this paper, given a nominal policy provided by a human operator or a conventional planning method, we propose a technique based on barrier functions to design a minimally interfering safety-shield ensuring satisfaction of high-level specifications in terms of linear distribution temporal logic (LDTL). To this end, we use necessary and sufficient conditions for the invariance of a given set based on discrete-time barrier functions (DTBFs) and formulate sufficient conditions for finite-time DTBFs to study finite-time convergence to a set. We then show that different LDTL mission/safety specifications can be cast as a set of invariance or finite-time reachability problems. We demonstrate that the proposed method for safety-shield synthesis can be implemented online by a sequence of one-step greedy algorithms. We demonstrate the efficacy of the proposed method using experiments involving a team of robots.
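To make the invariance idea concrete, the sketch below checks one common form of the discrete-time barrier condition, h(x_{k+1}) - h(x_k) >= -alpha * h(x_k) with 0 < alpha <= 1, where the safe set is {x : h(x) >= 0}. This particular form, the scalar toy dynamics, and the names `dtbf_step_ok`, `h`, and `nominal_step` are assumptions for illustration; the paper works with belief states of an MPOMDP rather than a scalar state.

```python
from typing import Callable


def dtbf_step_ok(
    h: Callable[[float], float],
    step: Callable[[float], float],
    x: float,
    alpha: float = 0.5,
) -> bool:
    """Check the assumed one-step barrier condition at state x."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    # The barrier value may shrink, but by at most a fraction alpha per step.
    return h(step(x)) - h(x) >= -alpha * h(x)


def h(x: float) -> float:
    """Barrier function for the hypothetical safe set {x : x <= 1}."""
    return 1.0 - x


def nominal_step(x: float) -> float:
    """Toy closed-loop dynamics that contract towards x = 0.8."""
    return 0.5 * x + 0.4


def aggressive_step(x: float) -> bool:
    """Toy dynamics that jump past the boundary of the safe set."""
    return x + 0.8


# A shield in this spirit would accept the nominal action when the check
# passes and substitute a safer action otherwise.
nominal_is_safe = all(dtbf_step_ok(h, nominal_step, x) for x in (0.0, 0.5, 1.0))
aggressive_is_safe = dtbf_step_ok(h, aggressive_step, 0.5)
```

Because the check only looks one step ahead, it fits the one-step greedy online implementation the abstract describes.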
Q-learning for robust satisfaction of signal temporal logic specifications
This paper addresses the problem of learning optimal policies for satisfying signal temporal logic (STL) specifications by agents with unknown stochastic dynamics. The system is modeled as a Markov decision process, in which the states represent partitions of a continuous space and the transition probabilities are unknown. We formulate two synthesis problems where the desired STL specification is enforced by maximizing the probability of satisfaction, and the expected robustness degree, that is, a measure quantifying the quality of satisfaction. We argue that Q-learning is not directly applicable to these problems because, based on the quantitative semantics of STL, the probability of satisfaction and expected robustness degree are not in the standard objective form of Q-learning. To resolve this issue, we propose an approximation of STL synthesis problems that can be solved via Q-learning, and we derive some performance bounds for the policies obtained by the approximate approach. The performance of the proposed method is demonstrated via simulations.
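The robustness degree mentioned above can be illustrated on a sampled signal. The sketch below evaluates the standard quantitative semantics of two simple STL formulas, "always x > c" and "eventually x > c", over a discrete time window; the function names and the sample signal are hypothetical, and the paper's own approximation scheme is not reproduced here.

```python
from typing import Sequence


def rob_always_gt(signal: Sequence[float], c: float, a: int, b: int) -> float:
    """Robustness of 'always in [a, b]: x > c' (worst-case margin)."""
    return min(x - c for x in signal[a : b + 1])


def rob_eventually_gt(signal: Sequence[float], c: float, a: int, b: int) -> float:
    """Robustness of 'eventually in [a, b]: x > c' (best-case margin)."""
    return max(x - c for x in signal[a : b + 1])


# Illustrative trajectory sampled at four time steps.
trajectory = [0.2, 0.6, 1.1, 0.9]

# Positive robustness means the formula holds, negative means it is violated,
# and the magnitude measures how far the signal is from changing the verdict.
eventually_above_one = rob_eventually_gt(trajectory, 1.0, 0, 3)
always_above_zero = rob_always_gt(trajectory, 0.0, 0, 3)
always_above_one = rob_always_gt(trajectory, 1.0, 0, 3)
```

Both quantities are a max or min over the whole trajectory rather than a sum of per-step rewards, which is why they do not match the standard Q-learning objective directly.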
Statistical analysis of chemical computational systems with MULTIVESTA and ALCHEMIST
The chemical-oriented approach is an emerging paradigm for programming the behaviour of densely distributed and context-aware devices (e.g. in ecosystems of displays tailored to crowd steering, or to obtain profile-based coordinated visualization). Typically, the evolution of such systems cannot be easily predicted, which makes techniques and tools supporting prior-to-deployment analysis of paramount importance. Exact analysis techniques do not scale well as system complexity grows; as a consequence, approximate techniques based on simulation have assumed a relevant role. This work presents a new simulation-based distributed tool for the statistical analysis of such systems, obtained by chaining two existing tools: MultiVeStA and Alchemist. The former is a recently proposed lightweight tool that enriches existing discrete event simulators with distributed statistical analysis capabilities, while the latter is an efficient simulator for chemical-oriented computational systems. The tool is validated against a crowd steering scenario, and insights on its performance are provided by discussing how it scales when the analysis tasks are distributed on a multi-core architecture.
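The simulation-based analysis described above can be sketched as a Monte Carlo estimate with a sample size fixed in advance by the Chernoff-Hoeffding bound, n >= ln(2/delta) / (2 * eps^2). This is a generic illustration, not the MultiVeStA or Alchemist API; `required_samples`, `estimate_probability`, and the toy simulator are hypothetical names.

```python
import math
import random
from typing import Callable, Tuple


def required_samples(eps: float, delta: float) -> int:
    """Runs needed so that the estimate is within eps of the true
    probability with confidence at least 1 - delta (Hoeffding bound)."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * eps**2))


def estimate_probability(
    simulate_once: Callable[[random.Random], bool],
    eps: float,
    delta: float,
    rng: random.Random,
) -> Tuple[float, int]:
    """Estimate the probability that one simulation run satisfies a property."""
    n = required_samples(eps, delta)
    hits = sum(1 for _ in range(n) if simulate_once(rng))
    return hits / n, n


def toy_simulation(rng: random.Random) -> bool:
    """Stand-in for one simulator run; the property holds with probability 0.3."""
    return rng.random() < 0.3


estimate, runs = estimate_probability(toy_simulation, 0.05, 0.05, random.Random(0))
```

Since the runs are independent, they can be split across cores or machines and the hit counts summed, which is the kind of distribution the abstract refers to.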
Technical Report: Distribution Temporal Logic: Combining Correctness with Quality of Estimation
We present a new temporal logic called Distribution Temporal Logic (DTL)
defined over predicates of belief states and hidden states of partially
observable systems. DTL can express properties involving uncertainty and
likelihood that cannot be described by existing logics. A co-safe formulation
of DTL is defined and algorithmic procedures are given for monitoring
executions of a partially observable Markov decision process with respect to
such formulae. A simulation case study of a rescue robotics application
outlines our approach.
Comment: More expanded version of "Distribution Temporal Logic: Combining
Correctness with Quality of Estimation" to appear in IEEE CDC 201
Simulation and statistical model-checking of logic-based multi-agent system models
This thesis presents SALMA (Simulation and Analysis of Logic-Based Multi-
Agent Models), a new approach for simulation and statistical model checking
of multi-agent system models.
Statistical model checking is a relatively new branch of model-based approximate
verification methods that help to overcome the well-known scalability
problems of exact model checking. In contrast to existing solutions,
SALMA specifies the mechanisms of the simulated system by means of logical
axioms based upon the well-established situation calculus. Leveraging
the resulting first-order logic structure of the system model, the simulation
is coupled with a statistical model-checker that uses a first-order variant of
time-bounded linear temporal logic (LTL) for describing properties. This is
combined with a procedural and process-based language for describing agent
behavior. Together, these parts create a very expressive framework for modeling
and verification that allows direct fine-grained reasoning about the agents'
interaction with each other and with their (physical) environment.
SALMA extends the classical situation calculus and linear temporal logic
(LTL) with means to address the specific requirements of multi-agent simulation
models. In particular, cyber-physical domains are considered where
the agents interact with their physical environment. Among other things,
the thesis describes a generic situation calculus axiomatization that encompasses
sensing and information transfer in multi-agent systems, for instance
sensor measurements or inter-agent messages. The proposed model explicitly
accounts for real-time constraints and stochastic effects that are inevitable in
cyber-physical systems.
In order to make SALMA's statistical model checking facilities usable also
for more complex problems, a mechanism for the efficient on-the-fly evaluation
of first-order LTL properties was developed. In particular, the presented algorithm
uses an interval-based representation of the formula evaluation state
together with several other optimization techniques to avoid unnecessary computation.
Altogether, the goal of this thesis was to create an approach for simulation
and statistical model checking of multi-agent systems that builds upon
well-proven logical and statistical foundations, but at the same time takes a
pragmatic software engineering perspective that considers factors like usability,
scalability, and extensibility. In fact, experience gained during several small
to mid-sized experiments that are presented in this thesis suggests that the
SALMA approach seems to be able to live up to these expectations.
This dissertation presents SALMA (Simulation and Analysis of Logic-Based
Multi-Agent Models), an approach developed in the course of this work for the
simulation and statistical model checking of multi-agent systems.
The term "statistical model checking" describes model-based approximate
verification methods that are used in particular to avoid the unavoidable
scalability problems of exact methods. In contrast to previous approaches,
SALMA describes the mechanisms of the simulated system by means of logical
axioms that build on the established situation calculus. The resulting
first-order structure of the system model is exploited to integrate a model
checking module, which in turn uses a first-order variant of linear temporal
logic (LTL). In combination with a procedural and process-oriented language
for describing agent behavior, this yields an expressive and flexible platform
for the modeling and verification of multi-agent systems. It enables a direct
and fine-grained description of the interactions both between agents and
between agents and their (physical) environment.
SALMA extends the classical situation calculus and linear temporal logic (LTL)
with elements and concepts designed for the specific requirements of
simulating and modeling multi-agent systems. In particular, cyber-physical
systems (CPS) are supported, in which agents interact with their physical
environment. Among other things, a generic axiomatization based on the
situation calculus is described for processes in which information is
transferred within multi-agent systems, for example in the form of sensor
measurements or network packets. The unavoidable stochastic effects and
real-time requirements in cyber-physical systems are explicitly taken into
account.
To enable statistical model checking with SALMA for more complex problems as
well, a mechanism for the efficient evaluation of first-order LTL formulas was
developed. In particular, the presented algorithm includes an interval-based
representation of the evaluation state, as well as several other optimization
approaches for avoiding unnecessary computation steps.
Overall, the goal of this dissertation was to create a solution for simulation
and statistical model checking that on the one hand builds on sound logical
and statistical foundations, but on the other hand also satisfies pragmatic
considerations such as usability or extensibility. Indeed, initial results and
experience from several small to medium-sized experiments suggest that SALMA
meets these goals.
Certified Reinforcement Learning with Logic Guidance
This paper proposes the first model-free Reinforcement Learning (RL)
framework to synthesise policies for unknown, and continuous-state Markov
Decision Processes (MDPs), such that a given linear temporal property is
satisfied. We convert the given property into a Limit Deterministic Büchi
Automaton (LDBA), namely a finite-state machine expressing the property.
Exploiting the structure of the LDBA, we shape a synchronous reward function
on-the-fly, so that an RL algorithm can synthesise a policy resulting in traces
that probabilistically satisfy the linear temporal property. This probability
(certificate) is also calculated in parallel with policy learning when the
state space of the MDP is finite: as such, the RL algorithm produces a policy
that is certified with respect to the property. Under the assumption of finite
state space, theoretical guarantees are provided on the convergence of the RL
algorithm to an optimal policy, maximising the above probability. We also show
that our method produces "best available" control policies when the logical
property cannot be satisfied. In the general case of a continuous state space,
we propose a neural network architecture for RL and we empirically show that
the algorithm finds satisfying policies, if there exist such policies. The
performance of the proposed framework is evaluated via a set of numerical
examples and benchmarks, where we observe an improvement of one order of
magnitude in the number of iterations required for the policy synthesis,
compared to existing approaches whenever available.
Comment: This article draws from arXiv:1801.08099, arXiv:1809.0782