21,923 research outputs found

    Probabilistic Plan Synthesis for Coupled Multi-Agent Systems

    Full text link
    This paper presents a fully automated procedure for controller synthesis for multi-agent systems under the presence of uncertainties. We model the motion of each of the NN agents in the environment as a Markov Decision Process (MDP) and we assign to each agent one individual high-level formula given in Probabilistic Computational Tree Logic (PCTL). Each agent may need to collaborate with other agents in order to achieve a task. The collaboration is imposed by sharing actions between the agents. We aim to design local control policies such that each agent satisfies its individual PCTL formula. The proposed algorithm builds on clustering the agents, MDP products construction and controller policies design. We show that our approach has better computational complexity than the centralized case, which traditionally suffers from very high computational demands.Comment: IFAC WC 2017, Toulouse, Franc

    Strategy Synthesis for Autonomous Agents Using PRISM

    Get PDF
    We present probabilistic models for autonomous agent search and retrieve missions derived from Simulink models for an Unmanned Aerial Vehicle (UAV) and show how probabilistic model checking and the probabilistic model checker PRISM can be used for optimal controller generation. We introduce a sequence of scenarios relevant to UAVs and other autonomous agents such as underwater and ground vehicles. For each scenario we demonstrate how it can be modelled using the PRISM language, give model checking statistics and present the synthesised optimal controllers. We conclude with a discussion of the limitations when using probabilistic model checking and PRISM in this context and what steps can be taken to overcome them. In addition, we consider how the controllers can be returned to the UAV and adapted for use on larger search areas

    Barrier Functions for Multiagent-POMDPs with DTL Specifications

    Get PDF
    Multi-agent partially observable Markov decision processes (MPOMDPs) provide a framework to represent heterogeneous autonomous agents subject to uncertainty and partial observation. In this paper, given a nominal policy provided by a human operator or a conventional planning method, we propose a technique based on barrier functions to design a minimally interfering safety-shield ensuring satisfaction of high-level specifications in terms of linear distribution temporal logic (LDTL). To this end, we use sufficient and necessary conditions for the invariance of a given set based on discrete-time barrier functions (DTBFs) and formulate sufficient conditions for finite time DTBF to study finite time convergence to a set. We then show that different LDTL mission/safety specifications can be cast as a set of invariance or finite time reachability problems. We demonstrate that the proposed method for safety-shield synthesis can be implemented online by a sequence of one-step greedy algorithms. We demonstrate the efficacy of the proposed method using experiments involving a team of robots

    Q-learning for robust satisfaction of signal temporal logic specifications

    Full text link
    This paper addresses the problem of learning optimal policies for satisfying signal temporal logic (STL) specifications by agents with unknown stochastic dynamics. The system is modeled as a Markov decision process, in which the states represent partitions of a continuous space and the transition probabilities are unknown. We formulate two synthesis problems where the desired STL specification is enforced by maximizing the probability of satisfaction, and the expected robustness degree, that is, a measure quantifying the quality of satisfaction. We discuss that Q-learning is not directly applicable to these problems because, based on the quantitative semantics of STL, the probability of satisfaction and expected robustness degree are not in the standard objective form of Q-learning. To resolve this issue, we propose an approximation of STL synthesis problems that can be solved via Q-learning, and we derive some performance bounds for the policies obtained by the approximate approach. The performance of the proposed method is demonstrated via simulations

    Statistical analysis of chemical computational systems with MULTIVESTA and ALCHEMIST

    Get PDF
    The chemical-oriented approach is an emerging paradigm for programming the behaviour of densely distributed and context-aware devices (e.g. in ecosystems of displays tailored to crowd steering, or to obtain profile-based coordinated visualization). Typically, the evolution of such systems cannot be easily predicted, thus making of paramount importance the availability of techniques and tools supporting prior-to-deployment analysis. Exact analysis techniques do not scale well when the complexity of systems grows: as a consequence, approximated techniques based on simulation assumed a relevant role. This work presents a new simulation-based distributed tool addressing the statistical analysis of such a kind of systems, which has been obtained by chaining two existing tools: MultiVeStA and Alchemist. The former is a recently proposed lightweight tool which allows to enrich existing discrete event simulators with distributed statistical analysis capabilities, while the latter is an efficient simulator for chemical-oriented computational systems. The tool is validated against a crowd steering scenario, and insights on the performance are provided by discussing how these scale distributing the analysis tasks on a multi-core architecture

    Technical Report: Distribution Temporal Logic: Combining Correctness with Quality of Estimation

    Full text link
    We present a new temporal logic called Distribution Temporal Logic (DTL) defined over predicates of belief states and hidden states of partially observable systems. DTL can express properties involving uncertainty and likelihood that cannot be described by existing logics. A co-safe formulation of DTL is defined and algorithmic procedures are given for monitoring executions of a partially observable Markov decision process with respect to such formulae. A simulation case study of a rescue robotics application outlines our approach.Comment: More expanded version of "Distribution Temporal Logic: Combining Correctness with Quality of Estimation" to appear in IEEE CDC 201

    Simulation and statistical model-checking of logic-based multi-agent system models

    Get PDF
    This thesis presents SALMA (Simulation and Analysis of Logic-Based Multi- Agent Models), a new approach for simulation and statistical model checking of multi-agent system models. Statistical model checking is a relatively new branch of model-based approximative verification methods that help to overcome the well-known scalability problems of exact model checking. In contrast to existing solutions, SALMA specifies the mechanisms of the simulated system by means of logical axioms based upon the well-established situation calculus. Leveraging the resulting first-order logic structure of the system model, the simulation is coupled with a statistical model-checker that uses a first-order variant of time-bounded linear temporal logic (LTL) for describing properties. This is combined with a procedural and process-based language for describing agent behavior. Together, these parts create a very expressive framework for modeling and verification that allows direct fine-grained reasoning about the agents’ interaction with each other and with their (physical) environment. SALMA extends the classical situation calculus and linear temporal logic (LTL) with means to address the specific requirements of multi-agent simulation models. In particular, cyber-physical domains are considered where the agents interact with their physical environment. Among other things, the thesis describes a generic situation calculus axiomatization that encompasses sensing and information transfer in multi agent systems, for instance sensor measurements or inter-agent messages. The proposed model explicitly accounts for real-time constraints and stochastic effects that are inevitable in cyber-physical systems. In order to make SALMA’s statistical model checking facilities usable also for more complex problems, a mechanism for the efficient on-the-fly evaluation of first-order LTL properties was developed. In particular, the presented algorithm uses an interval-based representation of the formula evaluation state together with several other optimization techniques to avoid unnecessary computation. Altogether, the goal of this thesis was to create an approach for simulation and statistical model checking of multi-agent systems that builds upon well-proven logical and statistical foundations, but at the same time takes a pragmatic software engineering perspective that considers factors like usability, scalability, and extensibility. In fact, experience gained during several small to mid-sized experiments that are presented in this thesis suggest that the SALMA approach seems to be able to live up to these expectations.In dieser Dissertation wird SALMA (Simulation and Analysis of Logic-Based Multi-Agent Models) vorgestellt, ein im Rahmen dieser Arbeit entwickelter Ansatz für die Simulation und die statistische Modellprüfung (Model Checking) von Multiagentensystemen. Der Begriff „Statistisches Model Checking” beschreibt modellbasierte approximative Verifikationsmethoden, die insbesondere dazu eingesetzt werden können, um den unvermeidlichen Skalierbarkeitsproblemen von exakten Methoden zu entgehen. Im Gegensatz zu bisherigen AnsĂ€tzen werden in SALMA die Mechanismen des simulierten Systems mithilfe logischer Axiome beschrieben, die auf dem etablierten Situationskalkül aufbauen. Die dadurch entstehende prĂ€dikatenlogische Struktur des Systemmodells wird ausgenutzt um ein Model Checking Modul zu integrieren, das seinerseits eine prĂ€dikatenlogische Variante der linearen temporalen Logik (LTL) verwendet. In Kombination mit einer prozeduralen und prozessorientierten Sprache für die Beschreibung von Agentenverhalten entsteht eine ausdrucksstarke und flexible Plattform für die Modellierung und Verifikation von Multiagentensystemen. Sie ermöglicht eine direkte und feingranulare Beschreibung der Interaktionen sowohl zwischen Agenten als auch von Agenten mit ihrer (physischen) Umgebung. SALMA erweitert den klassischen Situationskalkül und die lineare temporale Logik (LTL) um Elemente und Konzepte, die auf die spezifischen Anforderungen bei der Simulation und Modellierung von Multiagentensystemen ausgelegt sind. Insbesondere werden cyber-physische Systeme (CPS) unterstützt, in denen Agenten mit ihrer physischen Umgebung interagieren. Unter anderem wird eine generische, auf dem Situationskalkül basierende, Axiomatisierung von Prozessen beschrieben, in denen Informationen innerhalb von Multiagentensystemen transferiert werden – beispielsweise in Form von Sensor- Messwerten oder Netzwerkpaketen. Dabei werden ausdrücklich die unvermeidbaren stochastischen Effekte und Echtzeitanforderungen in cyber-physischen Systemen berücksichtigt. Um statistisches Model Checking mit SALMA auch für komplexere Problemstellungen zu ermöglichen, wurde ein Mechanismus für die effiziente Auswertung von prĂ€dikatenlogischen LTL-Formeln entwickelt. Insbesondere beinhaltet der vorgestellte Algorithmus eine Intervall-basierte ReprĂ€sentation des Auswertungszustands, sowie einige andere OptimierungsansĂ€tze zur Vermeidung von unnötigen Berechnungsschritten. Insgesamt war es das Ziel dieser Dissertation, eine Lösung für Simulation und statistisches Model Checking zu schaffen, die einerseits auf fundierten logischen und statistischen Grundlagen aufbaut, auf der anderen Seite jedoch auch pragmatischen Gesichtspunkten wie Benutzbarkeit oder Erweiterbarkeit genügt. TatsĂ€chlich legen erste Ergebnisse und Erfahrungen aus mehreren kleinen bis mittelgroßen Experimenten nahe, dass SALMA diesen Zielen gerecht wird

    Certified Reinforcement Learning with Logic Guidance

    Full text link
    This paper proposes the first model-free Reinforcement Learning (RL) framework to synthesise policies for unknown, and continuous-state Markov Decision Processes (MDPs), such that a given linear temporal property is satisfied. We convert the given property into a Limit Deterministic Buchi Automaton (LDBA), namely a finite-state machine expressing the property. Exploiting the structure of the LDBA, we shape a synchronous reward function on-the-fly, so that an RL algorithm can synthesise a policy resulting in traces that probabilistically satisfy the linear temporal property. This probability (certificate) is also calculated in parallel with policy learning when the state space of the MDP is finite: as such, the RL algorithm produces a policy that is certified with respect to the property. Under the assumption of finite state space, theoretical guarantees are provided on the convergence of the RL algorithm to an optimal policy, maximising the above probability. We also show that our method produces ''best available'' control policies when the logical property cannot be satisfied. In the general case of a continuous state space, we propose a neural network architecture for RL and we empirically show that the algorithm finds satisfying policies, if there exist such policies. The performance of the proposed framework is evaluated via a set of numerical examples and benchmarks, where we observe an improvement of one order of magnitude in the number of iterations required for the policy synthesis, compared to existing approaches whenever available.Comment: This article draws from arXiv:1801.08099, arXiv:1809.0782
    • 

    corecore