1,355 research outputs found

    Certified Reinforcement Learning with Logic Guidance

    Full text link
    This paper proposes the first model-free Reinforcement Learning (RL) framework to synthesise policies for unknown, and continuous-state Markov Decision Processes (MDPs), such that a given linear temporal property is satisfied. We convert the given property into a Limit Deterministic Buchi Automaton (LDBA), namely a finite-state machine expressing the property. Exploiting the structure of the LDBA, we shape a synchronous reward function on-the-fly, so that an RL algorithm can synthesise a policy resulting in traces that probabilistically satisfy the linear temporal property. This probability (certificate) is also calculated in parallel with policy learning when the state space of the MDP is finite: as such, the RL algorithm produces a policy that is certified with respect to the property. Under the assumption of finite state space, theoretical guarantees are provided on the convergence of the RL algorithm to an optimal policy, maximising the above probability. We also show that our method produces ''best available'' control policies when the logical property cannot be satisfied. In the general case of a continuous state space, we propose a neural network architecture for RL and we empirically show that the algorithm finds satisfying policies, if there exist such policies. The performance of the proposed framework is evaluated via a set of numerical examples and benchmarks, where we observe an improvement of one order of magnitude in the number of iterations required for the policy synthesis, compared to existing approaches whenever available.Comment: This article draws from arXiv:1801.08099, arXiv:1809.0782

    Estimation and control of non-linear and hybrid systems with applications to air-to-air guidance

    Get PDF
    Issued as Progress report, and Final report, Project no. E-21-67

    Deep Learning for Abstraction, Control and Monitoring of Complex Cyber-Physical Systems

    Get PDF
    Cyber-Physical Systems (CPS) consist of digital devices that interact with some physical components. Their popularity and complexity are growing exponentially, giving birth to new, previously unexplored, safety-critical application domains. As CPS permeate our daily lives, it becomes imperative to reason about their reliability. Formal methods provide rigorous techniques for verification, control and synthesis of safe and reliable CPS. However, these methods do not scale with the complexity of the system, thus their applicability to real-world problems is limited. A promising strategy is to leverage deep learning techniques to tackle the scalability issue of formal methods, transforming unfeasible problems into approximately solvable ones. The approximate models are trained over observations which are solutions of the formal problem. In this thesis, we focus on the following tasks, which are computationally challenging: the modeling and the simulation of a complex stochastic model, the design of a safe and robust control policy for a system acting in a highly uncertain environment and the runtime verification problem under full or partial observability. Our approaches, based on deep learning, are indeed applicable to real-world complex and safety-critical systems acting under strict real-time constraints and in presence of a significant amount of uncertainty.Cyber-Physical Systems (CPS) consist of digital devices that interact with some physical components. Their popularity and complexity are growing exponentially, giving birth to new, previously unexplored, safety-critical application domains. As CPS permeate our daily lives, it becomes imperative to reason about their reliability. Formal methods provide rigorous techniques for verification, control and synthesis of safe and reliable CPS. However, these methods do not scale with the complexity of the system, thus their applicability to real-world problems is limited. A promising strategy is to leverage deep learning techniques to tackle the scalability issue of formal methods, transforming unfeasible problems into approximately solvable ones. The approximate models are trained over observations which are solutions of the formal problem. In this thesis, we focus on the following tasks, which are computationally challenging: the modeling and the simulation of a complex stochastic model, the design of a safe and robust control policy for a system acting in a highly uncertain environment and the runtime verification problem under full or partial observability. Our approaches, based on deep learning, are indeed applicable to real-world complex and safety-critical systems acting under strict real-time constraints and in presence of a significant amount of uncertainty

    Reinforcing Reachable Routes

    Get PDF
    This paper studies the evaluation of routing algorithms from the perspective of reachability routing, where the goal is to determine all paths between a sender and a receiver. Reachability routing is becoming relevant with the changing dynamics of the Internet and the emergence of low-bandwidth wireless/ad-hoc networks. We make the case for reinforcement learning as the framework of choice to realize reachability routing, within the confines of the current Internet infrastructure. The setting of the reinforcement learning problem offers several advantages,including loop resolution, multi-path forwarding capability, cost-sensitive routing, and minimizing state overhead, while maintaining the incremental spirit of current backbone routing algorithms. We identify research issues in reinforcement learning applied to the reachability routing problem to achieve a fluid and robust backbone routing framework. This paper also presents the design, implementation and evaluation of a new reachability routing algorithm that uses a model-based approach to achieve cost-sensitive multi-path forwarding; performance assessment of the algorithm in various troublesome topologies shows consistently superior performance over classical reinforcement learning algorithms. The paper is targeted toward practitioners seeking to implement a reachability routing algorithm

    Time and Cost Optimization of Cyber-Physical Systems by Distributed Reachability Analysis

    Get PDF

    Simulation and numerical solution of stochastic Petri nets with discrete and continuous timing

    Get PDF
    We introduce a novel stochastic Petri net formalism where discrete and continuous phase-type firing delays can appear in the same model. By capturing deterministic and generally random behavior in discrete or continuous time, as appropriate, the formalism affords higher modeling fidelity and efficiencies to use in practice. We formally specify the underlying stochastic process as a general state space Markov chain and show that it is regenerative, thus amenable to renewal theory techniques to obtain steady-state solutions. We present two steady-state analysis methods depending on the class of problem: one using exact numerical techniques, the other using simulation. Although regenerative structures that ease steady-state analysis exist in general, a noteworthy problem class arises when discrete-time transitions are synchronized. In this case, the underlying process is semi-regenerative and we can employ Markov renewal theory to formulate exact and efficient numerical solutions for the stationary distribution. We propose a solution method that shows promise in terms of time and space efficiency. Also noteworthy are the computational tradeoffs when analyzing the embedded versus the subordinate Markov chains that are hidden within the original process. In the absence of simplifying assumptions, we propose an efficient regenerative simulation method that identifies hidden regenerative structures within continuous state spaces. The new formalism and solution methods are demonstrated with two applications

    On the analysis of stochastic timed systems

    Get PDF
    The formal methods approach to develop reliable and efficient safety- or performance-critical systems is to construct mathematically precise models of such systems on which properties of interest, such as safety guarantees or performance requirements, can be verified automatically. In this thesis, we present techniques that extend the reach of exhaustive and statistical model checking to verify reachability and reward-based properties of compositional behavioural models that support quantitative aspects such as real time and randomised decisions. We present two techniques that allow sound statistical model checking for the nondeterministic-randomised model of Markov decision processes. We investigate the relationship between two different definitions of the model of probabilistic timed automata, as well as potential ways to apply statistical model checking. Stochastic timed automata allow nondeterministic choices as well as nondeterministic and stochastic delays, and we present the first exhaustive model checking algorithm that allows their analysis. All the approaches introduced in this thesis are implemented as part of the Modest Toolset, which supports the construction and verification of models specified in the formal modelling language Modest. We conclude by applying this language and toolset to study novel distributed control strategies for photovoltaic microgenerators
    • …
    corecore