3,715 research outputs found

    Policy Iteration-based Conditional Termination and Ranking Functions

    The final publication is available at link.springer.com. Termination analyzers generally synthesize ranking functions or relations, which represent checkable proofs of their results. In [], we proposed an approach for conditional termination analysis based on abstract fixpoint computation by policy iteration. This method is not based on ranking functions and does not directly provide a ranking relation, which makes comparison with existing approaches difficult. In this paper we study the relationships between our approach and ranking functions and relations, focusing on extensions of linear ranking functions. We show that it can work on programs admitting a specific kind of segmented ranking function, and that the results can be checked by constructing a disjunctive ranking relation. Experimental results show the interest of this approach.

    Proving termination through conditional termination

    We present a constraint-based method for proving conditional termination of integer programs. Building on this, we construct a framework to prove (unconditional) program termination using a powerful mechanism to combine conditional termination proofs. Our key insight is that a conditional termination proof shows termination for a subset of program execution states which do not need to be considered in the remaining analysis. This facilitates more effective termination as well as non-termination analyses, and allows handling loops with different execution phases naturally. Moreover, our method can deal with sequences of loops compositionally. In an empirical evaluation, we show that our implementation VeryMax outperforms state-of-the-art tools on a range of standard benchmarks. Peer reviewed. Postprint (author's final draft).

    Bayesian learning of noisy Markov decision processes

    We consider the inverse reinforcement learning problem, that is, the problem of learning from, and then predicting or mimicking, a controller based on state/action data. We propose a statistical model for such data, derived from the structure of a Markov decision process. Adopting a Bayesian approach to inference, we show how latent variables of the model can be estimated, and how predictions about actions can be made, in a unified framework. A new Markov chain Monte Carlo (MCMC) sampler is devised for simulation from the posterior distribution. This sampler includes a parameter expansion step, which is shown to be essential for good convergence properties of the MCMC sampler. As an illustration, the method is applied to learning a human controller.

    Engineering a Conformant Probabilistic Planner

    We present a partial-order, conformant, probabilistic planner, Probapop, which competed in the blind track of the Probabilistic Planning Competition in IPC-4. We explain how we adapt distance-based heuristics for use with probabilistic domains. Probapop also incorporates heuristics based on probability of success. We explain the successes and difficulties encountered during the design and implementation of Probapop.

    Synthesising interprocedural bit-precise termination proofs

    Proving program termination is key to guaranteeing absence of undesirable behaviour, such as hanging programs and even security vulnerabilities such as denial-of-service attacks. To make termination checks scale to large systems, interprocedural termination analysis seems essential, yet it is a largely unexplored area of research in termination analysis, where most effort has focussed on difficult single-procedure problems. We present a modular termination analysis for C programs using template-based interprocedural summarisation. Our analysis combines a context-sensitive, over-approximating forward analysis with the inference of under-approximating preconditions for termination. Bit-precise termination arguments are synthesised over lexicographic linear ranking function templates. Our experimental results show that our tool 2LS outperforms state-of-the-art alternatives, and demonstrate the clear advantage of interprocedural reasoning over monolithic analysis in terms of efficiency, while retaining comparable precision.

    Learning Provably Stabilizing Neural Controllers for Discrete-Time Stochastic Systems

    We consider the problem of learning control policies in discrete-time stochastic systems which guarantee that the system stabilizes within some specified stabilization region with probability 1. Our approach is based on the novel notion of stabilizing ranking supermartingales (sRSMs) that we introduce in this work. Our sRSMs overcome the limitation of methods proposed in previous works whose applicability is restricted to systems in which the stabilizing region cannot be left once entered under any control policy. We present a learning procedure that learns a control policy together with an sRSM that formally certifies probability 1 stability, both learned as neural networks. We show that this procedure can also be adapted to formally verifying that, under a given Lipschitz continuous control policy, the stochastic system stabilizes within some stabilizing region with probability 1. Our experimental evaluation shows that our learning procedure can successfully learn provably stabilizing policies in practice. Comment: Accepted at ATVA 2023. Follow-up work of arXiv:2112.0949

    Sequential Design for Ranking Response Surfaces

    We propose and analyze sequential design methods for the problem of ranking several response surfaces. Namely, given $L \ge 2$ response surfaces over a continuous input space $\mathcal{X}$, the aim is to efficiently find the index of the minimal response across the entire $\mathcal{X}$. The response surfaces are not known and have to be noisily sampled one-at-a-time. This setting is motivated by stochastic control applications and requires joint experimental design both in space and response-index dimensions. To generate sequential design heuristics we investigate stepwise uncertainty reduction approaches, as well as sampling based on posterior classification complexity. We also make connections between our continuous-input formulation and the discrete framework of pure regret in multi-armed bandits. To model the response surfaces we utilize kriging surrogates. Several numerical examples using both synthetic data and an epidemics control problem are provided to illustrate our approach and the efficacy of respective adaptive designs. Comment: 26 pages, 7 figures (updated several sections and figures)