3,715 research outputs found
Policy Iteration-based Conditional Termination and Ranking Functions
The final publication is available at link.springer.com.International audienceTermination analyzers generally synthesize ranking functions or relations, which represent checkable proofs of their results. In [], we proposed an approach for conditional termination analysis based on abstract fixpoint computation by policy iteration. This method is not based on ranking functions and does not directly provide a ranking relation, which makes the comparison with existing approaches difficult. In this paper we study the relationships between our approach and ranking functions and relations, focusing on extensions of linear ranking functions. We show that it can work on programs admitting a specific kind of segmented ranking functions, and that the results can be checked by the construction of a disjunctive ranking relation. Experimental results show the interest of this approach
Proving termination through conditional termination
We present a constraint-based method for proving conditional termination of integer programs. Building on this, we construct a framework to prove (unconditional) program termination using a powerful mechanism to combine conditional termination proofs. Our key insight is that a conditional termination proof shows termination for a subset of program execution states which do not need to be considered in the remaining analysis. This facilitates more effective termination as well as non-termination analyses, and allows handling loops with different execution phases naturally. Moreover, our method can deal with sequences of loops compositionally. In an empirical evaluation, we show that our implementation VeryMax outperforms state-of-the-art tools on a range of standard benchmarks.Peer ReviewedPostprint (author's final draft
Bayesian learning of noisy Markov decision processes
We consider the inverse reinforcement learning problem, that is, the problem
of learning from, and then predicting or mimicking a controller based on
state/action data. We propose a statistical model for such data, derived from
the structure of a Markov decision process. Adopting a Bayesian approach to
inference, we show how latent variables of the model can be estimated, and how
predictions about actions can be made, in a unified framework. A new Markov
chain Monte Carlo (MCMC) sampler is devised for simulation from the posterior
distribution. This step includes a parameter expansion step, which is shown to
be essential for good convergence properties of the MCMC sampler. As an
illustration, the method is applied to learning a human controller
Engineering a Conformant Probabilistic Planner
We present a partial-order, conformant, probabilistic planner, Probapop which
competed in the blind track of the Probabilistic Planning Competition in IPC-4.
We explain how we adapt distance based heuristics for use with probabilistic
domains. Probapop also incorporates heuristics based on probability of success.
We explain the successes and difficulties encountered during the design and
implementation of Probapop
Synthesising interprocedural bit-precise termination proofs
Proving program termination is key to guaranteeing absence of undesirable behaviour, such as hanging programs and even security vulnerabilities such as denial-of-service attacks. To make termination checks scale to large systems, interprocedural termination analysis seems essential, which is a largely unexplored area of research in termination analysis, where most effort has focussed on difficult single-procedure problems. We present a modular termination analysis for C programs using template-based interprocedural summarisation. Our analysis combines a context-sensitive, over-approximating forward analysis with the inference of under-approximating preconditions for termination. Bit-precise termination arguments are synthesised over lexicographic linear ranking function templates. Our experimental results show that our tool 2LS outperforms state-of-the-art alternatives, and demonstrate the clear advantage of interprocedural reasoning over monolithic analysis in terms of efficiency, while retaining comparable precision
Learning Provably Stabilizing Neural Controllers for Discrete-Time Stochastic Systems
We consider the problem of learning control policies in discrete-time
stochastic systems which guarantee that the system stabilizes within some
specified stabilization region with probability~. Our approach is based on
the novel notion of stabilizing ranking supermartingales (sRSMs) that we
introduce in this work. Our sRSMs overcome the limitation of methods proposed
in previous works whose applicability is restricted to systems in which the
stabilizing region cannot be left once entered under any control policy. We
present a learning procedure that learns a control policy together with an sRSM
that formally certifies probability~ stability, both learned as neural
networks. We show that this procedure can also be adapted to formally verifying
that, under a given Lipschitz continuous control policy, the stochastic system
stabilizes within some stabilizing region with probability~. Our
experimental evaluation shows that our learning procedure can successfully
learn provably stabilizing policies in practice.Comment: Accepted at ATVA 2023. Follow-up work of arXiv:2112.0949
Sequential Design for Ranking Response Surfaces
We propose and analyze sequential design methods for the problem of ranking
several response surfaces. Namely, given response surfaces over a
continuous input space , the aim is to efficiently find the index of
the minimal response across the entire . The response surfaces are not
known and have to be noisily sampled one-at-a-time. This setting is motivated
by stochastic control applications and requires joint experimental design both
in space and response-index dimensions. To generate sequential design
heuristics we investigate stepwise uncertainty reduction approaches, as well as
sampling based on posterior classification complexity. We also make connections
between our continuous-input formulation and the discrete framework of pure
regret in multi-armed bandits. To model the response surfaces we utilize
kriging surrogates. Several numerical examples using both synthetic data and an
epidemics control problem are provided to illustrate our approach and the
efficacy of respective adaptive designs.Comment: 26 pages, 7 figures (updated several sections and figures
- …