21,455 research outputs found
Omega-Regular Reward Machines
Reinforcement learning (RL) is a powerful approach for training agents to
perform tasks, but designing an appropriate reward mechanism is critical to its
success. However, in many cases, the complexity of the learning objectives goes
beyond the capabilities of the Markovian assumption, necessitating a more
sophisticated reward mechanism. Reward machines and omega-regular languages are
two formalisms used to express non-Markovian rewards for quantitative and
qualitative objectives, respectively. This paper introduces omega-regular
reward machines, which integrate reward machines with omega-regular languages
to enable an expressive and effective reward mechanism for RL. We present a
model-free RL algorithm to compute epsilon-optimal strategies against
omega-egular reward machines and evaluate the effectiveness of the proposed
algorithm through experiments.Comment: To appear in ECAI-202
Learning to Prove Safety over Parameterised Concurrent Systems (Full Version)
We revisit the classic problem of proving safety over parameterised
concurrent systems, i.e., an infinite family of finite-state concurrent systems
that are represented by some finite (symbolic) means. An example of such an
infinite family is a dining philosopher protocol with any number n of processes
(n being the parameter that defines the infinite family). Regular model
checking is a well-known generic framework for modelling parameterised
concurrent systems, where an infinite set of configurations (resp. transitions)
is represented by a regular set (resp. regular transducer). Although verifying
safety properties in the regular model checking framework is undecidable in
general, many sophisticated semi-algorithms have been developed in the past
fifteen years that can successfully prove safety in many practical instances.
In this paper, we propose a simple solution to synthesise regular inductive
invariants that makes use of Angluin's classic L* algorithm (and its variants).
We provide a termination guarantee when the set of configurations reachable
from a given set of initial configurations is regular. We have tested L*
algorithm on standard (as well as new) examples in regular model checking
including the dining philosopher protocol, the dining cryptographer protocol,
and several mutual exclusion protocols (e.g. Bakery, Burns, Szymanski, and
German). Our experiments show that, despite the simplicity of our solution, it
can perform at least as well as existing semi-algorithms.Comment: Full version of FMCAD'17 pape
- …