A Dynamic Boundary Guarding Problem with Translating Targets
We introduce a problem in which a service vehicle seeks to guard a deadline
(boundary) from dynamically arriving mobile targets. The environment is a
rectangle and the deadline is one of its edges. Targets arrive continuously
over time on the edge opposite the deadline, and move towards the deadline at a
fixed speed. The goal for the vehicle is to maximize the fraction of targets
that are captured before reaching the deadline. We consider two cases: when the
service vehicle is faster than the targets, and when the service vehicle is
slower than the targets. In the first case we develop a novel vehicle policy
based on computing longest paths in a directed acyclic graph. We give a lower
bound on the capture fraction of the policy and show that the policy is optimal
when the distance between the target arrival edge and deadline becomes very
large. We present numerical results which suggest near optimal performance away
from this limiting regime. In the second case, when the targets are faster than
the vehicle, we propose a policy based on servicing fractions of the
translational minimum Hamiltonian path. In the limit of low target speed and
high arrival rate, the capture fraction of this policy is within a small
constant factor of the optimal.
Comment: Extended version of a paper for the joint 48th IEEE Conference on
Decision and Control and 28th Chinese Control Conference.
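The paper's faster-vehicle policy reduces to computing longest paths in a directed acyclic graph. The abstract does not spell out how that graph is built from the targets, so the sketch below shows only the generic subroutine it relies on: longest-path computation over a DAG via a topological order, where one could think of nodes as interceptable targets and an edge weight of 1 as one capture. The function name and the example graph are illustrative, not from the paper.

```python
from collections import defaultdict, deque

def longest_path_length(n, edges):
    """Length of the longest (max-weight) path in a DAG.

    n      -- number of nodes, labelled 0..n-1
    edges  -- iterable of (u, v, w) weighted directed edges
    """
    adj = defaultdict(list)
    indeg = [0] * n
    for u, v, w in edges:
        adj[u].append((v, w))
        indeg[v] += 1

    # Kahn's algorithm: process nodes in topological order.
    order = []
    queue = deque(i for i in range(n) if indeg[i] == 0)
    while queue:
        u = queue.popleft()
        order.append(u)
        for v, _ in adj[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)

    # Relax edges in topological order; dist[v] = best path value into v.
    dist = [0] * n
    for u in order:
        for v, w in adj[u]:
            dist[v] = max(dist[v], dist[u] + w)
    return max(dist)
```

With unit edge weights, the returned value is the length of the longest chain of feasible consecutive captures in such a graph.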
Value Iteration for Long-run Average Reward in Markov Decision Processes
Markov decision processes (MDPs) are standard models for probabilistic
systems with non-deterministic behaviours. Long-run average rewards provide a
mathematically elegant formalism for expressing long term performance. Value
iteration (VI) is one of the simplest and most efficient algorithmic approaches
to MDPs with other properties, such as reachability objectives. Unfortunately,
a naive extension of VI does not work for MDPs with long-run average rewards,
as there is no known stopping criterion. In this work our contributions are
threefold. (1) We refute a conjecture related to stopping criteria for MDPs
with long-run average rewards. (2) We present two practical algorithms for MDPs
with long-run average rewards based on VI. First, we show that a combination of
applying VI locally for each maximal end-component (MEC) and VI for
reachability objectives can provide approximation guarantees. Second, extending
the above approach with a simulation-guided on-demand variant of VI, we present
an anytime algorithm that is able to deal with very large models. (3) Finally,
we present experimental results showing that our methods significantly
outperform the standard approaches on several benchmarks.
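The paper's algorithms combine per-MEC value iteration with VI for reachability objectives. As background for readers, here is a minimal sketch of plain value iteration for maximum reachability probability in an MDP; the naming, data layout, and the sup-norm stopping rule are assumptions for illustration, and the stopping rule shown is exactly the heuristic kind that gives no a-priori error bound for long-run average rewards.

```python
def value_iteration_reach(states, actions, P, target, eps=1e-8):
    """Approximate the max probability of reaching `target` in an MDP.

    states  -- iterable of states
    actions -- actions(s) returns the actions enabled in s
    P       -- P[s][a] is a list of (successor, probability) pairs
    target  -- set of goal states (value fixed to 1)
    Iterates Bellman updates until the largest change drops below eps.
    """
    v = {s: 1.0 if s in target else 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            if s in target:
                continue
            best = max(sum(p * v[s2] for s2, p in P[s][a])
                       for a in actions(s))
            delta = max(delta, abs(best - v[s]))
            v[s] = best
        if delta < eps:
            return v
```

A small convergence threshold on successive iterates, as used here, is the standard stopping criterion for reachability; the abstract's point is that no analogous sound criterion was known for the long-run average case.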
Improving search order for reachability testing in timed automata
Standard algorithms for reachability analysis of timed automata are sensitive
to the order in which the transitions of the automata are taken. To tackle this
problem, we propose a ranking system and a waiting strategy. This paper
discusses the reason why the search order matters and shows how a ranking
system and a waiting strategy can be integrated into the standard reachability
algorithm to alleviate and prevent the problem, respectively. Experiments show
that the combination of the two approaches gives optimal search order on
standard benchmarks except for one example. This suggests that it should be
used instead of the standard BFS algorithm for reachability analysis of timed
automata.
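The key structural change the abstract describes is replacing the FIFO waiting list of plain BFS with an ordering imposed by a ranking system. Abstracting away all timed-automaton specifics (zones, clock constraints), the skeleton below shows how a ranking function slots into a standard passed/waiting reachability loop; `rank` is a hypothetical scoring heuristic, not the paper's actual ranking system.

```python
import heapq

def ranked_reachability(initial, successors, is_goal, rank):
    """Reachability search with a ranked waiting list.

    initial    -- start state (must be hashable)
    successors -- successors(s) yields the states reachable in one step
    is_goal    -- predicate for the target states
    rank       -- scoring function; states with lower rank are explored first
    Returns True iff a goal state is reachable from `initial`.
    """
    passed = set()
    # (rank, tie-breaker, state); the counter keeps the heap stable
    waiting = [(rank(initial), 0, initial)]
    tie = 1
    while waiting:
        _, _, s = heapq.heappop(waiting)
        if s in passed:
            continue
        if is_goal(s):
            return True
        passed.add(s)
        for s2 in successors(s):
            if s2 not in passed:
                heapq.heappush(waiting, (rank(s2), tie, s2))
                tie += 1
    return False
```

With a constant `rank` this degenerates to an unordered search; the paper's contribution lies in choosing a ranking (plus a waiting strategy) so that promising transitions of the automaton are explored first.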