Stochastic Games with Disjunctions of Multiple Objectives (Technical Report)
Stochastic games combine controllable and adversarial non-determinism with
stochastic behavior and are a common tool in control, verification and
synthesis of reactive systems facing uncertainty. Multi-objective stochastic
games are natural in situations where several - possibly conflicting -
performance criteria like time and energy consumption are relevant. Such
conjunctive combinations are the most studied multi-objective setting in the
literature. In this paper, we consider the dual disjunctive problem. More
concretely, we study turn-based stochastic two-player games on graphs where the
winning condition is to guarantee at least one reachability or safety objective
from a given set of alternatives. We present a fine-grained overview of
strategy and computational complexity of such \emph{disjunctive queries} (DQs)
and provide new lower and upper bounds for several variants of the problem,
significantly extending previous works. We also propose a novel value
iteration-style algorithm for approximating the set of Pareto optimal
thresholds for a given DQ.Comment: Technical report including appendix with detailed proofs, 29 page
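The Pareto-set algorithm described above builds on the classic value iteration for a single reachability objective in a turn-based stochastic game. The following is a minimal sketch of that basic iteration, not the paper's algorithm; all state and function names are illustrative.

```python
# Sketch: value iteration for a single reachability objective in a
# turn-based stochastic game. Maximizer/minimizer states pick the
# best/worst successor value; stochastic states average over successors.

def value_iteration(states, player, succ, prob, target, iters=1000):
    """player[s]: 'max', 'min', or 'rand'; succ[s]: successor list for
    player states; prob[s]: successor -> probability for 'rand' states."""
    v = {s: (1.0 if s in target else 0.0) for s in states}
    for _ in range(iters):
        new = {}
        for s in states:
            if s in target:
                new[s] = 1.0
            elif player[s] == 'max':
                new[s] = max(v[t] for t in succ[s])
            elif player[s] == 'min':
                new[s] = min(v[t] for t in succ[s])
            else:  # stochastic state: expected value over successors
                new[s] = sum(p * v[t] for t, p in prob[s].items())
        v = new
    return v

# Tiny game: at 'a' the maximizer chooses between the coin state 'c'
# (reaching target 'g' with probability 1/2) and the sink 'd'.
states = ['a', 'c', 'g', 'd']
player = {'a': 'max', 'c': 'rand', 'd': 'min'}
succ = {'a': ['c', 'd'], 'd': ['d']}
prob = {'c': {'g': 0.5, 'd': 0.5}}
v = value_iteration(states, player, succ, prob, {'g'})
```

On this example the iteration converges to value 1/2 at `'a'`: the maximizer prefers the coin flip over the sink.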
IST Austria Technical Report
We consider Markov decision processes (MDPs) with multiple limit-average (or mean-payoff) objectives.
There have been two different views: (i) the expectation semantics, where the goal is to optimize the expected mean-payoff objective, and (ii) the satisfaction semantics, where the goal is to maximize the probability of runs such that the mean-payoff value stays above a given vector.
We consider the problem where the goal is to optimize the expectation under the constraint that the satisfaction semantics is ensured, and thus consider a generalization that unifies the existing semantics.
Our problem captures the notion of optimization with respect to strategies that are risk-averse (i.e., ensure certain probabilistic guarantees).
Our main results are algorithms for the decision problem which are always polynomial in the size of the MDP. We also show that an approximation of the Pareto-curve can be computed in time polynomial in the size of the MDP, and the approximation factor, but exponential in the number of dimensions.
Finally, we present a complete characterization of the strategy complexity (in terms of memory bounds and randomization) required to solve our problem.
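The tension between the two semantics can be seen on a toy example (this is an illustration of the problem statement, not the paper's algorithm). Assume a 2-strategy MDP where each run has a deterministic long-run mean-payoff: strategy A enters a loop paying 1 every step, while strategy B flips a fair coin once and then loops paying 3 (heads) or 0 (tails).

```python
# Toy illustration of expectation vs. satisfaction semantics.
# Each strategy induces a distribution over long-run mean-payoffs,
# given here as (probability, mean-payoff) pairs.

def expectation(outcomes):
    """Expected long-run mean-payoff."""
    return sum(p * v for p, v in outcomes)

def satisfaction(outcomes, threshold):
    """Probability that the mean-payoff meets the threshold."""
    return sum(p for p, v in outcomes if v >= threshold)

strategy_a = [(1.0, 1.0)]              # surely mean-payoff 1
strategy_b = [(0.5, 3.0), (0.5, 0.0)]  # coin flip, then loop
```

Strategy B maximizes the expectation (1.5 vs. 1.0) but meets the threshold 1 only with probability 1/2, while A meets it surely; the problem studied above asks for the best expectation among strategies that satisfy such a probabilistic constraint.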
IST Austria Technical Report
We consider Markov decision processes (MDPs) which are a standard model for probabilistic systems. We focus on qualitative properties for MDPs that can express that desired behaviors of the system arise almost-surely (with probability 1) or with positive probability.
We introduce a new simulation relation to capture the refinement relation of MDPs with respect to qualitative properties, and present discrete graph theoretic algorithms with quadratic complexity to compute the simulation relation.
We present an automated technique for assume-guarantee style reasoning for compositional analysis of MDPs with qualitative properties by giving a counterexample-guided abstraction-refinement approach to compute our new simulation relation. We have implemented our algorithms and show that the compositional analysis leads to significant improvements.
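The paper's relation refines the classical simulation relation for the qualitative-MDP setting. As background, here is a minimal sketch of the standard greatest-fixpoint computation of simulation on labeled graphs; state names and the representation are illustrative.

```python
# Sketch: largest simulation relation on a labeled graph, computed by
# starting from all label-compatible pairs and repeatedly removing
# pairs (s, t) where t cannot match some move of s.

def simulation(states, label, succ):
    """Returns the largest R such that (s, t) in R iff t simulates s."""
    rel = {(s, t) for s in states for t in states if label[s] == label[t]}
    changed = True
    while changed:
        changed = False
        for (s, t) in list(rel):
            # t must match every move of s with some related move.
            if any(all((s2, t2) not in rel for t2 in succ[t])
                   for s2 in succ[s]):
                rel.discard((s, t))
                changed = True
    return rel

# Example: state 1 simulates state 0 (it has the extra move to 3),
# but not vice versa.
label = {0: 'a', 1: 'a', 2: 'b', 3: 'c'}
succ = {0: [2], 1: [2, 3], 2: [], 3: []}
rel = simulation([0, 1, 2, 3], label, succ)
```

Each refinement round inspects every remaining pair, which is the source of the (worse-than-quadratic, in this naive form) cost; the paper's discrete graph-theoretic algorithms achieve quadratic complexity for their MDP relation.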
Stochastic Games with Lexicographic Reachability-Safety Objectives
We study turn-based stochastic zero-sum games with lexicographic preferences
over reachability and safety objectives. Stochastic games are standard models
in control, verification, and synthesis of stochastic reactive systems that
exhibit both randomness as well as angelic and demonic non-determinism.
Lexicographic order makes it possible to consider multiple objectives with a
strict preference order over the satisfaction of the objectives. To the best of our
knowledge, stochastic games with lexicographic objectives have not been studied
before. We establish determinacy of such games and present strategy and
computational complexity results. For strategy complexity, we show that
lexicographically optimal strategies exist that are deterministic and memory is
only required to remember the already satisfied and violated objectives. For a
constant number of objectives, we show that the relevant decision problem is in
NP ∩ coNP, matching the current known bound for single objectives; and in
general the decision problem is PSPACE-hard and can be solved in NEXPTIME ∩
coNEXPTIME. We present an algorithm that computes the lexicographically
optimal strategies via a reduction to computation of optimal strategies in a
sequence of single-objectives games. We have implemented our algorithm and
report experimental results on various case studies.
Comment: Full version (33 pages) of CAV20 conference paper; including an appendix with technical proofs
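The reduction to a sequence of single-objective games can be sketched in the simpler one-player (MDP) case with two reachability objectives: solve the primary objective, keep only actions attaining its optimal value, then solve the secondary objective on the restricted model. All names here are illustrative, and this omits the game-theoretic and memory aspects handled in the paper.

```python
# Sketch: lexicographic reachability in an MDP via two rounds of
# value iteration with an action restriction in between.

def reach_values(states, acts, step, target, iters=500):
    """acts[s]: available actions; step[s][a]: successor -> probability."""
    v = {s: (1.0 if s in target else 0.0) for s in states}
    for _ in range(iters):
        v = {s: 1.0 if s in target else
                max(sum(p * v[t] for t, p in step[s][a].items())
                    for a in acts[s])
             for s in states}
    return v

def lex_reach(states, acts, step, target1, target2):
    v1 = reach_values(states, acts, step, target1)
    # Keep only actions that are optimal for the primary objective.
    restricted = {s: [a for a in acts[s]
                      if s in target1
                      or abs(sum(p * v1[t] for t, p in step[s][a].items())
                             - v1[s]) < 1e-9]
                  for s in states}
    return v1, reach_values(states, restricted, step, target2)

# From 's', action 'a' surely reaches g1 and 'b' surely reaches g2.
states = ['s', 'g1', 'g2']
acts = {'s': ['a', 'b'], 'g1': ['loop'], 'g2': ['loop']}
step = {'s': {'a': {'g1': 1.0}, 'b': {'g2': 1.0}},
        'g1': {'loop': {'g1': 1.0}}, 'g2': {'loop': {'g2': 1.0}}}
v1, v2 = lex_reach(states, acts, step, {'g1'}, {'g2'})
```

Here the primary objective forces action `'a'`, so the secondary value at `'s'` drops to 0 even though `'b'` alone would achieve it with probability 1, illustrating the strict preference order.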
Multi-Objective Constraint Satisfaction for Mobile Robot Area Defense
In developing multi-robot cooperative systems, there are often competing objectives that need to be met. For example, in automating area defense systems, multiple robots must work together to explore the entire area and maintain consistent communications to alert the other agents and ensure trust in the system. This research presents an algorithm that tasks robots to meet the two specific goals of exploration and communication maintenance in an uncoordinated environment, reducing the need for a user to pre-balance the objectives. This multi-objective problem is defined as a constraint satisfaction problem and solved using the Non-dominated Sorting Genetic Algorithm II (NSGA-II). Both goals of exploration and communication maintenance are described as fitness functions in the algorithm that satisfy their corresponding constraints. The exploration fitness was described in three ways to diversify how exploration was measured, whereas the communication maintenance fitness was calculated as the number of independent clusters of agents. Applying the algorithm to the area defense problem, results show that exploration and communication without coordination are two diametrically opposed goals, in which one may be favored, but only at the expense of the other. This work also presents suggestions for anyone looking to take further steps in developing a physically grounded solution to this area defense problem.
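The core step of NSGA-II is non-dominated sorting of candidate solutions by their fitness vectors. The sketch below extracts the first Pareto front from hypothetical (exploration, communication) scores; the full algorithm additionally uses crowding distance, crossover, and mutation, which are omitted here.

```python
# Sketch: first-front extraction, the non-dominated sorting core of
# NSGA-II, on illustrative (exploration, communication) fitness pairs.

def dominates(a, b):
    """a dominates b if it is >= in every objective and > in at least one."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def pareto_front(points):
    """Points not dominated by any other point."""
    return [p for p in points
            if not any(dominates(q, p) for q in points)]

# Hypothetical task assignments scored as (exploration, communication):
candidates = [(0.9, 0.2), (0.5, 0.5), (0.2, 0.9), (0.4, 0.4)]
front = pareto_front(candidates)
```

Here `(0.4, 0.4)` is dominated by `(0.5, 0.5)` and drops out, while the remaining three points form the trade-off front between the two opposed goals.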
Mixing Probabilistic and non-Probabilistic Objectives in Markov Decision Processes
In this paper, we consider algorithms to decide the existence of strategies
in MDPs for Boolean combinations of objectives. These objectives are
omega-regular properties that need to be enforced either surely, almost surely,
existentially, or with non-zero probability. In this setting, relevant
strategies are randomized infinite memory strategies: both infinite memory and
randomization may be needed to play optimally. We provide algorithms to solve
the general case of Boolean combinations and we also investigate relevant
subcases. We further report on complexity bounds for these problems.
Comment: Paper accepted to LICS 2020 - Full version
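One of the qualitative modes above is purely graph-theoretic: reaching a target with non-zero probability under some strategy holds iff the target is reachable in the MDP's underlying graph, independently of the exact probabilities. A minimal sketch of that check (almost-sure reachability needs a more involved fixpoint and is not shown; names are illustrative):

```python
# Sketch: positive-probability reachability in an MDP reduces to plain
# graph reachability over the underlying transition graph.

def positive_reach(succ, start, targets):
    """True iff some strategy reaches `targets` with probability > 0."""
    seen, stack = {start}, [start]
    while stack:
        s = stack.pop()
        if s in targets:
            return True
        for t in succ[s]:
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return False

# 's' can steer toward 'b' and then 'g'; from 'a' the target is unreachable.
succ = {'s': ['a', 'b'], 'a': ['a'], 'b': ['g'], 'g': ['g']}
```

This is why the existential/non-zero-probability modes admit simple algorithms, while the sure, almost-sure, and quantitative combinations studied in the paper are harder.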