Expected Window Mean-Payoff
We study the expected value of the window mean-payoff measure in Markov decision processes (MDPs) and Markov chains (MCs). The window mean-payoff measure strengthens the classical mean-payoff measure by measuring the mean-payoff over a window of bounded length that slides along an infinite path. This measure ensures better stability properties than the classical mean-payoff. Window mean-payoff has been introduced previously for two-player zero-sum games. As in the case of games, we study several variants of this definition: the measure can be defined to be prefix-independent or not, and for a fixed window length or for a window length that is left parametric. For fixed window length, we provide polynomial-time algorithms for the prefix-independent version for both MDPs and MCs. When the length is left parametric, the problem of computing the expected value on MDPs is as hard as computing the mean-payoff value in two-player zero-sum games, a problem for which it is not known whether it can be solved in polynomial time. For the prefix-dependent version, surprisingly, the expected window mean-payoff value cannot be computed in polynomial time unless P=PSPACE. For the parametric case and the prefix-dependent case, we manage to obtain algorithms with better complexities for MCs.
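The sliding-window idea in this abstract can be illustrated on a finite sequence of weights. The sketch below is not the paper's algorithm: it only checks, for each position of a finite weight sequence, whether some window of length at most `max_len` starting there has mean at least `threshold`. The function names, the finite-prefix setting, and the default threshold of 0 are assumptions made for illustration.

```python
def window_closes(weights, start, max_len, threshold=0.0):
    """Return True if some window of length 1..max_len starting at `start`
    has mean payoff >= threshold (hypothetical helper, finite prefix only)."""
    total = 0.0
    for length in range(1, max_len + 1):
        idx = start + length - 1
        if idx >= len(weights):
            break  # ran off the end of the finite prefix
        total += weights[idx]
        # mean >= threshold  <=>  sum >= threshold * length
        if total >= threshold * length:
            return True
    return False


def good_positions(weights, max_len, threshold=0.0):
    """For every position, report whether a good window starts there."""
    return [window_closes(weights, i, max_len, threshold)
            for i in range(len(weights))]


if __name__ == "__main__":
    sample = [-1, 2, -3, 1, 1, 1]
    print(good_positions(sample, max_len=2))
    print(good_positions(sample, max_len=4))
```

With a short window, a large negative weight can leave a position without any good window; a longer window gives later positive weights more time to compensate, which is the sense in which the bound on the window length matters.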
Fictitious Play with Time-Invariant Frequency Update for Network Security
We study two-player security games which can be viewed as sequences of
nonzero-sum matrix games played by an Attacker and a Defender. The evolution of
the game is based on a stochastic fictitious play process, where players do not
have access to each other's payoff matrix. Each has to observe the other's
actions up to present and plays the action generated based on the best response
to these observations. In a regular fictitious play process, each player makes
a maximum likelihood estimate of her opponent's mixed strategy, which results
in a time-varying update based on the previous estimate and current action. In
this paper, we explore an alternative scheme for frequency update, whose mean
dynamic is instead time-invariant. We examine convergence properties of the
mean dynamic of the fictitious play process with such an update scheme, and
establish local stability of the equilibrium point when both players are
restricted to two actions. We also propose an adaptive algorithm based on this
time-invariant frequency update.
Comment: Proceedings of the 2010 IEEE Multi-Conference on Systems and Control
(MSC10), September 2010, Yokohama, Japan
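The contrast between the two frequency updates described in this abstract can be sketched as follows. The first function is the standard empirical-frequency update, whose step size 1/(k+1) shrinks over time. The second uses a constant step size `eta`, which is one simple way to obtain an update rule that does not depend on time; whether this matches the exact scheme of the paper is an assumption, and all names here are hypothetical.

```python
def one_hot(action, n_actions):
    """Indicator vector of the observed action."""
    return [1.0 if i == action else 0.0 for i in range(n_actions)]


def empirical_update(estimate, action, k):
    """Time-varying update: after k earlier observations, the new
    observation gets weight 1/(k+1), giving the empirical frequency."""
    target = one_hot(action, len(estimate))
    step = 1.0 / (k + 1)
    return [q + step * (t - q) for q, t in zip(estimate, target)]


def constant_step_update(estimate, action, eta):
    """Assumed time-invariant variant: the same step size eta at every round."""
    target = one_hot(action, len(estimate))
    return [q + eta * (t - q) for q, t in zip(estimate, target)]


if __name__ == "__main__":
    observed = [0, 1, 1, 1]  # opponent's actions, made up for illustration
    freq = [0.5, 0.5]
    const = [0.5, 0.5]
    for k, a in enumerate(observed):
        freq = empirical_update(freq, a, k)
        const = constant_step_update(const, a, eta=0.5)
    print(freq, const)
```

The empirical update weights all past observations equally, while the constant-step version discounts older observations geometrically, so it reacts faster to changes in the opponent's behavior.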
On the complexity of heterogeneous multidimensional quantitative games
In this paper, we study two-player zero-sum turn-based games played on a
finite multidimensional weighted graph. In recent papers all dimensions use the
same measure, whereas here we allow different measures to be combined. Such
heterogeneous multidimensional quantitative games provide a general and natural
model for the study of reactive system synthesis. We focus on classical
measures like the Inf, Sup, LimInf, and LimSup of the weights seen along the
play, as well as on the window mean-payoff (WMP) measure. This new measure is a
natural strengthening of the mean-payoff measure. We allow objectives defined
as Boolean combinations of heterogeneous constraints. While multidimensional
games with Boolean combinations of mean-payoff constraints are undecidable, we
show that the problem becomes EXPTIME-complete for DNF/CNF Boolean combinations
of heterogeneous measures taken among {WMP, Inf, Sup, LimInf, LimSup} and that
exponential memory strategies are sufficient for both players to win. We
provide a detailed study of the complexity and the memory requirements when the
Boolean combination of the measures is replaced by an intersection.
EXPTIME-completeness and exponential memory strategies still hold for the
intersection of measures in {WMP, Inf, Sup, LimInf, LimSup}, and we get
PSPACE-completeness when the WMP measure is no longer considered. To avoid
EXPTIME- or PSPACE-hardness, we impose at most one occurrence of the WMP measure and
fix the number of Sup measures, and we propose several refinements (on the
number of occurrences of the other measures) for which we get polynomial
algorithms and lower memory requirements. For all the considered classes of
games, we also study parameterized complexity.