Expected Window Mean-Payoff

Authors: Bordais Benjamin; Guha Shibashis
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 39th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2019)
Publication date: 01/01/2019

We study the expected value of the window mean-payoff measure in Markov decision processes (MDPs) and Markov chains (MCs). The window mean-payoff measure strengthens the classical mean-payoff measure by measuring the mean-payoff over a window of bounded length that slides along an infinite path. This measure ensures better stability properties than the classical mean-payoff. Window mean-payoff was previously introduced for two-player zero-sum games. As in the case of games, we study several variants of this definition: the measure can be defined to be prefix-independent or not, and for a fixed window length or for a window length that is left parametric. For fixed window length, we provide polynomial-time algorithms for the prefix-independent version for both MDPs and MCs. When the length is left parametric, the problem of computing the expected value on MDPs is as hard as computing the mean-payoff value in two-player zero-sum games, a problem for which it is not known whether it can be solved in polynomial time. For the prefix-dependent version, surprisingly, the expected window mean-payoff value cannot be computed in polynomial time unless P=PSPACE. For the parametric case and the prefix-dependent case, we manage to obtain algorithms with better complexities for MCs.
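As a rough illustration of the sliding-window idea in this abstract, the sketch below evaluates a fixed-window measure on a finite prefix of integer weights. The abstract does not give the formal definition, so the choices here are assumptions: each start position is scored by the best mean over windows of length 1 to `max_len`, and the prefix is scored by the worst such position. The function names `best_window_mean` and `window_mean_payoff_prefix` are hypothetical, and the paper's results concern expected values over infinite paths in MDPs and MCs, which this sketch does not compute.

```python
from fractions import Fraction


def best_window_mean(weights, start, max_len):
    """Largest mean of weights[start:start+l] over 1 <= l <= max_len.

    Assumes integer weights so that exact Fractions can be used.
    """
    best = None
    total = 0
    for length in range(1, max_len + 1):
        if start + length > len(weights):
            break
        total += weights[start + length - 1]
        mean = Fraction(total, length)
        if best is None or mean > best:
            best = mean
    return best


def window_mean_payoff_prefix(weights, max_len):
    """Worst best-window mean over all start positions with a full horizon.

    This is an assumed finite-prefix analogue of a fixed-window measure,
    not the definition used in the paper.
    """
    last_start = len(weights) - max_len
    if max_len < 1 or last_start < 0:
        raise ValueError("need 1 <= max_len <= len(weights)")
    return min(
        best_window_mean(weights, i, max_len) for i in range(last_start + 1)
    )


if __name__ == "__main__":
    prefix = [-1, 3, -1, 3, -1, 3]
    print(window_mean_payoff_prefix(prefix, 1))
    print(window_mean_payoff_prefix(prefix, 2))
```

On the alternating prefix in the usage example, a window bound of 1 is dominated by the negative weights, while a bound of 2 lets every negative weight be compensated by the next step, which is the kind of local stability guarantee the abstract contrasts with the classical long-run mean.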

Dagstuhl Research Online Publication Server

Fictitious Play with Time-Invariant Frequency Update for Network Security

Authors: Alpcan Tansu; Başar Tamer; Nguyen Kien C.
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2010

We study two-player security games which can be viewed as sequences of nonzero-sum matrix games played by an Attacker and a Defender. The evolution of the game is based on a stochastic fictitious play process, where players do not have access to each other's payoff matrix. Each has to observe the other's actions up to the present and plays the action generated based on the best response to these observations. In a regular fictitious play process, each player makes a maximum likelihood estimate of her opponent's mixed strategy, which results in a time-varying update based on the previous estimate and current action. In this paper, we explore an alternative scheme for frequency update, whose mean dynamic is instead time-invariant. We examine convergence properties of the mean dynamic of the fictitious play process with such an update scheme, and establish local stability of the equilibrium point when both players are restricted to two actions. We also propose an adaptive algorithm based on this time-invariant frequency update.

Comment: Proceedings of the 2010 IEEE Multi-Conference on Systems and Control (MSC10), September 2010, Yokohama, Japan
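The contrast between the two frequency updates in this abstract can be sketched as follows. The first function is the standard running-average (maximum likelihood) estimate, whose step size 1/(k+1) shrinks over time. The second uses a constant step size `eta`; the abstract does not state the paper's update rule, so treating "time-invariant" as a constant step size is an assumption, and both function names are hypothetical. The best-response and stochastic action-selection parts of fictitious play are left out.

```python
def empirical_update(q, action, k):
    """Time-varying update: running average after k earlier observations.

    q is the current estimate of the opponent's mixed strategy and
    action is the index of the action just observed.
    """
    step = 1.0 / (k + 1)
    return [
        (1.0 - step) * qi + step * (1.0 if i == action else 0.0)
        for i, qi in enumerate(q)
    ]


def constant_step_update(q, action, eta):
    """Assumed time-invariant update: the same step size eta at every stage."""
    return [
        (1.0 - eta) * qi + eta * (1.0 if i == action else 0.0)
        for i, qi in enumerate(q)
    ]


if __name__ == "__main__":
    observed = [0, 0, 1, 0]  # hypothetical opponent actions

    q_avg = [0.5, 0.5]
    q_const = [0.5, 0.5]
    for k, a in enumerate(observed):
        q_avg = empirical_update(q_avg, a, k)
        q_const = constant_step_update(q_const, a, eta=0.5)

    print(q_avg)
    print(q_const)
```

Under the running average every past observation carries equal weight, whereas the constant step size discounts older observations geometrically, so the estimate keeps reacting to recent play instead of settling.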

arXiv.org e-Print Archive

CiteSeerX

Crossref

Rules of Thumb for Social Learning

Authors: D. Fudenberg; G. Ellison

Research Papers in Economics

On the complexity of heterogeneous multidimensional quantitative games

Authors: Bruyère Véronique; Hautem Quentin; Raskin Jean-François
Publication date: 01/01/2016

In this paper, we study two-player zero-sum turn-based games played on a finite multidimensional weighted graph. In recent papers all dimensions use the same measure, whereas here we allow different measures to be combined. Such heterogeneous multidimensional quantitative games provide a general and natural model for the study of reactive system synthesis. We focus on classical measures like the Inf, Sup, LimInf, and LimSup of the weights seen along the play, as well as on the window mean-payoff (WMP) measure. This new measure is a natural strengthening of the mean-payoff measure. We allow objectives defined as Boolean combinations of heterogeneous constraints. While multidimensional games with Boolean combinations of mean-payoff constraints are undecidable, we show that the problem becomes EXPTIME-complete for DNF/CNF Boolean combinations of heterogeneous measures taken among {WMP, Inf, Sup, LimInf, LimSup} and that exponential memory strategies are sufficient for both players to win. We provide a detailed study of the complexity and the memory requirements when the Boolean combination of the measures is replaced by an intersection. EXPTIME-completeness and exponential memory strategies still hold for the intersection of measures in {WMP, Inf, Sup, LimInf, LimSup}, and we get PSPACE-completeness when the WMP measure is no longer considered. To avoid EXPTIME- or PSPACE-hardness, we impose at most one occurrence of the WMP measure and fix the number of Sup measures, and we propose several refinements (on the number of occurrences of the other measures) for which we get polynomial algorithms and lower memory requirements. For all the considered classes of games, we also study parameterized complexity.

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

DI-fusion