298 research outputs found

Iterated Regret Minimization in Game Graphs

Author: Filiot Emmanuel
Le Gall Tristan
Raskin Jean-François
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2010

Iterated regret minimization was recently introduced by J.Y. Halpern and R. Pass for classical strategic games. For many games of interest, this new solution concept provides solutions that are judged more reasonable than those offered by traditional game concepts such as Nash equilibrium. Although computing iterated regret on explicit matrix games is conceptually and computationally easy, nothing is known about computing the iterated regret on games whose matrices are defined implicitly using game trees, game DAGs or, more generally, game graphs. In this paper, we investigate iterated regret minimization for infinite-duration two-player quantitative non-zero-sum games played on graphs. We consider reachability objectives that are not necessarily antagonistic. Edges are weighted by integers, one for each player, and the payoffs are defined by the sum of the weights along the paths. Depending on the class of graphs, we give either polynomial or pseudo-polynomial time algorithms to compute a strategy that minimizes the regret for a fixed player. We finally give algorithms to compute the strategies of the two players that minimize the iterated regret for trees, and for graphs with strictly positive weights only.

Comment: 19 pages. Bug in introductory example fixed.
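The abstract above notes that iterated regret is easy to compute on an explicit matrix game. As a rough illustration only (not the paper's graph algorithms), the following Python sketch applies the usual reading of the Halpern-Pass concept to a two-player bimatrix game: in each round both players keep only the strategies whose worst-case regret, measured against the strategies still in play, is minimal. The function names and the example payoffs are assumptions made for this sketch.

```python
from typing import List, Set, Tuple

Matrix = List[List[int]]


def min_regret_rows(payoff: Matrix, rows: Set[int], cols: Set[int]) -> Set[int]:
    """Return the rows in `rows` with minimal worst-case regret against `cols`.

    payoff[r][c] is the row player's payoff when row r meets column c.
    """
    def regret(r: int) -> int:
        # Regret against column c: best payoff achievable among the
        # remaining rows, minus the payoff actually obtained with r.
        return max(
            max(payoff[r2][c] for r2 in rows) - payoff[r][c]
            for c in cols
        )

    best = min(regret(r) for r in rows)
    return {r for r in rows if regret(r) == best}


def iterated_regret_minimization(u1: Matrix, u2: Matrix) -> Tuple[Set[int], Set[int]]:
    """Iterate simultaneous regret-based deletion until nothing changes.

    u1[r][c] and u2[r][c] are the payoffs of the row and column player.
    """
    rows = set(range(len(u1)))
    cols = set(range(len(u1[0])))
    # Transpose u2 so the column player's strategies index the rows.
    u2_t = [list(col) for col in zip(*u2)]
    while True:
        new_rows = min_regret_rows(u1, rows, cols)
        new_cols = min_regret_rows(u2_t, cols, rows)
        if new_rows == rows and new_cols == cols:
            return rows, cols
        rows, cols = new_rows, new_cols


if __name__ == "__main__":
    # Hypothetical 2x2 game, chosen only to exercise the loop.
    u1 = [[3, 0], [2, 2]]
    u2 = [[1, 0], [0, 1]]
    print(iterated_regret_minimization(u1, u2))
```

The loop terminates because the surviving sets can only shrink and never become empty. On games given implicitly by trees or graphs the matrix is exponentially large, which is why the paper develops dedicated algorithms instead.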

arXiv.org e-Print Archive

Crossref

DI-fusion

Non-Zero Sum Games for Reactive Synthesis

Author: A Brandenburger
A Ehrenfeucht
B Aminof
C Baier
C Wu
D Berwanger
D Fisman
E Filiot
EM Clarke
J Filar
J Nash
J-P Queille
JA Filar
JY Halpern
K Chatterjee
L Brim
L Khachiyan
M Faella
M Puterman
M Randour
M Ummels
O Kupferman
TA Henzinger
U Zwick
W Damm
Publication date: 17/12/2015

In this invited contribution, we summarize new solution concepts useful for the synthesis of reactive systems that we have introduced in several recent publications. These solution concepts are developed in the context of non-zero-sum games played on graphs. They are part of the contributions obtained in the inVEST project funded by the European Research Council.

Comment: LATA'16 invited paper.

arXiv.org e-Print Archive

Crossref

INRIA a CCSD electronic archive server

Institutional Repository Universiteit Antwerpen

DI-fusion

HAL-Rennes 1

HAL - UPEC / UPEM

The Impatient May Use Limited Optimism to Minimize Regret

Author: B Aminof
C Reutenauer
CJCH Watkins
E Allender
E Filiot
F Cucker
J Filar
JY Halpern
KR Apt
L de Alfaro
LS Shapley
M Jurdzinski
ML Puterman
P Hunter
R Brenguier
U Zwick
Publication date: 17/11/2018

Discounted-sum games provide a formal model for the study of reinforcement learning, where the agent is enticed to get rewards early since later rewards are discounted. When the agent interacts with the environment, she may regret her actions, realizing that a previous choice was suboptimal given the behavior of the environment. The main contribution of this paper is a PSPACE algorithm for computing the minimum possible regret of a given game. To this end, several results of independent interest are shown. (1) We identify a class of regret-minimizing and admissible strategies that first assume that the environment is collaborating, then assume it is adversarial; the precise timing of the switch is key here. (2) Disregarding the computational cost of numerical analysis, we provide an NP algorithm that checks that the regret entailed by a given time-switching strategy exceeds a given value. (3) We show that determining whether a strategy minimizes regret is decidable in PSPACE.

arXiv.org e-Print Archive

Crossref

Institutional Repository Universiteit Antwerpen

DI-fusion

Reactive Synthesis Without Regret

Author: Hunter Paul
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 26th International Conference on Concurrency Theory (CONCUR 2015)
Publication date: 01/01/2015

Two-player zero-sum games of infinite duration and their quantitative versions are used in verification to model the interaction between a controller (Eve) and its environment (Adam). The question usually addressed is that of the existence (and computability) of a strategy for Eve that can maximize her payoff against any strategy of Adam. In this work, we are interested in strategies of Eve that minimize her regret, i.e. strategies that minimize the difference between her actual payoff and the payoff she could have achieved if she had known the strategy of Adam in advance. We give algorithms to compute the strategies of Eve that ensure minimal regret against an adversary whose choice of strategy is (i) unrestricted, (ii) limited to positional strategies, or (iii) limited to word strategies, and show that the last two cases have natural modelling applications. We also show that our notion of regret minimization in which Adam is limited to word strategies generalizes the notion of good-for-games automata introduced by Henzinger and Piterman, and is related to the notion of determinization by pruning due to Aminof, Kupferman and Lampert.
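The verbal definition of regret in the abstract above can be written as a formula. The notation below is an assumption made for illustration, not taken from the listing: σ and σ' range over Eve's strategies, τ ranges over the class of strategies Adam is permitted to use (unrestricted, positional, or word strategies), and Val(σ, τ) denotes Eve's payoff on the play induced by the pair.

```latex
\[
  \mathrm{reg}(\sigma)
  \;=\;
  \sup_{\tau}
  \Bigl(
    \sup_{\sigma'} \mathrm{Val}(\sigma', \tau) \;-\; \mathrm{Val}(\sigma, \tau)
  \Bigr),
  \qquad
  \text{minimal regret} \;=\; \inf_{\sigma} \mathrm{reg}(\sigma).
\]
```

Restricting the class over which τ ranges can only lower the regret, since the outer supremum is then taken over fewer strategies.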

Dagstuhl Research Online Publication Server

The Complexity of Admissibility in Omega-Regular Games

Author: Berwanger D.
Dawar A.
Hunter P.
Mogavero F.
Osborne M. J.
Ummels M.
Publication date: 01/01/2014

Iterated admissibility is a well-known and important concept in classical game theory, used for example to determine rational behaviors in multi-player matrix games. As recently shown by Berwanger, this concept can be soundly extended to infinite games played on graphs with omega-regular objectives. In this paper, we study the algorithmic properties of this concept for such games. We settle the exact complexity of natural decision problems on the set of strategies that survive iterated elimination of dominated strategies. As a byproduct of our construction, we obtain automata which recognize all the possible outcomes of such strategies.

arXiv.org e-Print Archive

Crossref

DI-fusion

HAL - UPEC / UPEM

Minimizing Regret in Discounted-Sum Games

Author: Hunter Paul
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 25th EACSL Annual Conference on Computer Science Logic (CSL 2016)
Publication date: 01/01/2016

In this paper, we study the problem of minimizing regret in discounted-sum games played on weighted game graphs. We give algorithms for the general problem of computing the minimal regret of the controller (Eve) as well as several variants depending on which strategies the environment (Adam) is permitted to use. We also consider the problem of synthesizing regret-free strategies for Eve in each of these scenarios.
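To make the discounted-sum payoff mentioned above concrete, here is a minimal Python sketch. It assumes the common convention that the i-th edge weight is multiplied by the discount factor raised to the power i, starting at i = 0, and it works on finite prefixes of plays only, whereas the games in the paper are of infinite duration. The function names and the sample weights are hypothetical.

```python
from typing import List, Sequence


def discounted_sum(weights: Sequence[float], discount: float) -> float:
    """Sum of discount**i * weights[i] over a finite prefix of a play."""
    total = 0.0
    factor = 1.0
    for w in weights:
        total += factor * w
        factor *= discount
    return total


def regret_against_fixed_environment(
    actual: Sequence[float],
    alternatives: List[Sequence[float]],
    discount: float,
) -> float:
    """Regret of the play `actual` when the environment's behavior is fixed.

    `alternatives` lists the weight sequences the controller could have
    obtained against the same environment behavior (an assumption of this
    sketch: they are given explicitly rather than derived from a graph).
    """
    achieved = discounted_sum(actual, discount)
    best = max([achieved] + [discounted_sum(a, discount) for a in alternatives])
    return best - achieved


if __name__ == "__main__":
    # With discount 0.5: actual = 0 + 0.5*4 = 2, best alternative = 1 + 0.5*4 = 3.
    print(regret_against_fixed_environment([0, 4], [[2, 0], [1, 4]], 0.5))
```

The regret of a strategy is then the worst case of this quantity over all environment behaviors that Adam is permitted to use.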

Dagstuhl Research Online Publication Server

Deep Counterfactual Regret Minimization in Continuous Action Space

Author: Kattainen Emil
Publication date: 04/05/2022

Algorithms based on counterfactual regret minimization are the state-of-the-art solutions for various problems within imperfect-information games. Deep learning has recently been combined with counterfactual regret minimization to increase the generality of these algorithms. This thesis proposes a new way of increasing their generality even further by increasing the role of neural networks. In addition, a new sampling scheme is introduced to reduce the variance caused by the use of neural networks. The proposed modifications were compared against baseline algorithms. The proposed way of reducing variance improved the performance of counterfactual regret minimization, while the method for increasing generality was found to be lacking, especially when scaling the baseline model. Possible reasons for this are discussed and future research ideas are offered.

Trepo - Institutional Repository of Tampere University