Iterated Regret Minimization in Game Graphs
Iterated regret minimization was recently introduced by J.Y. Halpern and
R. Pass for classical strategic games. For many games of interest, this new
solution concept provides solutions that are judged more reasonable than
those offered by traditional solution concepts such as Nash equilibrium.
Although computing iterated regret on explicit matrix games is conceptually and
computationally easy, nothing is known about computing iterated regret on
games whose matrices are defined implicitly using game trees, game DAGs or, more
generally, game graphs. In this paper, we investigate iterated regret
minimization for infinite duration two-player quantitative non-zero sum games
played on graphs.
We consider reachability objectives that are not necessarily antagonistic.
Edges are weighted by integers, one for each player, and the payoffs are
defined by the sum of the weights along the paths. Depending on the class of
graphs, we give either polynomial or pseudo-polynomial time algorithms to
compute a strategy that minimizes the regret for a fixed player. We finally
give algorithms to compute the strategies of the two players that minimize the
iterated regret for trees, and for graphs with strictly positive weights only.
Comment: 19 pages. Bug in introductory example fixed
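The explicit matrix case that the abstract above calls computationally easy can be illustrated with a minimal sketch. It covers a single round of regret minimization for one player only, not the iterated procedure or the graph algorithms of the paper. It assumes the row player maximizes payoff; the function names and the example matrix are hypothetical.

```python
def regrets(payoff):
    """Regret of each row action in an explicit matrix game.

    payoff[i][j] is the (assumed, to-be-maximized) payoff of the row
    player when she plays row i and the opponent plays column j.
    """
    n_cols = len(payoff[0])
    # Best payoff the row player could have obtained against each column.
    best = [max(row[j] for row in payoff) for j in range(n_cols)]
    # Regret of a row = worst-case gap to that best response.
    return [max(best[j] - row[j] for j in range(n_cols)) for row in payoff]


def regret_minimizing_actions(payoff):
    """Indices of the rows whose regret is minimal."""
    r = regrets(payoff)
    smallest = min(r)
    return [i for i, value in enumerate(r) if value == smallest]


# Hypothetical 3x2 payoff matrix for the row player.
example = [
    [2, 0],
    [3, 1],
    [0, 4],
]

print(regrets(example))                    # [4, 3, 3]
print(regret_minimizing_actions(example))  # [1, 2]
```

Iterating would repeat this computation for both players on the game restricted to the surviving actions until nothing more is removed.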
Non-Zero Sum Games for Reactive Synthesis
In this invited contribution, we summarize new solution concepts useful for
the synthesis of reactive systems that we have introduced in several recent
publications. These solution concepts are developed in the context of non-zero
sum games played on graphs. They are part of the contributions obtained in the
inVEST project funded by the European Research Council.
Comment: LATA'16 invited paper
The Impatient May Use Limited Optimism to Minimize Regret
Discounted-sum games provide a formal model for the study of reinforcement
learning, where the agent is enticed to get rewards early since later rewards
are discounted. When the agent interacts with the environment, she may regret
her actions, realizing that a previous choice was suboptimal given the behavior
of the environment. The main contribution of this paper is a PSPACE algorithm
for computing the minimum possible regret of a given game. To this end, several
results of independent interest are shown. (1) We identify a class of
regret-minimizing and admissible strategies that first assume that the
environment is collaborating, then assume it is adversarial---the precise
timing of the switch is key here. (2) Disregarding the computational cost of
numerical analysis, we provide an NP algorithm that checks that the regret
entailed by a given time-switching strategy exceeds a given value. (3) We show
that determining whether a strategy minimizes regret is decidable in PSPACE.
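To illustrate why discounting entices the agent to collect rewards early, here is a minimal sketch of the discounted sum of a finite reward prefix. The function name, the discount factor 1/2 and the reward sequences are assumptions made for the example; the games in the abstract are of infinite duration.

```python
from fractions import Fraction


def discounted_sum(rewards, discount):
    """Sum of discount**i * rewards[i] over a finite prefix of rewards."""
    total = Fraction(0)
    weight = Fraction(1)
    for reward in rewards:
        total += weight * reward
        weight *= discount
    return total


half = Fraction(1, 2)
early = discounted_sum([1, 0, 0], half)  # reward collected at step 0
late = discounted_sum([0, 0, 1], half)   # same reward collected at step 2

print(early, late)  # 1 1/4
```

The same unit reward is worth four times less when it arrives two steps later, which is the pressure toward early rewards described above.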
Reactive Synthesis Without Regret
Two-player zero-sum games of infinite duration and their quantitative versions are used in verification to model the interaction between a controller (Eve) and its environment (Adam). The question usually addressed is that of the existence (and computability) of a strategy for Eve that can maximize her payoff against any strategy of Adam. In this work, we are interested in strategies of Eve that minimize her regret, i.e., strategies that minimize the difference between her actual payoff and the payoff she could have achieved if she had known the strategy of Adam in advance. We give algorithms to compute the strategies of Eve that ensure minimal regret against an adversary whose choice of strategy is (i) unrestricted, (ii) limited to positional strategies, or (iii) limited to word strategies, and show that the last two cases have natural modelling applications. We also show that our notion of regret minimization in which Adam is limited to word strategies generalizes the notion of good-for-games introduced by Henzinger and Piterman, and is related to the notion of determinization by pruning due to Aminof, Kupferman and Lampert.
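The verbal definition of regret in this abstract can be written as a formula. The notation is an assumption made for illustration and is not taken from the listing: $\mathrm{Val}(\sigma, \tau)$ stands for Eve's payoff when she plays $\sigma$ against Adam's strategy $\tau$, and $\Sigma_\forall$ for the class of strategies Adam is allowed to use (unrestricted, positional, or word strategies).

```latex
\[
  \mathrm{reg}_{\Sigma_\forall}(\sigma)
  \;=\; \sup_{\tau \in \Sigma_\forall}
  \Bigl( \sup_{\sigma'} \mathrm{Val}(\sigma', \tau) - \mathrm{Val}(\sigma, \tau) \Bigr)
\]
```

Under this reading, a regret-minimizing strategy for Eve is one that attains (or approaches) $\inf_{\sigma} \mathrm{reg}_{\Sigma_\forall}(\sigma)$, and restricting $\Sigma_\forall$ can only lower the regret of a fixed $\sigma$.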
- …