298 research outputs found

    Iterated Regret Minimization in Game Graphs

    Full text link
    Iterated regret minimization has been introduced recently by J.Y. Halpern and R. Pass in classical strategic games. For many games of interest, this new solution concept provides solutions that are judged more reasonable than solutions offered by traditional game concepts -- such as Nash equilibrium --. Although computing iterated regret on explicit matrix game is conceptually and computationally easy, nothing is known about computing the iterated regret on games whose matrices are defined implicitly using game tree, game DAG or, more generally game graphs. In this paper, we investigate iterated regret minimization for infinite duration two-player quantitative non-zero sum games played on graphs. We consider reachability objectives that are not necessarily antagonist. Edges are weighted by integers -- one for each player --, and the payoffs are defined by the sum of the weights along the paths. Depending on the class of graphs, we give either polynomial or pseudo-polynomial time algorithms to compute a strategy that minimizes the regret for a fixed player. We finally give algorithms to compute the strategies of the two players that minimize the iterated regret for trees, and for graphs with strictly positive weights only.Comment: 19 pages. Bug in introductive example fixed

    Non-Zero Sum Games for Reactive Synthesis

    Get PDF
    In this invited contribution, we summarize new solution concepts useful for the synthesis of reactive systems that we have introduced in several recent publications. These solution concepts are developed in the context of non-zero sum games played on graphs. They are part of the contributions obtained in the inVEST project funded by the European Research Council.Comment: LATA'16 invited pape

    The Impatient May Use Limited Optimism to Minimize Regret

    Full text link
    Discounted-sum games provide a formal model for the study of reinforcement learning, where the agent is enticed to get rewards early since later rewards are discounted. When the agent interacts with the environment, she may regret her actions, realizing that a previous choice was suboptimal given the behavior of the environment. The main contribution of this paper is a PSPACE algorithm for computing the minimum possible regret of a given game. To this end, several results of independent interest are shown. (1) We identify a class of regret-minimizing and admissible strategies that first assume that the environment is collaborating, then assume it is adversarial---the precise timing of the switch is key here. (2) Disregarding the computational cost of numerical analysis, we provide an NP algorithm that checks that the regret entailed by a given time-switching strategy exceeds a given value. (3) We show that determining whether a strategy minimizes regret is decidable in PSPACE

    Reactive Synthesis Without Regret

    Get PDF
    Two-player zero-sum games of infinite duration and their quantitative versions are used in verification to model the interaction between a controller (Eve) and its environment (Adam). The question usually addressed is that of the existence (and computability) of a strategy for Eve that can maximize her payoff against any strategy of Adam. In this work, we are interested in strategies of Eve that minimize her regret, i.e. strategies that minimize the difference between her actual payoff and the payoff she could have achieved if she had known the strategy of Adam in advance. We give algorithms to compute the strategies of Eve that ensure minimal regret against an adversary whose choice of strategy is (i) unrestricted, (ii) limited to positional strategies, or (iii) limited to word strategies, and show that the two last cases have natural modelling applications. We also show that our notion of regret minimization in which Adam is limited to word strategies generalizes the notion of good for games introduced by Henzinger and Piterman, and is related to the notion of determinization by pruning due to Aminof, Kupferman and Lampert

    The Complexity of Admissibility in Omega-Regular Games

    Full text link
    Iterated admissibility is a well-known and important concept in classical game theory, e.g. to determine rational behaviors in multi-player matrix games. As recently shown by Berwanger, this concept can be soundly extended to infinite games played on graphs with omega-regular objectives. In this paper, we study the algorithmic properties of this concept for such games. We settle the exact complexity of natural decision problems on the set of strategies that survive iterated elimination of dominated strategies. As a byproduct of our construction, we obtain automata which recognize all the possible outcomes of such strategies

    Minimizing Regret in Discounted-Sum Games

    Get PDF
    In this paper, we study the problem of minimizing regret in discounted-sum games played on weighted game graphs. We give algorithms for the general problem of computing the minimal regret of the controller (Eve) as well as several variants depending on which strategies the environment (Adam) is permitted to use. We also consider the problem of synthesizing regret-free strategies for Eve in each of these scenarios

    Deep Counterfactual Regret Minimization in Continuous Action Space

    Get PDF
    Counterfactual regret minimization based algorithms are used as the state-of-the-art solutions for various problems within imperfect-information games. Deep learning has seen a multitude of uses in recent years. Recently deep learning has been combined with counterfactual regret minimization to increase the generality of the counterfactual regret minimization algorithms. This thesis proposes a new way of increasing the generality of the counterfactual regret minimization algorithms even further by increasing the role of neural networks. In addition, to combat the variance caused by the use of neural networks, a new way of sampling is introduced to reduce the variance. These proposed modifications were compared against baseline algorithms. The proposed way of reducing variance improved the performance of counterfactual regret minimization, while the method for increasing generality was found to be lacking especially when scaling the baseline model. Possible reasons for this are discussed and future research ideas are offered
    • …
    corecore