15 research outputs found

    Robust Stackelberg Equilibria in Extensive-Form Games and Extension to Limited Lookahead

    Stackelberg equilibria have become increasingly important as a solution concept in computational game theory, largely inspired by practical problems such as security settings. In practice, however, there is typically uncertainty about the model of the opponent. This paper is, to our knowledge, the first to investigate Stackelberg equilibria under uncertainty in extensive-form games, one of the broadest classes of games. We introduce robust Stackelberg equilibria, where the uncertainty is about the opponent's payoffs, as well as ones where the opponent has limited lookahead and the uncertainty is about the opponent's node evaluation function. We develop a new mixed-integer program for the deterministic limited-lookahead setting. We then extend the program to the robust setting for Stackelberg equilibrium under unlimited and under limited lookahead by the opponent. We show that for the specific case of interval uncertainty about the opponent's payoffs (or about the opponent's node evaluations in the case of limited lookahead), robust Stackelberg equilibria can be computed with a mixed-integer program that is of the same asymptotic size as that for the deterministic setting. Comment: Published at AAAI1
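To make the idea of interval uncertainty about the opponent's payoffs concrete, here is a minimal sketch in a much simpler setting than the paper's. It assumes a 2x2 normal-form game (not an extensive-form game) and replaces the mixed-integer program with a brute-force grid search over the leader's mixed strategies. The payoff matrices, the function names, and the worst-case tie-breaking rule are all illustrative assumptions, not taken from the paper.

```python
# Illustrative sketch only: a 2x2 normal-form game with interval uncertainty
# about the follower's payoffs. The paper itself treats extensive-form games
# with a mixed-integer program; the grid search here is a stand-in.

# Hypothetical leader payoffs: rows are leader actions, columns follower actions.
LEADER = [[2.0, 4.0],
          [1.0, 3.0]]
# Hypothetical lower and upper bounds on the follower's payoffs.
LO = [[0.9, -0.1],
      [-0.1, 0.9]]
HI = [[1.1, 0.1],
      [0.1, 1.1]]


def expected(x, matrix, col):
    """Expected payoff of follower action `col` under leader mixed strategy x."""
    return sum(p * matrix[i][col] for i, p in enumerate(x))


def possible_best_responses(x, lo, hi, tol=1e-12):
    """Follower actions that are optimal for at least one payoff realization.

    With independent intervals, action j can be optimal exactly when its
    upper-bound value reaches the best lower-bound value over all actions.
    """
    n_cols = len(lo[0])
    best_lower = max(expected(x, lo, j) for j in range(n_cols))
    return [j for j in range(n_cols) if expected(x, hi, j) >= best_lower - tol]


def robust_leader_value(x, leader, lo, hi):
    """Leader's worst-case utility over all possibly-optimal follower actions."""
    return min(expected(x, leader, j) for j in possible_best_responses(x, lo, hi))


def grid_search(leader, lo, hi, steps=100):
    """Search leader mixtures (p, 1 - p) on a uniform grid; return (value, x)."""
    best = None
    for k in range(steps + 1):
        p = k / steps
        x = [p, 1.0 - p]
        value = robust_leader_value(x, leader, lo, hi)
        if best is None or value > best[0]:
            best = (value, x)
    return best


if __name__ == "__main__":
    value, strategy = grid_search(LEADER, LO, HI)
    print("robust value:", value, "leader strategy:", strategy)
```

In this toy instance the leader stays just below the probability at which the follower's second-best action could become optimal under some payoff realization, which is the kind of safety margin a robust commitment has to pay for.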

    Arena: A General Evaluation Platform and Building Toolkit for Multi-Agent Intelligence

    Learning agents that are capable not only of taking tests but also of innovating are becoming a hot topic in AI. One of the most promising paths towards this vision is multi-agent learning, where agents act as the environment for each other, and improving each agent means proposing new problems for others. However, existing evaluation platforms are either not compatible with multi-agent settings, or limited to a specific game. That is, there is not yet a general evaluation platform for research on multi-agent intelligence. To this end, we introduce Arena, a general evaluation platform for multi-agent intelligence with 35 games of diverse logics and representations. Furthermore, multi-agent intelligence is still at the stage where many problems remain unexplored. Therefore, we provide a building toolkit for researchers to easily invent and build novel multi-agent problems from the provided game set based on a GUI-configurable social tree and five basic multi-agent reward schemes. Finally, we provide Python implementations of five state-of-the-art deep multi-agent reinforcement learning baselines. Along with the baseline implementations, we release a set of 100 best agents/teams that we can train with different training schemes for each game, as the base for evaluating agents with population performance. As such, the research community can perform comparisons under a stable and uniform standard. All the implementations and accompanying tutorials have been open-sourced for the community at https://sites.google.com/view/arena-unity/

    Double-oracle sampling method for Stackelberg Equilibrium approximation in general-sum extensive-form games

    The paper presents a new method for approximating Strong Stackelberg Equilibrium in general-sum sequential games with imperfect information and perfect recall. The proposed approach is generic as it does not rely on any specific properties of a particular game model. The method is based on iterative interleaving of the following two phases: (1) guided Monte Carlo Tree Search sampling of the Follower's strategy space and (2) building the Leader's behavior strategy tree for which the sampled Follower's strategy is an optimal response. The above solution scheme is evaluated with respect to the Leader's expected utility and time requirements on three sets of interception games with variable characteristics, played on graphs. A comparison with three state-of-the-art MILP/LP-based methods shows that in the vast majority of test cases the proposed simulation-based approach leads to optimal Leader's strategies, while outperforming the competing methods in time scalability and memory requirements.

    Theoretical and Practical Advances on Smoothing for Extensive-Form Games

    Sparse iterative methods, in particular first-order methods, are known to be among the most effective in solving large-scale two-player zero-sum extensive-form games. The convergence rates of these methods depend heavily on the properties of the distance-generating function that they are based on. We investigate the acceleration of first-order methods for solving extensive-form games through better design of the dilated entropy function, a class of distance-generating functions related to the domains associated with the extensive-form games. By introducing a new weighting scheme for the dilated entropy function, we develop the first distance-generating function for the strategy spaces of sequential games that has no dependence on the branching factor of the player. This result improves the convergence rate of several first-order methods by a factor of $\Omega(b^d d)$, where $b$ is the branching factor of the player, and $d$ is the depth of the game tree. Thus far, counterfactual regret minimization methods have been faster in practice, and more popular, than first-order methods despite their theoretically inferior convergence rates. Using our new weighting scheme and practical tuning we show that, for the first time, the excessive gap technique can be made faster than the fastest counterfactual regret minimization algorithm, CFR+, in practice.

    Equilibrium Computation and Robust Optimization in Zero Sum Games with Submodular Structure

    We define a class of zero-sum games with combinatorial structure, where the best response problem of one player is to maximize a submodular function. For example, this class includes security games played on networks, as well as the problem of robustly optimizing a submodular function over the worst case from a set of scenarios. The challenge in computing equilibria is that both players' strategy spaces can be exponentially large. Accordingly, previous algorithms have worst-case exponential runtime and indeed fail to scale up on practical instances. We provide a pseudopolynomial-time algorithm which obtains a guaranteed $(1 - 1/e)^2$-approximate mixed strategy for the maximizing player. Our algorithm only requires access to a weakened version of a best response oracle for the minimizing player which runs in polynomial time. Experimental results for network security games and a robust budget allocation problem confirm that our algorithm delivers near-optimal solutions and scales to much larger instances than was previously possible. Comment: 20 pages, 8 figures. A shorter version of this paper appears at AAAI 201
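For a sense of scale, the approximation factor quoted in the abstract can be evaluated numerically. This is plain arithmetic on the stated constant, not a result taken from the paper:

```latex
\[
  1 - \frac{1}{e} \approx 0.6321,
  \qquad
  \left(1 - \frac{1}{e}\right)^{2} \approx 0.3996 .
\]
```

Read this way, the worst-case guarantee is roughly 40% of the optimal value, while the abstract reports that the experimental solutions are near-optimal.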