761 research outputs found

    Online Convex Optimization for Sequential Decision Processes and Extensive-Form Games

    Full text link
    Regret minimization is a powerful tool for solving large-scale extensive-form games. State-of-the-art methods rely on minimizing regret locally at each decision point. In this work we derive a new framework for regret minimization on sequential decision problems and extensive-form games with general compact convex sets at each decision point and general convex losses, as opposed to prior work which has been for simplex decision points and linear losses. We call our framework laminar regret decomposition. It generalizes the CFR algorithm to this more general setting. Furthermore, our framework enables a new proof of CFR even in the known setting, which is derived from a perspective of decomposing polytope regret, thereby leading to an arguably simpler interpretation of the algorithm. Our generalization to convex compact sets and convex losses allows us to develop new algorithms for several problems: regularized sequential decision making, regularized Nash equilibria in extensive-form games, and computing approximate extensive-form perfect equilibria. Our generalization also leads to the first regret-minimization algorithm for computing reduced-normal-form quantal response equilibria based on minimizing local regrets. Experiments show that our framework leads to algorithms that scale at a rate comparable to the fastest variants of counterfactual regret minimization for computing Nash equilibrium, and therefore our approach leads to the first algorithm for computing quantal response equilibria in extremely large games. Finally we show that our framework enables a new kind of scalable opponent exploitation approach

    Smoothing Method for Approximate Extensive-Form Perfect Equilibrium

    Full text link
    Nash equilibrium is a popular solution concept for solving imperfect-information games in practice. However, it has a major drawback: it does not preclude suboptimal play in branches of the game tree that are not reached in equilibrium. Equilibrium refinements can mend this issue, but have experienced little practical adoption. This is largely due to a lack of scalable algorithms. Sparse iterative methods, in particular first-order methods, are known to be among the most effective algorithms for computing Nash equilibria in large-scale two-player zero-sum extensive-form games. In this paper, we provide, to our knowledge, the first extension of these methods to equilibrium refinements. We develop a smoothing approach for behavioral perturbations of the convex polytope that encompasses the strategy spaces of players in an extensive-form game. This enables one to compute an approximate variant of extensive-form perfect equilibria. Experiments show that our smoothing approach leads to solutions with dramatically stronger strategies at information sets that are reached with low probability in approximate Nash equilibria, while retaining the overall convergence rate associated with fast algorithms for Nash equilibrium. This has benefits both in approximate equilibrium finding (such approximation is necessary in practice in large games) where some probabilities are low while possibly heading toward zero in the limit, and exact equilibrium computation where the low probabilities are actually zero.Comment: Published at IJCAI 1

    Theoretical and Practical Advances on Smoothing for Extensive-Form Games

    Full text link
    Sparse iterative methods, in particular first-order methods, are known to be among the most effective in solving large-scale two-player zero-sum extensive-form games. The convergence rates of these methods depend heavily on the properties of the distance-generating function that they are based on. We investigate the acceleration of first-order methods for solving extensive-form games through better design of the dilated entropy function---a class of distance-generating functions related to the domains associated with the extensive-form games. By introducing a new weighting scheme for the dilated entropy function, we develop the first distance-generating function for the strategy spaces of sequential games that has no dependence on the branching factor of the player. This result improves the convergence rate of several first-order methods by a factor of Ω(bdd)\Omega(b^dd), where bb is the branching factor of the player, and dd is the depth of the game tree. Thus far, counterfactual regret minimization methods have been faster in practice, and more popular, than first-order methods despite their theoretically inferior convergence rates. Using our new weighting scheme and practical tuning we show that, for the first time, the excessive gap technique can be made faster than the fastest counterfactual regret minimization algorithm, CFR+, in practice

    A Unified View of Large-scale Zero-sum Equilibrium Computation

    Full text link
    The task of computing approximate Nash equilibria in large zero-sum extensive-form games has received a tremendous amount of attention due mainly to the Annual Computer Poker Competition. Immediately after its inception, two competing and seemingly different approaches emerged---one an application of no-regret online learning, the other a sophisticated gradient method applied to a convex-concave saddle-point formulation. Since then, both approaches have grown in relative isolation with advancements on one side not effecting the other. In this paper, we rectify this by dissecting and, in a sense, unify the two views.Comment: AAAI Workshop on Computer Poker and Imperfect Informatio

    Computing large market equilibria using abstractions

    Full text link
    Computing market equilibria is an important practical problem for market design (e.g. fair division, item allocation). However, computing equilibria requires large amounts of information (e.g. all valuations for all buyers for all items) and compute power. We consider ameliorating these issues by applying a method used for solving complex games: constructing a coarsened abstraction of a given market, solving for the equilibrium in the abstraction, and lifting the prices and allocations back to the original market. We show how to bound important quantities such as regret, envy, Nash social welfare, Pareto optimality, and maximin share when the abstracted prices and allocations are used in place of the real equilibrium. We then study two abstraction methods of interest for practitioners: 1) filling in unknown valuations using techniques from matrix completion, 2) reducing the problem size by aggregating groups of buyers/items into smaller numbers of representative buyers/items and solving for equilibrium in this coarsened market. We find that in real data allocations/prices that are relatively close to equilibria can be computed from even very coarse abstractions
    • …
    corecore