Online Convex Optimization for Sequential Decision Processes and Extensive-Form Games
Regret minimization is a powerful tool for solving large-scale extensive-form
games. State-of-the-art methods rely on minimizing regret locally at each
decision point. In this work we derive a new framework for regret minimization
on sequential decision problems and extensive-form games with general compact
convex sets at each decision point and general convex losses, as opposed to
prior work which has been for simplex decision points and linear losses. We
call our framework laminar regret decomposition. It generalizes the CFR
algorithm to this more general setting. Furthermore, our framework enables a
new proof of CFR even in the known setting, which is derived from a perspective
of decomposing polytope regret, thereby leading to an arguably simpler
interpretation of the algorithm. Our generalization to convex compact sets and
convex losses allows us to develop new algorithms for several problems:
regularized sequential decision making, regularized Nash equilibria in
extensive-form games, and computing approximate extensive-form perfect
equilibria. Our generalization also leads to the first regret-minimization
algorithm for computing reduced-normal-form quantal response equilibria based
on minimizing local regrets. Experiments show that our framework leads to
algorithms that scale at a rate comparable to the fastest variants of
counterfactual regret minimization for computing Nash equilibrium, and
therefore our approach leads to the first algorithm for computing quantal
response equilibria in extremely large games. Finally, we show that our
framework enables a new kind of scalable opponent exploitation approach.
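As a point of reference for "minimizing regret locally at each decision point", the following is a minimal sketch of regret matching, the local minimizer commonly used inside CFR, at a single decision point. It covers only the classic special case (simplex decision set, linear losses) and is not the laminar regret decomposition itself, which handles general compact convex sets and convex losses. The function names and the loss sequence are illustrative assumptions, not taken from the paper.

```python
def regret_matching(cum_regret):
    """Map cumulative regrets to a strategy on the probability simplex."""
    positive = [max(r, 0.0) for r in cum_regret]
    total = sum(positive)
    if total > 0.0:
        # Play each action in proportion to its positive cumulative regret.
        return [p / total for p in positive]
    # No action has positive regret yet: fall back to the uniform strategy.
    n = len(cum_regret)
    return [1.0 / n] * n


def run_regret_matching(loss_vectors):
    """Run regret matching on a sequence of linear losses.

    loss_vectors[t][a] is the loss of action a at round t (hypothetical input).
    Returns the average strategy over all rounds.
    """
    n = len(loss_vectors[0])
    cum_regret = [0.0] * n
    strategy_sum = [0.0] * n
    for loss in loss_vectors:
        x = regret_matching(cum_regret)
        expected = sum(xi * li for xi, li in zip(x, loss))
        for a in range(n):
            # Regret of action a: how much better a would have done than x.
            cum_regret[a] += expected - loss[a]
            strategy_sum[a] += x[a]
    rounds = len(loss_vectors)
    return [s / rounds for s in strategy_sum]


if __name__ == "__main__":
    # Action 1 always has lower loss, so the average strategy drifts toward it.
    print(run_regret_matching([[1.0, 0.0]] * 100))
```

In CFR, one such minimizer runs at every decision point, fed with counterfactual losses rather than the fixed loss vectors assumed here.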
Smoothing Method for Approximate Extensive-Form Perfect Equilibrium
Nash equilibrium is a popular solution concept for solving
imperfect-information games in practice. However, it has a major drawback: it
does not preclude suboptimal play in branches of the game tree that are not
reached in equilibrium. Equilibrium refinements can mend this issue, but have
experienced little practical adoption. This is largely due to a lack of
scalable algorithms.
Sparse iterative methods, in particular first-order methods, are known to be
among the most effective algorithms for computing Nash equilibria in
large-scale two-player zero-sum extensive-form games. In this paper, we
provide, to our knowledge, the first extension of these methods to equilibrium
refinements. We develop a smoothing approach for behavioral perturbations of
the convex polytope that encompasses the strategy spaces of players in an
extensive-form game. This enables one to compute an approximate variant of
extensive-form perfect equilibria. Experiments show that our smoothing approach
leads to solutions with dramatically stronger strategies at information sets
that are reached with low probability in approximate Nash equilibria, while
retaining the overall convergence rate associated with fast algorithms for Nash
equilibrium. This has benefits both in approximate equilibrium finding (such
approximation is necessary in practice in large games) where some probabilities
are low while possibly heading toward zero in the limit, and exact equilibrium
computation where the low probabilities are actually zero.
Comment: Published at IJCAI 1
Theoretical and Practical Advances on Smoothing for Extensive-Form Games
Sparse iterative methods, in particular first-order methods, are known to be
among the most effective in solving large-scale two-player zero-sum
extensive-form games. The convergence rates of these methods depend heavily on
the properties of the distance-generating function that they are based on. We
investigate the acceleration of first-order methods for solving extensive-form
games through better design of the dilated entropy function---a class of
distance-generating functions related to the domains associated with the
extensive-form games. By introducing a new weighting scheme for the dilated
entropy function, we develop the first distance-generating function for the
strategy spaces of sequential games that has no dependence on the branching
factor of the player. This result improves the convergence rate of several
first-order methods by a factor of $\Omega(b^d d)$, where $b$ is the branching
factor of the player, and $d$ is the depth of the game tree.
Thus far, counterfactual regret minimization methods have been faster in
practice, and more popular, than first-order methods despite their
theoretically inferior convergence rates. Using our new weighting scheme and
practical tuning we show that, for the first time, the excessive gap technique
can be made faster than the fastest counterfactual regret minimization
algorithm, CFR+, in practice.
A Unified View of Large-scale Zero-sum Equilibrium Computation
The task of computing approximate Nash equilibria in large zero-sum
extensive-form games has received a tremendous amount of attention due mainly
to the Annual Computer Poker Competition. Immediately after its inception, two
competing and seemingly different approaches emerged---one an application of
no-regret online learning, the other a sophisticated gradient method applied to
a convex-concave saddle-point formulation. Since then, both approaches have
grown in relative isolation with advancements on one side not affecting the
other. In this paper, we rectify this by dissecting and, in a sense, unifying the
two views.
Comment: AAAI Workshop on Computer Poker and Imperfect Informatio
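For orientation, the two views mentioned in this abstract are usually connected through the standard sequence-form saddle-point problem. The notation below ($X$, $Y$, $A$, $R_1^T$, $R_2^T$) is an assumption introduced here for illustration and does not appear in the abstract.

```latex
% Sequence-form saddle-point problem for a two-player zero-sum game:
% X, Y are the players' sequence-form strategy polytopes and
% A is the payoff (loss) matrix for the x-player.
\min_{x \in X} \max_{y \in Y} \; x^\top A y

% Link to no-regret learning: if the players accumulate regrets
% R_1^T and R_2^T over T rounds, their average strategies
% \bar{x}, \bar{y} have saddle-point gap bounded by
\max_{y \in Y} \bar{x}^\top A y \;-\; \min_{x \in X} x^\top A \bar{y}
\;\le\; \frac{R_1^T + R_2^T}{T}
```

Gradient-based methods attack the first expression directly, while regret-based methods drive the right-hand side of the second expression to zero.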
Computing large market equilibria using abstractions
Computing market equilibria is an important practical problem for market
design (e.g. fair division, item allocation). However, computing equilibria
requires large amounts of information (e.g. all valuations for all buyers for
all items) and compute power. We consider ameliorating these issues by applying
a method used for solving complex games: constructing a coarsened abstraction
of a given market, solving for the equilibrium in the abstraction, and lifting
the prices and allocations back to the original market. We show how to bound
important quantities such as regret, envy, Nash social welfare, Pareto
optimality, and maximin share when the abstracted prices and allocations are
used in place of the real equilibrium. We then study two abstraction methods of
interest for practitioners: 1) filling in unknown valuations using techniques
from matrix completion, 2) reducing the problem size by aggregating groups of
buyers/items into smaller numbers of representative buyers/items and solving
for equilibrium in this coarsened market. We find that, in real data,
allocations/prices that are relatively close to equilibria can be computed from
even very coarse abstractions.
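The abstract-solve-lift pipeline described above can be sketched as follows, assuming a linear Fisher market with unit supply of each item. Every concrete choice here is an illustrative assumption rather than the paper's method: representative buyers are formed by averaging valuations and summing budgets, the coarsened market is solved with proportional response dynamics (a standard iterative scheme for linear Fisher markets), and allocations are lifted back by splitting each representative's bundle in proportion to member budgets. All function names are hypothetical.

```python
def abstract_buyers(valuations, budgets, groups):
    """Coarsen the market: one representative buyer per group of buyer indices."""
    m = len(valuations[0])
    rep_vals, rep_budgets = [], []
    for members in groups:
        # Representative valuation: mean over group members (an assumption).
        rep_vals.append(
            [sum(valuations[i][j] for i in members) / len(members) for j in range(m)]
        )
        rep_budgets.append(sum(budgets[i] for i in members))
    return rep_vals, rep_budgets


def proportional_response(valuations, budgets, iterations=500):
    """Approximate equilibrium prices/allocations of a linear Fisher market.

    Assumes strictly positive valuations and budgets, unit supply per item.
    """
    n, m = len(valuations), len(valuations[0])
    bids = [[budgets[i] / m] * m for i in range(n)]  # split budgets evenly
    for _ in range(iterations):
        prices = [sum(bids[i][j] for i in range(n)) for j in range(m)]
        for i in range(n):
            # Utility earned from each item at the current bids and prices.
            gains = [valuations[i][j] * bids[i][j] / prices[j] for j in range(m)]
            utility = sum(gains)
            # Re-bid the budget in proportion to where utility came from.
            bids[i] = [budgets[i] * g / utility for g in gains]
    prices = [sum(bids[i][j] for i in range(n)) for j in range(m)]
    alloc = [[bids[i][j] / prices[j] for j in range(m)] for i in range(n)]
    return prices, alloc


def lift_allocation(alloc, budgets, groups):
    """Lift group allocations to original buyers, proportionally to budget."""
    n = sum(len(members) for members in groups)
    m = len(alloc[0])
    lifted = [[0.0] * m for _ in range(n)]
    for g, members in enumerate(groups):
        group_budget = sum(budgets[i] for i in members)
        for i in members:
            lifted[i] = [alloc[g][j] * budgets[i] / group_budget for j in range(m)]
    return lifted


if __name__ == "__main__":
    vals = [[3.0, 1.0], [2.5, 1.5], [1.0, 4.0], [0.5, 3.5]]
    budgets = [1.0, 1.0, 1.0, 1.0]
    groups = [[0, 1], [2, 3]]
    rep_vals, rep_budgets = abstract_buyers(vals, budgets, groups)
    prices, alloc = proportional_response(rep_vals, rep_budgets)
    print(prices, lift_allocation(alloc, budgets, groups))
```

Prices computed in the coarsened market apply unchanged to the original items here because only buyers are aggregated; aggregating items as well would need an additional rule for lifting prices.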
Using EPECs to model bilevel games in restructured electricity markets with locational prices
CWPE0619 (EPRG0602) Xinmin Hu and Daniel Ralph (Feb 2006)
We study a bilevel noncooperative game-theoretic model of electricity markets with locational marginal prices. Each player faces a bilevel optimization problem that we remodel as a mathematical program with equilibrium constraints (MPEC). This gives an equilibrium problem with equilibrium constraints (EPEC). We establish sufficient conditions for existence of pure strategy Nash equilibria for this class of bilevel games and give some applications. We show by examples the effect of network transmission limits, i.e. congestion, on existence of equilibria. Then we study, for more general EPECs, the weaker pure strategy concepts of local Nash and Nash stationary equilibria. We model the latter via complementarity problems (CPs). Finally, we present numerical examples of methods that attempt to find local Nash or Nash stationary equilibria of randomly generated electricity market games. The CP solver PATH is found to be rather effective in this context.