12 research outputs found
Online Convex Optimization for Sequential Decision Processes and Extensive-Form Games
Regret minimization is a powerful tool for solving large-scale extensive-form
games. State-of-the-art methods rely on minimizing regret locally at each
decision point. In this work we derive a new framework for regret minimization
on sequential decision problems and extensive-form games with general compact
convex sets at each decision point and general convex losses, as opposed to
prior work which has been for simplex decision points and linear losses. We
call our framework laminar regret decomposition. It generalizes the CFR
algorithm to this more general setting. Furthermore, our framework enables a
new proof of CFR even in the known setting, which is derived from a perspective
of decomposing polytope regret, thereby leading to an arguably simpler
interpretation of the algorithm. Our generalization to convex compact sets and
convex losses allows us to develop new algorithms for several problems:
regularized sequential decision making, regularized Nash equilibria in
extensive-form games, and computing approximate extensive-form perfect
equilibria. Our generalization also leads to the first regret-minimization
algorithm for computing reduced-normal-form quantal response equilibria based
on minimizing local regrets. Experiments show that our framework leads to
algorithms that scale at a rate comparable to the fastest variants of
counterfactual regret minimization for computing Nash equilibrium, and
therefore our approach leads to the first algorithm for computing quantal
response equilibria in extremely large games. Finally we show that our
framework enables a new kind of scalable opponent exploitation approach
Imperfect-Recall Abstractions with Bounds in Games
Imperfect-recall abstraction has emerged as the leading paradigm for
practical large-scale equilibrium computation in incomplete-information games.
However, imperfect-recall abstractions are poorly understood, and only weak
algorithm-specific guarantees on solution quality are known. In this paper, we
show the first general, algorithm-agnostic, solution quality guarantees for
Nash equilibria and approximate self-trembling equilibria computed in
imperfect-recall abstractions, when implemented in the original
(perfect-recall) game. Our results are for a class of games that generalizes
the only previously known class of imperfect-recall abstractions where any
results had been obtained. Further, our analysis is tighter in two ways, each
of which can lead to an exponential reduction in the solution quality error
bound.
We then show that for extensive-form games that satisfy certain properties,
the problem of computing a bound-minimizing abstraction for a single level of
the game reduces to a clustering problem, where the increase in our bound is
the distance function. This reduction leads to the first imperfect-recall
abstraction algorithm with solution quality bounds. We proceed to show a divide
in the class of abstraction problems. If payoffs are at the same scale at all
information sets considered for abstraction, the input forms a metric space.
Conversely, if this condition is not satisfied, we show that the input does not
form a metric space. Finally, we use these results to experimentally
investigate the quality of our bound for single-level abstraction
Solving Uncertain MDPs with Objectives that are Separable over Instantiations of Model Uncertainty
National Research Foundation (NRF) Singapore under SMART and Future Mobilit
Sampling based approaches for minimizing regret in uncertain Markov Decision Problems (MDPs)
National Research Foundation (NRF) Singapore under Singapore-MIT Alliance for Research and Technology (SMART) Center for Future Mobilit