No-Regret Learning in Extensive-Form Games with Imperfect Recall
Counterfactual Regret Minimization (CFR) is an efficient no-regret learning
algorithm for decision problems modeled as extensive games. CFR's regret bounds
depend on the requirement of perfect recall: players always remember
information that was revealed to them and the order in which it was revealed.
In games without perfect recall, however, CFR's guarantees do not apply. In
this paper, we present the first regret bound for CFR when applied to a general
class of games with imperfect recall. In addition, we show that CFR applied to
any abstraction belonging to our general class results in a regret bound not
just for the abstract game, but for the full game as well. We verify our theory
and show how imperfect recall can be used to trade a small increase in regret
for a significant reduction in memory in three domains: die-roll poker, phantom
tic-tac-toe, and Bluff.
Comment: 21 pages, 4 figures, expanded version of article to appear in Proceedings of the Twenty-Ninth International Conference on Machine Learning
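CFR, the subject of the abstract above, is built on per-information-set regret minimization. As an illustration only (not the paper's implementation; the function and variable names are our own), here is a minimal regret-matching step, the rule CFR applies at each information set:

```python
def regret_matching(cumulative_regret):
    """One regret-matching step: play each action with probability
    proportional to its positive cumulative regret; fall back to
    uniform play when no action has positive regret."""
    positive = [max(r, 0.0) for r in cumulative_regret]
    total = sum(positive)
    if total > 0.0:
        return [p / total for p in positive]
    n = len(cumulative_regret)
    return [1.0 / n] * n

# Action 0 has regret 3, action 1 has regret 1, action 2 has none:
print(regret_matching([3.0, 1.0, -2.0]))   # -> [0.75, 0.25, 0.0]
# No positive regret accumulated yet: play uniformly.
print(regret_matching([0.0, -1.0, 0.0]))
```

Iterating this rule and averaging the strategies is what yields CFR's no-regret guarantee under perfect recall; the paper's contribution is extending such a bound to a class of imperfect-recall games.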
Extensive-Form Perfect Equilibrium Computation in Two-Player Games
We study the problem of computing an Extensive-Form Perfect Equilibrium
(EFPE) in 2-player games. This equilibrium concept refines the Nash equilibrium
by requiring resilience w.r.t. a specific vanishing perturbation (representing
mistakes of the players at each decision node). The scientific challenge is
intrinsic to the EFPE definition: it requires a perturbation over the agent
form, but the agent form is computationally inefficient, due to the presence of
highly nonlinear constraints. We show that the sequence form can be exploited
in a non-trivial way and that, for general-sum games, finding an EFPE is
equivalent to solving a suitably perturbed linear complementarity problem. We
prove that Lemke's algorithm can be applied, showing that computing an EFPE is
PPAD-complete. In the notable case of zero-sum games, the problem is
in P and can be solved by linear programming. Our algorithms also
allow one to find a Nash equilibrium when players cannot perfectly control
their moves, being subject to a given execution uncertainty, as is the case in
most realistic physical settings.
Comment: To appear in AAAI 1
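The zero-sum case above reduces to linear programming. As a generic sketch, here is the standard maximin LP for a zero-sum matrix game in normal form (not the paper's sequence-form EFPE machinery), assuming SciPy's `linprog` is available; all names are illustrative:

```python
import numpy as np
from scipy.optimize import linprog

def solve_zero_sum(payoff):
    """Maximin strategy x for the row player of a zero-sum matrix game:
    maximize v subject to payoff^T x >= v, sum(x) = 1, x >= 0."""
    m, n = payoff.shape
    # Decision variables: [x_1, ..., x_m, v]; linprog minimizes, so use -v.
    c = np.zeros(m + 1)
    c[-1] = -1.0
    # For every opponent pure strategy (column j): v - (payoff^T x)_j <= 0.
    A_ub = np.hstack([-payoff.T, np.ones((n, 1))])
    b_ub = np.zeros(n)
    # Probabilities sum to one; v is free.
    A_eq = np.append(np.ones(m), 0.0).reshape(1, -1)
    b_eq = [1.0]
    bounds = [(0, None)] * m + [(None, None)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[:m], res.x[-1]

# Matching pennies: uniform strategy, game value near 0.
strategy, value = solve_zero_sum(np.array([[1.0, -1.0], [-1.0, 1.0]]))
print(strategy, value)
```

The paper's contribution is handling the perturbed sequence form (and the general-sum LCP); this sketch only shows why the unperturbed zero-sum case sits in the reach of linear programming.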
A finite-state, finite-memory minimum principle, part 2
In part 1 of this paper, a minimum principle was found for the finite-state, finite-memory (FSFM) stochastic control problem. In part 2, conditions for the sufficiency of the minimum principle are stated in terms of the informational properties of the problem. This is accomplished by introducing the notion of a signaling strategy. Then a min-H algorithm based on the FSFM minimum principle is presented. This algorithm converges, after a finite number of steps, to a person-by-person extremal solution.
Solving Imperfect Information Games Using Decomposition
Decomposition, i.e. independently analyzing possible subgames, has proven to
be an essential principle for effective decision-making in perfect information
games. However, in imperfect information games, decomposition has proven to be
problematic. To date, all proposed techniques for decomposition in imperfect
information games have abandoned theoretical guarantees. This work presents the
first technique for decomposing an imperfect information game into subgames
that can be solved independently, while retaining optimality guarantees on the
full-game solution. We can use this technique to construct theoretically
justified algorithms that make better use of information available at run-time,
overcome memory or disk limitations at run-time, or make a time/space trade-off
to overcome memory or disk limitations while solving a game. In particular, we
present an algorithm for subgame solving which guarantees performance in the
whole game, in contrast to existing methods which may have unbounded error. In
addition, we present an offline game solving algorithm, CFR-D, which can
produce a Nash equilibrium for a game that is larger than available storage.
Comment: 7 pages by 2 columns, 5 figures; April 21 2014 - expand explanations and theory
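The perfect-information decomposition principle the abstract starts from (each subgame's value depends only on its own subtree) can be sketched with a plain minimax recursion. This illustrates the baseline the abstract contrasts with, not CFR-D itself; the tree encoding is our own:

```python
def solve_subgame(node, maximizing):
    """Minimax over a perfect-information game tree encoded as nested
    lists, with numeric leaves as terminal payoffs. Each recursive call
    uses only its own subtree, so every subgame is solved independently."""
    if isinstance(node, (int, float)):
        return node
    values = [solve_subgame(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# Root is a max node whose two children are independent min subgames.
tree = [[3, 5], [2, 9]]
print(solve_subgame(tree, True))  # -> 3 (max of min(3,5)=3 and min(2,9)=2)
```

In imperfect-information games this independence breaks down, because a player's best play in one subgame depends on beliefs formed elsewhere; recovering sound decomposition there is exactly the paper's contribution.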
Inductive Game Theory: A Basic Scenario
The aim of this paper is to present the new theory called “inductive game theory”. A paper published by one of the present authors with A. Matsui discussed some part of inductive game theory in a specific game. Here, we give a more developed discourse of the theory. The paper is written to show one entire picture of the theory: from individual raw experiences, through short-term and long-term memories, to inductive derivation of individual views, classification of such views, decision making or modification of behavior based on a view, and repercussions from the modified play in the objective game. We focus on some clear-cut cases, setting aside many possible variants, but still give a number of results. In order to show one possible discourse as a whole, we ask how a Nash equilibrium emerges from the viewpoint of inductive game theory, and give one answer.
Theoretical and Practical Advances on Smoothing for Extensive-Form Games
Sparse iterative methods, in particular first-order methods, are known to be
among the most effective in solving large-scale two-player zero-sum
extensive-form games. The convergence rates of these methods depend heavily on
the properties of the distance-generating function that they are based on. We
investigate the acceleration of first-order methods for solving extensive-form
games through better design of the dilated entropy function---a class of
distance-generating functions related to the domains associated with the
extensive-form games. By introducing a new weighting scheme for the dilated
entropy function, we develop the first distance-generating function for the
strategy spaces of sequential games that has no dependence on the branching
factor of the player. This result improves the convergence rate of several
first-order methods by a factor of b^d, where b is the branching
factor of the player and d is the depth of the game tree.
Thus far, counterfactual regret minimization methods have been faster in
practice, and more popular, than first-order methods despite their
theoretically inferior convergence rates. Using our new weighting scheme and
practical tuning we show that, for the first time, the excessive gap technique
can be made faster than the fastest counterfactual regret minimization
algorithm, CFR+, in practice.
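Distance-generating functions such as the dilated entropy enter first-order methods through their prox mapping. As a hedged, generic sketch (not the paper's weighted dilated construction; all names are illustrative), here is one entropy prox step on a single probability simplex, the building block that dilated smoothing repeats over the game tree:

```python
import math

def entropy_prox_step(strategy, gradient, stepsize):
    """One mirror-ascent step on the probability simplex with negative
    entropy as the distance-generating function. The closed-form update
    is a multiplicative-weights / softmax rule; assumes the current
    strategy has strictly positive entries."""
    logits = [math.log(p) + stepsize * g for p, g in zip(strategy, gradient)]
    shift = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(l - shift) for l in logits]
    total = sum(weights)
    return [w / total for w in weights]

# A gradient favoring action 0 shifts probability mass toward it
# while the iterate stays on the simplex.
s = entropy_prox_step([1/3, 1/3, 1/3], [1.0, 0.0, 0.0], stepsize=1.0)
print(s)
```

The convergence rates discussed above hinge on how the diameter and strong-convexity constants of such a function scale with the tree; the paper's weighting scheme is what removes the branching-factor dependence.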