1,041 research outputs found
Joint strategy fictitious play with inertia for potential games
We consider multi-player repeated games involving a large number of players with large strategy spaces and enmeshed utility structures. In these "large-scale" games, players are inherently faced with limitations in both their observational and computational capabilities. Accordingly, players in large-scale games need to make their decisions using algorithms that accommodate limitations in information gathering and processing. This disqualifies some of the well-known decision-making models such as "Fictitious Play" (FP), in which each player must monitor the individual actions of every other player and must optimize over a high-dimensional probability space. We will show that Joint Strategy Fictitious Play (JSFP), a close variant of FP, alleviates both the informational and computational burden of FP. Furthermore, we introduce JSFP with inertia, i.e., a probabilistic reluctance to change strategies, and establish the convergence to a pure Nash equilibrium in all generalized ordinal potential games in both cases of averaged or exponentially discounted historical data. We illustrate JSFP with inertia on the specific class of congestion games, a subset of generalized ordinal potential games. In particular, we illustrate the main results on a distributed traffic routing problem and derive tolling procedures that can lead to optimized total traffic congestion.
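As a rough illustration of the idea in this abstract, the following is a minimal sketch of JSFP with inertia on a toy parallel-road congestion game. It is not the authors' code: the function names `jsfp_with_inertia` and `is_pure_nash`, the `inertia` parameter, the linear road costs, and the averaged (not discounted) history are all assumptions made for the example. Each player keeps, for every road, a running average of the utility it would have earned on that road against the others' actual choices, and switches to a best road only when its current road is no longer best and an inertia coin flip allows it.

```python
import random


def jsfp_with_inertia(n_players, costs, steps=500, inertia=0.3, seed=0):
    """Hypothetical sketch. costs[r](k) is the cost of road r with k users."""
    rng = random.Random(seed)
    n_roads = len(costs)
    actions = [rng.randrange(n_roads) for _ in range(n_players)]
    # avg[i][r]: average utility player i would have earned by always using
    # road r, given what the other players actually did in the past.
    avg = [[0.0] * n_roads for _ in range(n_players)]
    for t in range(1, steps + 1):
        load = [actions.count(r) for r in range(n_roads)]
        for i in range(n_players):
            for r in range(n_roads):
                # Load on road r if player i were on it, others unchanged.
                k = load[r] + (0 if actions[i] == r else 1)
                utility = -costs[r](k)
                avg[i][r] += (utility - avg[i][r]) / t  # running mean
        new_actions = []
        for i in range(n_players):
            best = max(avg[i])
            # Stay if the current road is already a best response, or,
            # with probability `inertia`, stay even when it is not.
            if avg[i][actions[i]] >= best or rng.random() < inertia:
                new_actions.append(actions[i])
            else:
                new_actions.append(avg[i].index(best))
        actions = new_actions
    return actions


def is_pure_nash(actions, costs):
    """True if no player can lower its cost by unilaterally switching."""
    n_roads = len(costs)
    load = [actions.count(r) for r in range(n_roads)]
    for a in actions:
        current = costs[a](load[a])
        for r in range(n_roads):
            if r != a and costs[r](load[r] + 1) < current:
                return False
    return True


if __name__ == "__main__":
    roads = [lambda k: k, lambda k: k]  # two identical roads, linear cost
    final = jsfp_with_inertia(4, roads)
    print(final, is_pure_nash(final, roads))
```

Note that each player only needs the per-road load it would have faced, not the individual actions of every other player, which is the informational saving the abstract refers to. The helper `is_pure_nash` can be used to check whether a given run has settled at an equilibrium; a finite run is not guaranteed to have done so.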
Payoff Performance of Fictitious Play
We investigate how well continuous-time fictitious play in two-player games
performs in terms of average payoff, particularly compared to Nash equilibrium
payoff. We show that in many games, fictitious play outperforms Nash
equilibrium on average or even at all times, and moreover that any game is
linearly equivalent to one in which this is the case. Conversely, we provide
conditions under which Nash equilibrium payoff dominates fictitious play
payoff. A key step in our analysis is to show that fictitious play dynamics
asymptotically converges to the set of coarse correlated equilibria (a fact which
is implicit in the literature). Comment: 16 pages, 4 figures
No-regret Dynamics and Fictitious Play
Potential based no-regret dynamics are shown to be related to fictitious
play. Roughly, these are epsilon-best reply dynamics where epsilon is the
maximal regret, which vanishes with time. This allows for alternative and
sometimes much shorter proofs of known results on convergence of no-regret
dynamics to the set of Nash equilibria.
Learning Equilibria with Partial Information in Decentralized Wireless Networks
In this article, a survey of several important equilibrium concepts for
decentralized networks is presented. The term decentralized is used here to
refer to scenarios where decisions (e.g., choosing a power allocation policy)
are taken autonomously by devices interacting with each other (e.g., through
mutual interference). The iterative long-term interaction is characterized by
stable points of the wireless network called equilibria. The interest in these
equilibria stems from the relevance of network stability and the fact that they
can be achieved by letting radio devices repeatedly interact over time. To
achieve these equilibria, several learning techniques, namely, the best
response dynamics, fictitious play, smoothed fictitious play, reinforcement
learning algorithms, and regret matching, are discussed in terms of information
requirements and convergence properties. Most of the notions introduced here,
for both equilibria and learning schemes, are illustrated by a simple case
study, namely, an interference channel with two transmitter-receiver pairs. Comment: 16 pages, 5 figures, 1 table. To appear in IEEE Communications
Magazine, special issue on Game Theory
A Unified View of Large-scale Zero-sum Equilibrium Computation
The task of computing approximate Nash equilibria in large zero-sum
extensive-form games has received a tremendous amount of attention due mainly
to the Annual Computer Poker Competition. Immediately after its inception, two
competing and seemingly different approaches emerged---one an application of
no-regret online learning, the other a sophisticated gradient method applied to
a convex-concave saddle-point formulation. Since then, both approaches have
grown in relative isolation with advancements on one side not affecting the
other. In this paper, we rectify this by dissecting and, in a sense, unifying the
two views. Comment: AAAI Workshop on Computer Poker and Imperfect Information