Markov Perfect Industry Dynamics with Many Firms
We propose an approximation method for analyzing Ericson and Pakes (1995)-style dynamic models of imperfect competition. We develop a simple algorithm for computing an "oblivious equilibrium," in which each firm is assumed to make decisions based only on its own state and knowledge of the long run average industry state, but where firms ignore current information about competitors' states. We prove that, as the market becomes large, if the equilibrium distribution of firm states obeys a certain "light-tail" condition, then oblivious equilibria closely approximate Markov perfect equilibria. We develop bounds that can be computed to assess the accuracy of the approximation for any given applied problem. Through computational experiments, we find that the method often generates useful approximations for industries with hundreds of firms and in some cases even tens of firms.
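The oblivious-equilibrium idea lends itself to a simple fixed-point computation: solve a single-agent dynamic program against a fixed long-run average industry state, then update that average to be consistent with the resulting behavior. A minimal toy sketch follows; the profit function, state space, investment cost, and transition dynamics are illustrative assumptions, not the model from the paper.

```python
import numpy as np

# Toy oblivious-equilibrium fixed point (illustrative model, not the paper's):
# a firm's per-period profit depends only on its own state s and the long-run
# average industry state; investment x succeeds with probability x.

STATES = np.arange(10)   # firm quality levels 0..9 (assumed)
BETA = 0.95              # discount factor (assumed)

def best_response_policy(avg_state):
    """Value iteration against a fixed long-run average industry state."""
    V = np.zeros(len(STATES))
    policy = np.zeros(len(STATES))
    for _ in range(300):
        Vn = np.empty_like(V)
        for s in STATES:
            profit = s / (1.0 + avg_state)        # toy profit: crowding effect
            best = -np.inf
            for x in np.linspace(0, 1, 21):       # investment grid
                up, down = min(s + 1, STATES[-1]), max(s - 1, 0)
                val = profit - 0.5 * x**2 + BETA * (x * V[up] + (1 - x) * V[down])
                if val > best:
                    best, policy[s] = val, x
            Vn[s] = best
        V = Vn
    return policy

def stationary_avg(policy, iters=2000):
    """Simulate one firm under the policy to estimate its long-run mean state."""
    rng = np.random.default_rng(0)
    s, total = 5, 0.0
    for _ in range(iters):
        s = min(s + 1, STATES[-1]) if rng.random() < policy[s] else max(s - 1, 0)
        total += s
    return total / iters

avg = 5.0
for _ in range(15):      # damped fixed-point iteration on the average state
    avg = 0.5 * avg + 0.5 * stationary_avg(best_response_policy(avg))
print(round(avg, 2))     # self-consistent long-run average industry state
```

The damping in the outer loop is a standard stabilization choice for such fixed-point iterations; the paper's accuracy bounds would then be evaluated at the resulting equilibrium.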
Mean Field Equilibrium in Dynamic Games with Complementarities
We study a class of stochastic dynamic games that exhibit strategic
complementarities between players; formally, in the games we consider, the
payoff of a player has increasing differences between her own state and the
empirical distribution of the states of other players. Such games can be used
to model a diverse set of applications, including network security models,
recommender systems, and dynamic search in markets. Stochastic games are
generally difficult to analyze, and these difficulties are only exacerbated
when the number of players is large (as might be the case in the preceding
examples).
We consider an approximation methodology called mean field equilibrium to
study these games. In such an equilibrium, each player reacts to only the long
run average state of other players. We find sufficient conditions for the
existence of a mean field equilibrium in such games. Furthermore, as a simple
consequence of this existence theorem, we obtain several natural monotonicity
properties. We show that there exist a "largest" and a "smallest" equilibrium
among all those where the equilibrium strategy used by a player is
nondecreasing, and we also show that players converge to each of these
equilibria via natural myopic learning dynamics; as we argue, these dynamics
are more reasonable than the standard best response dynamics. We also provide
sensitivity results, where we quantify how the equilibria of such games move in
response to changes in parameters of the game (e.g., the introduction of
incentives to players).

Comment: 56 pages, 5 figures
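The extremal-equilibria result has a simple Tarski-style intuition: under strategic complementarities, best-response iteration started from the smallest (resp. largest) action converges monotonically downward (resp. upward) to an extremal equilibrium. A toy one-shot symmetric illustration follows; the payoff function and action grid are assumptions for illustration, not the paper's dynamic model.

```python
# Toy game with strategic complementarities (illustrative, not from the paper):
# n symmetric players choose effort a in {0,...,K}; payoff
#   u(a, abar) = a * abar - C * a**2
# has increasing differences in (a, abar), so best responses are monotone
# in the average action abar of the other players.

K = 10     # action grid 0..K (assumed)
C = 0.6    # effort cost coefficient (assumed)

def best_response(abar):
    """Best action against the others' average action abar."""
    return max(range(K + 1), key=lambda a: a * abar - C * a * a)

def iterate_from(a0, steps=50):
    """Myopic best-response dynamics in the symmetric game (abar = a)."""
    a = a0
    for _ in range(steps):
        a = best_response(a)
    return a

smallest = iterate_from(0)   # start from the smallest action profile
largest = iterate_from(K)    # start from the largest action profile
print(smallest, largest)     # two extremal symmetric equilibria
```

Starting from 0 the dynamics stay at the low-effort equilibrium, while starting from K they descend monotonically to a distinct high-effort equilibrium, mirroring the "largest" and "smallest" equilibria and the myopic learning dynamics described in the abstract.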
The Computational Power of Optimization in Online Learning
We consider the fundamental problem of prediction with expert advice where
the experts are "optimizable": there is a black-box optimization oracle that
can be used to compute, in constant time, the leading expert in retrospect at
any point in time. In this setting, we give a novel online algorithm that
attains vanishing regret with respect to N experts in total Õ(√N)
computation time. We also give a lower bound showing that this running time
cannot be improved (up to log factors) in the oracle model, thereby exhibiting
a quadratic speedup as compared to the standard, oracle-free setting where the
required time for vanishing regret is Θ̃(N). These results demonstrate an
exponential gap between the power of optimization in online learning and its
power in statistical learning: in the latter, an optimization oracle, i.e., an
efficient empirical risk minimizer, allows one to learn a finite hypothesis
class of size N in time O(log N). We also study the implications of our
results for learning in repeated zero-sum games, in a setting where the
players have access to oracles that compute, in constant time, their best
response to any mixed strategy of their opponent. We show that the runtime
required for approximating the minimax value of the game in this setting is
Θ̃(√N), yielding again a quadratic improvement upon the oracle-free setting,
where Θ̃(N) is known to be tight.
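To make the oracle model concrete, here is a sketch of a classical oracle-based no-regret method, Follow-the-Perturbed-Leader (Kalai-Vempala); this is not the paper's algorithm (which achieves Õ(√N) total time via a more involved construction), and the oracle below is simulated in O(N) time, whereas the abstract's setting assumes a constant-time oracle. The adversary, horizon, and perturbation scale are illustrative choices.

```python
import random

N, T = 20, 500
rng = random.Random(0)

def leader(cum_losses):
    """The optimization oracle: index of the best expert in hindsight.
    Simulated here by a linear scan; the paper assumes O(1) access."""
    return min(range(N), key=lambda i: cum_losses[i])

cum = [0.0] * N          # cumulative losses per expert
alg_loss = 0.0
eta = T ** 0.5           # perturbation scale ~ sqrt(T) for losses in [0, 1]

for t in range(T):
    # Follow-the-Perturbed-Leader: call the oracle on perturbed cumulative
    # losses, then suffer the chosen expert's loss.
    perturbed = [cum[i] - eta * rng.random() for i in range(N)]
    i_t = leader(perturbed)
    losses = [rng.random() for _ in range(N)]    # stand-in adversary
    alg_loss += losses[i_t]
    cum = [cum[i] + losses[i] for i in range(N)]

regret = alg_loss - min(cum)   # regret against the best fixed expert
print(round(regret, 1))
```

The point of the abstract is precisely that, with a genuinely constant-time oracle, far less than the Θ̃(N) per-round work of generic methods suffices.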
Dynamic Multi-Arm Bandit Game Based Multi-Agents Spectrum Sharing Strategy Design
For a wireless avionics communication system, a Multi-Arm Bandit (MAB) game
is mathematically formulated, including channel states, strategies, and
rewards. The simple case of only two agents sharing the spectrum is fully
studied in terms of maximizing the cumulative reward over a finite time
horizon. An Upper Confidence Bound (UCB) algorithm is used to obtain optimal
solutions for the stochastic MAB problem; the problem can also be solved from
the perspective of the Markov game framework. Thompson Sampling (TS) is used
as a benchmark to evaluate the performance of the proposed approach.
Numerical results are provided on minimizing the expected regret and choosing
the best parameter for the upper confidence bound.
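The two baselines named in the abstract, UCB1 and Thompson Sampling, can be sketched on stochastic Bernoulli arms; the channel success rates and horizon below are illustrative assumptions, not values from the paper.

```python
import math
import random

MEANS = [0.3, 0.5, 0.7]   # hypothetical per-channel success probabilities
T = 5000                   # horizon (assumed)
rng = random.Random(1)

def ucb1():
    """UCB1: play each arm once, then pick the arm with the highest
    empirical mean plus a sqrt(2 ln t / n_i) confidence bonus."""
    counts = [0] * len(MEANS)
    sums = [0.0] * len(MEANS)
    reward = 0.0
    for t in range(1, T + 1):
        if t <= len(MEANS):
            a = t - 1
        else:
            a = max(range(len(MEANS)),
                    key=lambda i: sums[i] / counts[i]
                    + math.sqrt(2 * math.log(t) / counts[i]))
        r = 1.0 if rng.random() < MEANS[a] else 0.0
        counts[a] += 1; sums[a] += r; reward += r
    return reward

def thompson():
    """Thompson Sampling with Beta(1,1) priors on Bernoulli arms:
    sample a mean for each arm and play the argmax."""
    wins = [1.0] * len(MEANS)
    fails = [1.0] * len(MEANS)
    reward = 0.0
    for _ in range(T):
        a = max(range(len(MEANS)),
                key=lambda i: rng.betavariate(wins[i], fails[i]))
        r = 1.0 if rng.random() < MEANS[a] else 0.0
        wins[a] += r; fails[a] += 1.0 - r; reward += r
    return reward

best = max(MEANS) * T                  # expected reward of the best channel
regret_ucb = best - ucb1()
regret_ts = best - thompson()
print(round(regret_ucb, 1), round(regret_ts, 1))
```

Both regrets grow only logarithmically in the horizon for fixed reward gaps, which is why TS serves as a natural benchmark for tuning the UCB exploration parameter.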