Mean Field Equilibrium in Dynamic Games with Complementarities
We study a class of stochastic dynamic games that exhibit strategic
complementarities between players; formally, in the games we consider, the
payoff of a player has increasing differences between her own state and the
empirical distribution of the states of other players. Such games can be used
to model a diverse set of applications, including network security models,
recommender systems, and dynamic search in markets. Stochastic games are
generally difficult to analyze, and these difficulties are only exacerbated
when the number of players is large (as might be the case in the preceding
examples).
We consider an approximation methodology called mean field equilibrium to
study these games. In such an equilibrium, each player reacts to only the long
run average state of other players. We find necessary conditions for the
existence of a mean field equilibrium in such games. Furthermore, as a simple
consequence of this existence theorem, we obtain several natural monotonicity
properties. We show that there exist a "largest" and a "smallest" equilibrium
among all those where the equilibrium strategy used by a player is
nondecreasing, and we also show that players converge to each of these
equilibria via natural myopic learning dynamics; as we argue, these dynamics
are more reasonable than the standard best response dynamics. We also provide
sensitivity results, where we quantify how the equilibria of such games move in
response to changes in parameters of the game (e.g., the introduction of
incentives to players). Comment: 56 pages, 5 figures
Tropically convex constraint satisfaction
A semilinear relation S is max-closed if it is preserved by taking the
componentwise maximum. The constraint satisfaction problem for max-closed
semilinear constraints is at least as hard as determining the winner in Mean
Payoff Games, a notorious problem of open computational complexity. Mean Payoff
Games are known to be in the intersection of NP and co-NP, which is not known
for max-closed semilinear constraints. Semilinear relations that are max-closed
and additionally closed under translations have been called tropically convex
in the literature. One of our main results is a new duality for open tropically
convex relations, which puts the CSP for tropically convex semilinear
constraints in general into NP intersected co-NP. This extends the
corresponding complexity result for scheduling under and-or precedence
constraints, or equivalently the max-atoms problem. To this end, we present a
characterization of max-closed semilinear relations in terms of syntactically
restricted first-order logic, and another characterization in terms of a finite
set of relations L that allow primitive positive definitions of all other
relations in the class. We also present a subclass of max-closed constraints
where the CSP is in P; this class generalizes the class of max-closed
constraints over finite domains, and the feasibility problem for max-closed
linear inequalities. Finally, we show that the class of max-closed semilinear
constraints is maximal in the sense that as soon as a single relation that is
not max-closed is added to L, the CSP becomes NP-hard. Comment: 29 pages, 2 figures
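As a small illustration of the max-closure condition defined in this abstract, the sketch below tests whether a relation is preserved by the componentwise maximum. It is an assumption-laden toy: the paper works with semilinear relations over an infinite domain, whereas this check only handles a finite set of tuples, and the names `componentwise_max`, `is_max_closed`, `leq`, and `neq` are hypothetical and do not come from the paper.

```python
from itertools import product


def componentwise_max(u, v):
    """Return the tuple whose i-th entry is max(u[i], v[i])."""
    return tuple(max(a, b) for a, b in zip(u, v))


def is_max_closed(relation):
    """Check max-closure of a FINITE relation given as a set of equal-length tuples.

    The relation is max-closed if, for every pair of tuples in it,
    their componentwise maximum is again in the relation.
    """
    rel = set(relation)
    return all(componentwise_max(u, v) in rel for u, v in product(rel, repeat=2))


# Toy relations over the finite domain {0, 1, 2} (illustrative only).
leq = {(x, y) for x in range(3) for y in range(3) if x <= y}  # x <= y
neq = {(x, y) for x in range(3) for y in range(3) if x != y}  # x != y

print(is_max_closed(leq))  # x <= y is preserved by componentwise max
print(is_max_closed(neq))  # (0, 1) and (1, 0) give (1, 1), which violates x != y
```

The relation `x <= y` passes because max(x1, x2) <= max(y1, y2) whenever x1 <= y1 and x2 <= y2, while `x != y` fails on the pair (0, 1), (1, 0).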
Tropical polyhedra are equivalent to mean payoff games
We show that several decision problems originating from max-plus or tropical
convexity are equivalent to zero-sum two player game problems. In particular,
we set up an equivalence between the external representation of tropical convex
sets and zero-sum stochastic games, in which tropical polyhedra correspond to
deterministic games with finite action spaces. Then, we show that the winning
initial positions can be determined from the associated tropical polyhedron. We
obtain as a corollary a game theoretical proof of the fact that the tropical
rank of a matrix, defined as the maximal size of a submatrix for which the
optimal assignment problem has a unique solution, coincides with the maximal
number of rows (or columns) of the matrix which are linearly independent in the
tropical sense. Our proofs rely on techniques from non-linear Perron-Frobenius
theory. Comment: 28 pages, 5 figures; v2: updated references, added background
materials and illustrations; v3: minor improvements, references updated
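The definition of tropical rank quoted in this abstract (the largest square submatrix whose optimal assignment problem has a unique solution) can be made concrete with a brute-force sketch. This is not the game-theoretic method of the paper: it enumerates all permutations and all submatrices, so it is only usable on tiny matrices. It also assumes the max-plus convention (assignments are maximized) and integer entries, and the function names are hypothetical.

```python
from itertools import combinations, permutations


def assignment_is_unique(m):
    """True if exactly one permutation attains the maximum assignment weight.

    m is a square matrix given as a list of rows with integer entries.
    """
    n = len(m)
    weights = [sum(m[i][p[i]] for i in range(n)) for p in permutations(range(n))]
    return weights.count(max(weights)) == 1


def tropical_rank(a):
    """Largest k such that some k-by-k submatrix has a unique optimal assignment.

    Exhaustive search over row and column subsets; exponential cost.
    """
    rows, cols = len(a), len(a[0])
    for k in range(min(rows, cols), 0, -1):
        for r in combinations(range(rows), k):
            for c in combinations(range(cols), k):
                sub = [[a[i][j] for j in c] for i in r]
                if assignment_is_unique(sub):
                    return k
    return 0


print(tropical_rank([[0, 0], [0, 0]]))  # both 2x2 assignments tie
print(tropical_rank([[1, 0], [0, 1]]))  # the identity assignment is strictly best
```

In the first matrix the two assignments of the full 2x2 block have equal weight, so only 1x1 submatrices qualify; in the second the diagonal assignment strictly dominates.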
Policy iteration for perfect information stochastic mean payoff games with bounded first return times is strongly polynomial
Recent results of Ye and Hansen, Miltersen and Zwick show that policy
iteration for one or two player (perfect information) zero-sum stochastic
games, restricted to instances with a fixed discount rate, is strongly
polynomial. We show that policy iteration for mean-payoff zero-sum stochastic
games is also strongly polynomial when restricted to instances with bounded
first mean return time to a given state. The proof is based on methods of
nonlinear Perron-Frobenius theory, allowing us to reduce the mean-payoff
problem to a discounted problem with state dependent discount rate. Our
analysis also shows that policy iteration remains strongly polynomial for
discounted problems in which the discount rate can be state dependent (and even
negative) at certain states, provided that the spectral radii of the
nonnegative matrices associated to all strategies are bounded from above by a
fixed constant strictly less than 1. Comment: 17 pages
Solving generic nonarchimedean semidefinite programs using stochastic game algorithms
A general issue in computational optimization is to develop combinatorial
algorithms for semidefinite programming. We address this issue when the base
field is nonarchimedean. We provide a solution for a class of semidefinite
feasibility problems given by generic matrices. Our approach is based on
tropical geometry. It relies on tropical spectrahedra, which are defined as the
images by the valuation of nonarchimedean spectrahedra. We establish a
correspondence between generic tropical spectrahedra and zero-sum stochastic
games with perfect information. The latter have been well studied in
algorithmic game theory. This allows us to solve nonarchimedean semidefinite
feasibility problems using algorithms for stochastic games. These algorithms
are of a combinatorial nature and work for large instances. Comment: v1: 25 pages, 4 figures; v2: 27 pages, 4 figures, minor revisions +
benchmarks added; v3: 30 pages, 6 figures, generalization to non-Metzler sign
patterns + some results have been replaced by references to the companion
work arXiv:1610.0674
A Unified View of Large-scale Zero-sum Equilibrium Computation
The task of computing approximate Nash equilibria in large zero-sum
extensive-form games has received a tremendous amount of attention due mainly
to the Annual Computer Poker Competition. Immediately after its inception, two
competing and seemingly different approaches emerged---one an application of
no-regret online learning, the other a sophisticated gradient method applied to
a convex-concave saddle-point formulation. Since then, both approaches have
grown in relative isolation with advancements on one side not affecting the
other. In this paper, we rectify this by dissecting and, in a sense, unifying the
two views. Comment: AAAI Workshop on Computer Poker and Imperfect Information
Distributed stochastic optimization via matrix exponential learning
In this paper, we investigate a distributed learning scheme for a broad class
of stochastic optimization problems and games that arise in signal processing
and wireless communications. The proposed algorithm relies on the method of
matrix exponential learning (MXL) and only requires locally computable gradient
observations that are possibly imperfect and/or obsolete. To analyze it, we
introduce the notion of a stable Nash equilibrium and we show that the
algorithm is globally convergent to such equilibria - or locally convergent
when an equilibrium is only locally stable. We also derive an explicit linear
bound for the algorithm's convergence speed, which remains valid under
measurement errors and uncertainty of arbitrarily high variance. To validate
our theoretical analysis, we test the algorithm in realistic
multi-carrier/multiple-antenna wireless scenarios where several users seek to
maximize their energy efficiency. Our results show that learning allows users
to attain a net increase between 100% and 500% in energy efficiency, even under
very high uncertainty. Comment: 31 pages, 3 figures