Finding Approximate Nash Equilibria of Bimatrix Games via Payoff Queries
We study the deterministic and randomized query complexity of finding approximate equilibria in a k × k bimatrix game. We show that the deterministic query complexity of finding an ε-Nash equilibrium when ε < 1/2 is Ω(k²), even in zero-one constant-sum games. In combination with previous results [Fearnley et al. 2013], this provides a complete characterization of the deterministic query complexity of approximate Nash equilibria. We also study randomized querying algorithms. We give a randomized algorithm for finding a ((3−√5)/2 + ε)-Nash equilibrium using O(k·log k/ε²) payoff queries, which shows that the 1/2 barrier for deterministic algorithms can be broken by randomization. For well-supported Nash equilibria (WSNE), we first give a randomized algorithm for finding an ε-WSNE of a zero-sum bimatrix game using O(k·log k/ε⁴) payoff queries, and we then use this to obtain a randomized algorithm for finding a (2/3 + ε)-WSNE in a general bimatrix game using O(k·log k/ε⁴) payoff queries. Finally, we initiate the study of lower bounds against randomized algorithms in the context of bimatrix games, by showing that randomized algorithms require Ω(k²) payoff queries in order to find an ε-Nash equilibrium with ε < 1/(4k), even in zero-one constant-sum games. In particular, this rules out query-efficient randomized algorithms for finding exact Nash equilibria.
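The object these query bounds concern is the ε-Nash condition itself. As a point of reference, here is a minimal sketch (in Python with NumPy; not from the paper) of checking whether a mixed profile (x, y) is an ε-Nash equilibrium of a bimatrix game (R, C): each player's regret against a best response must be at most ε.

```python
import numpy as np

def nash_regrets(R, C, x, y):
    """Regrets of the mixed profile (x, y) in the bimatrix game (R, C):
    how much each player could gain by deviating to a best response.
    (x, y) is an eps-Nash equilibrium iff both regrets are <= eps."""
    r_row = R @ y                      # row player's payoff per pure strategy
    c_col = x @ C                      # column player's payoff per pure strategy
    row_regret = r_row.max() - x @ R @ y
    col_regret = c_col.max() - x @ C @ y
    return float(row_regret), float(col_regret)

# Matching pennies scaled to zero-one payoffs: uniform play is an exact NE.
R = np.array([[1.0, 0.0], [0.0, 1.0]])
C = 1.0 - R
x = y = np.array([0.5, 0.5])
print(nash_regrets(R, C, x, y))  # (0.0, 0.0): an exact equilibrium
```

A query algorithm, of course, must estimate the entries of R and C it needs before it can evaluate these regrets; the paper's bounds measure how many entry lookups that takes.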
Distributed Methods for Computing Approximate Equilibria
We present a new, distributed method to compute approximate Nash equilibria
in bimatrix games. In contrast to previous approaches that analyze the two
payoff matrices at the same time (for example, by solving a single LP that
combines the two players' payoffs), our algorithm first solves two independent
LPs, each of which is derived from one of the two payoff matrices, and then
computes approximate Nash equilibria using only limited communication between
the players.
Our method has several applications that improve the known bounds for efficient
computation of approximate Nash equilibria in bimatrix games. First, it yields
the best known polynomial-time algorithm for computing \emph{approximate well-supported
Nash equilibria (WSNE)}: it is guaranteed to find a 0.6528-WSNE in polynomial
time. Furthermore, since our algorithm solves the two LPs separately, it can be
used to improve upon the best known algorithms in the limited communication
setting: the algorithm can be implemented to obtain a randomized
expected-polynomial-time algorithm that uses poly-logarithmic communication and
finds a 0.6528-WSNE. The algorithm can also be adapted to beat the best
known bound in the query-complexity setting, requiring fewer payoff
queries than previously known to compute a 0.6528-WSNE. Finally, our approach can also be adapted to
provide the best known communication efficient algorithm for computing
\emph{approximate Nash equilibria}: it uses poly-logarithmic communication to
find a 0.382-approximate Nash equilibrium.
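A building block of this two-LP approach is solving a zero-sum game by linear programming, with one such LP derived from each payoff matrix. The sketch below (hypothetical code using SciPy's `linprog`, not the paper's implementation) shows the standard max-min LP for the row player of a zero-sum game.

```python
import numpy as np
from scipy.optimize import linprog

def maxmin(M):
    """Max-min mixed strategy and value for the row player of the zero-sum
    game with row-player payoff matrix M, via the standard LP:
        maximise v  subject to  (x^T M)_j >= v for every column j,
                                x >= 0, sum(x) = 1.
    """
    k, m = M.shape
    c = np.zeros(k + 1)
    c[-1] = -1.0                                  # linprog minimises, so use -v
    A_ub = np.hstack([-M.T, np.ones((m, 1))])     # v - (x^T M)_j <= 0
    b_ub = np.zeros(m)
    A_eq = np.zeros((1, k + 1))
    A_eq[0, :k] = 1.0                             # x is a probability vector
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * k + [(None, None)])
    return res.x[:k], res.x[-1]

# Matching pennies: the unique max-min strategy is uniform, with value 0.
x, v = maxmin(np.array([[1.0, -1.0], [-1.0, 1.0]]))
```

Because each LP reads only one payoff matrix, the two players can solve their LPs without sharing payoff information, which is what makes the limited-communication implementations described above possible.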
Query Complexity of Approximate Equilibria in Anonymous Games
We study the computation of equilibria of anonymous games, via algorithms
that proceed by a sequence of adaptive queries to the game's payoff
function, which is assumed to be unknown initially. The general topic we consider is
\emph{query complexity}, that is, how many queries are necessary or sufficient
to compute an exact or approximate Nash equilibrium.
We show that exact equilibria cannot be found via query-efficient algorithms.
We also give an example of a 2-strategy, 3-player anonymous game that does not
have any exact Nash equilibrium in rational numbers. However, more positive
query-complexity bounds are attainable if either further symmetries of the
utility functions are assumed or we focus on approximate equilibria. We
investigate four sub-classes of anonymous games previously considered by
\cite{bfh09, dp14}.
Our main result is a new randomized query-efficient algorithm that finds an
ε-approximate Nash equilibrium, for ε inverse-polynomial in the number of
players, querying polynomially many payoffs and running in polynomial time. This improves on the running
time of pre-existing algorithms for approximate equilibria of anonymous games,
and is the first one to obtain an inverse polynomial approximation in
poly-time. We also show how this can be utilized as an efficient
polynomial-time approximation scheme (PTAS). Furthermore, we prove a lower
bound on the number of payoffs that must be queried in order to find any
ε-well-supported Nash equilibrium, even by randomized algorithms.
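The structure these query algorithms exploit can be made concrete with a toy sketch (a hypothetical utility function in Python, not one of the paper's sub-classes): in an anonymous game, a player's payoff depends only on her own strategy and on how many of the other players chose each strategy, not on who chose it.

```python
# Toy illustration of anonymity (hypothetical utility, not from the paper):
# n players, two strategies {0, 1}; a player's payoff depends only on her own
# strategy s and on how many of the OTHER players chose strategy 1.

def payoff(s, t, n):
    """Congestion-style utility: prefer the strategy chosen by fewer others.
    s: own strategy; t: number of the other n-1 players playing 1."""
    return (n - 1 - t) / (n - 1) if s == 1 else t / (n - 1)

def query(profile):
    """One payoff query: every player's payoff under the given pure profile."""
    n, ones = len(profile), sum(profile)
    return [payoff(s, ones - s, n) for s in profile]

# Anonymity: permuting the OTHER players' strategies leaves a payoff unchanged.
print(query((1, 0, 0, 1)))
```

This symmetry is what makes sublinear descriptions of the payoff function possible and is the starting point for the query-efficient algorithms above.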
Logarithmic Query Complexity for Approximate Nash Computation in Large Games
We investigate the problem of equilibrium computation for “large” n-player games. Large
games have a Lipschitz-type property that no single player’s utility is greatly affected by any
other individual player’s actions. In this paper, we mostly focus on the case where any change of
strategy by a player causes other players’ payoffs to change by at most 1/n. We study algorithms
having query access to the game’s payoff function, aiming to find ε-Nash equilibria. We seek
algorithms that obtain ε as small as possible, in time polynomial in n.
Our main result is a randomised algorithm that achieves ε approaching 1/8 for 2-strategy games
in a completely uncoupled setting, where each player observes her own payoff to a query, and
adjusts her behaviour independently of other players’ payoffs/actions. O(log n) rounds/queries
are required. We also show how to obtain a slight improvement over 1/8, by introducing a small
amount of communication between the players.
Finally, we give extensions of our results to large games with more than two strategies per
player, and to alternative largeness parameters.
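To make the completely uncoupled query model concrete, here is a toy Python sketch (illustrative only, not the paper's algorithm): each player learns only her own payoff from a query and updates her strategy without seeing anyone else's payoffs or actions. The hypothetical utility below changes by exactly 1/(n−1) when a single opponent switches, in the spirit of the largeness condition.

```python
import random

def own_payoff(i, profile):
    """Coordination-style large game: player i's payoff is the fraction of the
    other players matching her strategy, so a single opponent's switch moves
    it by exactly 1/(n-1)."""
    n = len(profile)
    return sum(1 for j in range(n)
               if j != i and profile[j] == profile[i]) / (n - 1)

def uncoupled_round(profile, rng):
    """One round: every player queries ONLY her own payoff and, if unhappy
    (payoff below 1/2), flips a fair coin to switch strategies."""
    return [s if own_payoff(i, profile) >= 0.5 or rng.random() < 0.5 else 1 - s
            for i, s in enumerate(profile)]

rng = random.Random(0)
profile = [rng.randint(0, 1) for _ in range(20)]
for _ in range(50):
    profile = uncoupled_round(profile, rng)
```

The paper's algorithm is considerably more careful (and needs only O(log n) rounds); the point of the sketch is just the information constraint: `uncoupled_round` never lets a player's update depend on another player's payoff.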
Model-Based Multi-Agent RL in Zero-Sum Markov Games with Near-Optimal Sample Complexity
Model-based reinforcement learning (RL), which finds an optimal policy using
an empirical model, has long been recognized as one of the cornerstones of RL.
It is especially suitable for multi-agent RL (MARL), as it naturally decouples
the learning and the planning phases, and avoids the non-stationarity problem
when all agents are improving their policies simultaneously using samples.
Though intuitive, easy-to-implement, and widely-used, the sample complexity of
model-based MARL algorithms has not been fully investigated. In this paper, our
goal is to address the fundamental question about its sample complexity. We
study arguably the most basic MARL setting: two-player discounted zero-sum
Markov games, given only access to a generative model. We show that model-based
MARL achieves a sample complexity of Õ(|S||A||B|(1−γ)⁻³ε⁻²) for finding the Nash equilibrium (NE)
value up to some ε error, and the ε-NE policies with a smooth
planning oracle, where γ is the discount factor, and S, A, B denote the
state space and the action spaces for the two agents. We further show that
such a sample bound is minimax-optimal (up to logarithmic factors) if the
algorithm is reward-agnostic, where the algorithm queries state transition
samples without reward knowledge, by establishing a matching lower bound. This
is in contrast to the usual reward-aware setting, with a
Ω̃(|S|(|A|+|B|)(1−γ)⁻³ε⁻²) lower bound, in which
this model-based approach is near-optimal with only a gap on the |A|, |B|
dependence. Our results not only demonstrate the sample-efficiency of this
basic model-based approach in MARL, but also elaborate on the fundamental
tradeoff between its power (easily handling the more challenging
reward-agnostic case) and its limitation (being less adaptive and suboptimal in the
|A|, |B| dependence), which particularly arises in the multi-agent context.
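The model-based recipe discussed here can be sketched end-to-end (a hypothetical Python illustration under simplifying assumptions, not the authors' implementation): query the generative model to build an empirical transition tensor, then plan in that empirical model with Shapley-style value iteration, solving the zero-sum stage game at each state by linear programming.

```python
import numpy as np
from scipy.optimize import linprog

def stage_game_value(M):
    """Value of the zero-sum matrix game M (row player maximises), via LP."""
    a, b = M.shape
    c = np.zeros(a + 1)
    c[-1] = -1.0                               # linprog minimises, so use -v
    A_ub = np.hstack([-M.T, np.ones((b, 1))])  # v <= (x^T M)_j for all columns j
    A_eq = np.zeros((1, a + 1))
    A_eq[0, :a] = 1.0                          # x is a probability vector
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(b), A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * a + [(None, None)])
    return res.x[-1]

def empirical_model(sampler, S, A, B, n):
    """n generative-model queries per (s, a, b) -> empirical transition tensor.
    This step uses no reward knowledge, matching the reward-agnostic setting."""
    P = np.zeros((S, A, B, S))
    for s in range(S):
        for a in range(A):
            for b in range(B):
                for _ in range(n):
                    P[s, a, b, sampler(s, a, b)] += 1.0
    return P / n

def nash_value_iteration(P, R, gamma, iters=100):
    """Plan in the (empirical) model: at every state, solve the zero-sum
    stage game R[s] + gamma * E_P[V] and take its value as the new V(s)."""
    S = P.shape[0]
    V = np.zeros(S)
    for _ in range(iters):
        V = np.array([stage_game_value(R[s] + gamma * P[s] @ V)
                      for s in range(S)])
    return V

# Single absorbing state playing matching pennies forever: the NE value is 0.
R = np.array([[[1.0, -1.0], [-1.0, 1.0]]])        # shape (S, A, B), S = 1
P_hat = empirical_model(lambda s, a, b: 0, S=1, A=2, B=2, n=5)
V_hat = nash_value_iteration(P_hat, R, gamma=0.9)
```

The decoupling the abstract emphasizes is visible here: `empirical_model` does all the sampling, and `nash_value_iteration` is pure planning on the frozen estimate, so no non-stationarity arises while the two agents' policies are being computed.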