2,255 research outputs found
Decision Problems for Nash Equilibria in Stochastic Games
We analyse the computational complexity of finding Nash equilibria in
stochastic multiplayer games with ω-regular objectives. While the
existence of an equilibrium whose payoff falls into a certain interval may be
undecidable, we single out several decidable restrictions of the problem.
First, restricting the search space to stationary, or pure stationary,
equilibria results in problems that are typically contained in PSPACE and NP,
respectively. Second, we show that the existence of an equilibrium with a
binary payoff (i.e. an equilibrium where each player either wins or loses with
probability 1) is decidable. We also establish that the existence of a Nash
equilibrium with a certain binary payoff entails the existence of an
equilibrium with the same payoff in pure, finite-state strategies.
Comment: 22 pages, revised versio
A Lyapunov Optimization Approach to Repeated Stochastic Games
This paper considers a time-varying game with N players. Every time slot,
players observe their own random events and then take a control action. The
events and control actions affect the individual utilities earned by each
player. The goal is to maximize a concave function of time average utilities
subject to equilibrium constraints. Specifically, participating players are
provided access to a common source of randomness from which they can optimally
correlate their decisions. The equilibrium constraints incentivize
participation by ensuring that players cannot earn more utility if they choose
not to participate. This form of equilibrium is similar to the notions of Nash
equilibrium and correlated equilibrium, but is simpler to attain. A Lyapunov
method is developed that solves the problem in an online max-weight
fashion by selecting actions based on a set of time-varying weights. The
algorithm does not require knowledge of the event probabilities and has
polynomial convergence time. A similar method can be used to compute a standard
correlated equilibrium, albeit with increased complexity.
Comment: 13 pages, this version fixes an incorrect statement of the previous
arxiv version (see footnote 1, page 5 in current version for the correction
BL-WoLF: A Framework For Loss-Bounded Learnability In Zero-Sum Games
We present BL-WoLF, a framework for learnability in repeated zero-sum games
where the cost of learning is measured by the losses the learning agent accrues
(rather than the number of rounds). The game is adversarially chosen from some
family that the learner knows. The opponent knows the game and the learner's
learning strategy. The learner tries to either not accrue losses, or to quickly
learn about the game so as to avoid future losses (this is consistent with the
Win or Learn Fast (WoLF) principle; BL stands for "bounded loss"). Our
framework allows for both probabilistic and approximate learning. The resultant
notion of BL-WoLF-learnability can be applied to any class of games, and
allows us to measure the inherent disadvantage to a player that does not know
which game in the class it is in. We present guaranteed
BL-WoLF-learnability results for families of games with deterministic payoffs
and families of games with stochastic payoffs. We demonstrate that these
families are guaranteed approximately BL-WoLF-learnable with lower cost.
We then demonstrate families of games (both stochastic and deterministic) that
are not guaranteed BL-WoLF-learnable. We show that those families,
nevertheless, are BL-WoLF-learnable. To prove these results, we use a key
lemma which we derive