On the robustness of learning in games with stochastically perturbed payoff observations
Motivated by the scarcity of accurate payoff feedback in practical
applications of game theory, we examine a class of learning dynamics where
players adjust their choices based on past payoff observations that are subject
to noise and random disturbances. First, in the single-player case
(corresponding to an agent trying to adapt to an arbitrarily changing
environment), we show that the stochastic dynamics under study lead to no
regret almost surely, irrespective of the noise level in the player's
observations. In the multi-player case, we find that dominated strategies
become extinct and we show that strict Nash equilibria are stochastically
stable and attracting; conversely, if a state is stable or attracting with
positive probability, then it is a Nash equilibrium. Finally, we provide an
averaging principle for 2-player games, and we show that in zero-sum games with
an interior equilibrium, time averages converge to Nash equilibrium for any
noise level.
Comment: 36 pages, 4 figures
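As a concrete illustration of the last claim, here is a minimal sketch, assuming exponential-weights (multiplicative-weights) learning rather than the paper's exact dynamics: both players in Matching Pennies receive Gaussian-perturbed payoff vectors, yet the time average of play still approaches the interior Nash equilibrium (1/2, 1/2). The step size eta, noise level sigma, and horizon T are hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[1.0, -1.0], [-1.0, 1.0]])  # Matching Pennies, row player's payoffs
eta, sigma, T = 0.05, 1.0, 20000          # step size, noise level, horizon (hypothetical)

sx, sy = np.zeros(2), np.zeros(2)         # cumulative noisy payoff scores
avg_x = np.zeros(2)                       # running time average of the row player's play
for t in range(1, T + 1):
    x = np.exp(sx - sx.max()); x /= x.sum()  # exponential-weights choice map
    y = np.exp(sy - sy.max()); y /= y.sum()
    ux = A @ y + sigma * rng.standard_normal(2)   # payoff vectors observed under noise
    uy = -A.T @ x + sigma * rng.standard_normal(2)
    sx += eta * ux
    sy += eta * uy
    avg_x += (x - avg_x) / t
print(avg_x)  # close to [0.5, 0.5], the unique interior equilibrium
```

Even with sigma comparable to the payoffs themselves, the zero-mean noise washes out of the time average, which is the flavor of the averaging principle stated above.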
Evolutionary Game Theory Squared: Evolving Agents in Endogenously Evolving Zero-Sum Games
The predominant paradigm in evolutionary game theory and more generally
online learning in games is based on a clear distinction between a population
of dynamic agents that interact given a fixed, static game. In this paper, we
move away from the artificial divide between dynamic agents and static games,
to introduce and analyze a large class of competitive settings where both the
agents and the games they play evolve strategically over time. We focus on
arguably the most archetypal game-theoretic setting -- zero-sum games (as well
as network generalizations) -- and the most studied evolutionary learning
dynamic -- replicator, the continuous-time analogue of multiplicative weights.
Populations of agents compete against each other in a zero-sum competition that
itself evolves adversarially to the current population mixture. Remarkably,
despite the chaotic coevolution of agents and games, we prove that the system
exhibits a number of regularities. First, the system has conservation laws of
an information-theoretic flavor that couple the behavior of all agents and
games. Second, the system is Poincaré recurrent, with effectively all
possible initializations of agents and games lying on recurrent orbits that
come arbitrarily close to their initial conditions infinitely often. Third,
the time-average agent behavior and utility converge to the Nash equilibrium
values of the time-average game. Finally, we provide a polynomial time
algorithm to efficiently predict this time-average behavior for any such
coevolving network game.
Comment: To appear in AAAI 202
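For intuition about the conservation and recurrence claims, the sketch below demonstrates the classical static-game special case that the paper generalizes: under replicator dynamics in a fixed zero-sum game with an interior equilibrium (x*, y*), the quantity KL(x*, x) + KL(y*, y) is invariant along trajectories, which is what forces orbits to recur rather than converge. The evolving-game coupling from the paper is not modeled here, and forward-Euler integration makes the "constant" drift slightly with the step size.

```python
import numpy as np

A = np.array([[0.0, 1.0, -1.0],
              [-1.0, 0.0, 1.0],
              [1.0, -1.0, 0.0]])          # Rock-Paper-Scissors, a zero-sum game
xstar = ystar = np.ones(3) / 3.0          # unique interior Nash equilibrium

def kl(p, q):
    return float(np.sum(p * np.log(p / q)))

x = np.array([0.6, 0.3, 0.1])
y = np.array([0.2, 0.5, 0.3])
dt = 1e-3
for t in range(200000):
    ux, uy = A @ y, -A.T @ x              # current payoff vectors
    x = x + dt * x * (ux - x @ ux)        # replicator update for each player
    y = y + dt * y * (uy - y @ uy)
    if t % 50000 == 0:
        # approximately constant; the small residual drift is Euler error
        print(kl(xstar, x) + kl(ystar, y))
```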
Is Learning in Games Good for the Learners?
We consider a number of questions related to tradeoffs between reward and
regret in repeated gameplay between two agents. To facilitate this, we
introduce a notion of generalized equilibrium, which allows for
asymmetric regret constraints, and yields polytopes of feasible values for each
agent and pair of regret constraints, where we show that any such equilibrium
is reachable by a pair of algorithms which maintain their regret guarantees
against arbitrary opponents. As a central example, we highlight the case where one
agent is no-swap and the other's regret is unconstrained. We show that this
captures an extension of Stackelberg equilibria with a matching
optimal value, and that there exists a wide class of games where a player can
significantly increase their utility by deviating from a no-swap-regret
algorithm against a no-swap learner (in fact, almost any game without pure Nash
equilibria is of this form). Additionally, we make use of generalized
equilibria to consider tradeoffs in terms of the opponent's algorithm choice.
We give a tight characterization for the maximal reward obtainable against
some no-regret learner, yet we also show a class of games in which
this is bounded away from the value obtainable against the class of common
"mean-based" no-regret algorithms. Finally, we consider the question of
learning reward-optimal strategies via repeated play with a no-regret agent
when the game is initially unknown. Again we show tradeoffs depending on the
opponent's learning algorithm: the Stackelberg strategy is learnable in
exponential time with any no-regret agent (and in polynomial time with any
no-adaptive-regret agent) for any game where it is learnable via
queries, and there are games where it is learnable in polynomial time against
any no-swap-regret agent but requires exponential time against a mean-based
no-regret agent.
Comment: 22 pages
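To make the commitment-versus-learning tradeoff concrete, here is a hedged toy sketch, not the paper's construction: a leader grid-searches its optimal commitment (Stackelberg) strategy in a made-up 2x2 game, then plays it against a follower running multiplicative weights, a standard mean-based no-regret algorithm; the leader's average reward approaches the Stackelberg value. The payoff matrices UL and UF and all parameters are hypothetical.

```python
import numpy as np

UL = np.array([[2.0, 0.0], [3.0, 1.0]])   # leader payoffs (hypothetical game)
UF = np.array([[1.0, 0.0], [0.0, 2.0]])   # follower payoffs

# Stackelberg value by grid search: the follower best-responds to a commitment p.
best_p, best_val = 0.0, -np.inf
for p in np.linspace(0.0, 1.0, 1001):
    x = np.array([p, 1.0 - p])
    br = int(np.argmax(x @ UF))           # follower's best-response column
    val = float(x @ UL[:, br])
    if val > best_val:
        best_p, best_val = p, val

# Commit slightly past the follower's indifference point so its best
# response is strict; otherwise a no-regret learner keeps mixing.
p = min(best_p + 0.02, 1.0)
x = np.array([p, 1.0 - p])

s = np.zeros(2)                           # follower's cumulative payoff scores
eta, T, total = 0.1, 5000, 0.0
for _ in range(T):
    q = np.exp(eta * (s - s.max())); q /= q.sum()  # multiplicative-weights play
    total += float(x @ UL @ q)            # leader's expected payoff this round
    s += x @ UF                           # follower observes its payoff vector
print(total / T, "vs Stackelberg value", best_val)
```

The small nudge past the indifference point matters: at the exact optimum the follower is indifferent, and a no-regret learner may keep mixing instead of settling on the best response the leader wants.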
Generative Adversarial Networks (GANs): Challenges, Solutions, and Future Directions
Generative Adversarial Networks (GANs) are a novel class of deep generative
models that has recently gained significant attention. GANs learn complex,
high-dimensional distributions implicitly over images, audio, and other data.
However, there are major challenges in training GANs, namely mode collapse,
non-convergence, and instability, arising from inappropriate choices of
network architecture, objective function, and optimization algorithm. To
address these challenges, several solutions for the better design and
optimization of GANs have recently been investigated, based on re-engineered
network architectures, new objective functions, and alternative optimization
algorithms. To the best of our knowledge, no existing survey has focused
specifically on the broad and systematic development of these solutions. In
this study, we perform a comprehensive survey of the advancements in GAN
design and optimization proposed to handle GAN challenges. We first identify
key research issues within each design and optimization technique and then
propose a new taxonomy that structures the solutions by key research issue.
Following the taxonomy, we provide a detailed discussion of the GAN variants
proposed within each solution and their relationships. Finally, based on the
insights gained, we present promising research directions in this rapidly
growing field.
Comment: 42 pages, Figure 13, Table
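To ground the instability discussion, the sketch below is a minimal 1-D GAN training loop in PyTorch, fitting a Gaussian; it shows the alternating discriminator/generator updates whose delicate balance gives rise to the non-convergence and mode-collapse failures the survey catalogs. The architecture and hyperparameters are illustrative only.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
G = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1))  # generator
D = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1))  # discriminator (logits)
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(2000):
    real = 2.0 + 0.5 * torch.randn(64, 1)         # target distribution: N(2, 0.25)
    fake = G(torch.randn(64, 1))                  # generated samples

    # Discriminator step: separate real from generated samples.
    loss_d = (bce(D(real), torch.ones(64, 1))
              + bce(D(fake.detach()), torch.zeros(64, 1)))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # Generator step (non-saturating loss): make fakes look real.
    loss_g = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()

print(G(torch.randn(1000, 1)).mean().item())      # should drift toward 2.0
```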