Inertial game dynamics and applications to constrained optimization
Aiming to provide a new class of game dynamics with good long-term
rationality properties, we derive a second-order inertial system that builds on
the widely studied "heavy ball with friction" optimization method. By
exploiting a well-known link between the replicator dynamics and the
Shahshahani geometry on the space of mixed strategies, the dynamics are stated
in a Riemannian geometric framework where trajectories are accelerated by the
players' unilateral payoff gradients and slowed down near Nash equilibria.
Surprisingly (and in stark contrast to another second-order variant of the
replicator dynamics), the inertial replicator dynamics are not well-posed; on
the other hand, it is possible to obtain a well-posed system by endowing the
mixed strategy space with a different Hessian-Riemannian (HR) metric structure,
and we characterize those HR geometries that do so. In the single-agent version
of the dynamics (corresponding to constrained optimization over simplex-like
objects), we show that regular maximum points of smooth functions attract all
nearby solution orbits with low initial speed. More generally, we establish an
inertial variant of the so-called "folk theorem" of evolutionary game theory
and we show that strict equilibria are attracting in asymmetric
(multi-population) games, provided of course that the dynamics are well-posed.
A similar asymptotic stability result is obtained for evolutionarily stable
strategies in symmetric (single-population) games.
Comment: 30 pages, 4 figures; significantly revised paper structure and added new material on Euclidean embeddings and evolutionarily stable strategies
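The "heavy ball with friction" method the abstract builds on can be sketched in its plain Euclidean form (a hypothetical toy version for illustration; the paper's actual dynamics live on a Riemannian manifold of mixed strategies):

```python
import numpy as np

def heavy_ball_ascent(grad, x0, step=0.05, friction=0.9, iters=500):
    """Euclidean 'heavy ball with friction' sketch: the velocity is
    accelerated by the payoff gradient and damped by friction, so
    trajectories slow down as they approach a maximum."""
    x = np.asarray(x0, dtype=float)
    v = np.zeros_like(x)
    for _ in range(iters):
        v = friction * v + step * grad(x)  # inertia plus gradient push
        x = x + v
    return x

# Concave toy objective f(x) = -||x - 1||^2 with gradient -2(x - 1):
# the regular maximum x = (1, 1, 1) attracts orbits with low initial speed.
x_star = heavy_ball_ascent(lambda x: -2.0 * (x - 1.0), np.zeros(3))
```

The friction term is what distinguishes this from plain momentum: it dissipates kinetic energy, which is why orbits started with low initial speed settle at nearby maxima rather than overshooting indefinitely.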
Stochastic mirror descent dynamics and their convergence in monotone variational inequalities
We examine a class of stochastic mirror descent dynamics in the context of
monotone variational inequalities (including Nash equilibrium and saddle-point
problems). The dynamics under study are formulated as a stochastic differential
equation driven by a (single-valued) monotone operator and perturbed by a
Brownian motion. The system's controllable parameters are two variable weight
sequences that respectively pre- and post-multiply the driver of the process.
By carefully tuning these parameters, we obtain global convergence in the
ergodic sense, and we estimate the average rate of convergence of the process.
We also establish a large deviations principle showing that individual
trajectories exhibit exponential concentration around this average.
Comment: 23 pages; updated proofs in Section 3 and Section
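A discrete (Euler-Maruyama) sketch of such dynamics, in the simplest Euclidean case with only a pre-multiplying weight sequence (the operator, step schedule, and averaging scheme below are illustrative assumptions, not the paper's exact setup):

```python
import numpy as np

rng = np.random.default_rng(0)

def stochastic_md(v, x0, iters=20000, sigma=0.1):
    """Euclidean stochastic mirror descent sketch: a vanishing weight
    gamma_n pre-multiplies the monotone driver v, the drift is perturbed
    by Gaussian noise (the Brownian term), and we return the ergodic
    (step-weighted) average of the trajectory."""
    x = np.asarray(x0, dtype=float)
    avg, wsum = np.zeros_like(x), 0.0
    for n in range(1, iters + 1):
        gamma = 1.0 / np.sqrt(n)                    # variable weight sequence
        noise = sigma * rng.standard_normal(x.shape)
        x = x - gamma * (v(x) + noise)              # perturbed monotone drift
        avg, wsum = avg + gamma * x, wsum + gamma
    return avg / wsum

# Strongly monotone toy operator with unique zero x* = (1, -1).
x_bar = stochastic_md(lambda x: x - np.array([1.0, -1.0]), np.zeros(2))
```

Individual iterates keep fluctuating because of the noise; it is the ergodic average that converges, which mirrors the abstract's distinction between ergodic convergence and the concentration of individual trajectories around it.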
On the robustness of learning in games with stochastically perturbed payoff observations
Motivated by the scarcity of accurate payoff feedback in practical
applications of game theory, we examine a class of learning dynamics where
players adjust their choices based on past payoff observations that are subject
to noise and random disturbances. First, in the single-player case
(corresponding to an agent trying to adapt to an arbitrarily changing
environment), we show that the stochastic dynamics under study lead to no
regret almost surely, irrespective of the noise level in the player's
observations. In the multi-player case, we find that dominated strategies
become extinct and we show that strict Nash equilibria are stochastically
stable and attracting; conversely, if a state is stable or attracting with
positive probability, then it is a Nash equilibrium. Finally, we provide an
averaging principle for 2-player games, and we show that in zero-sum games with
an interior equilibrium, time averages converge to Nash equilibrium for any
noise level.
Comment: 36 pages, 4 figures
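The single-player claim can be illustrated with exponential-weights learning driven by noisy payoff observations (a toy instance under assumed parameters; the paper's dynamics and noise model are more general):

```python
import numpy as np

rng = np.random.default_rng(1)

def exp_weights_noisy(payoffs, eta=0.02, sigma=0.5):
    """Exponential-weights sketch with perturbed feedback: cumulative
    scores aggregate noisy payoff observations, and the mixed strategy
    is their softmax (the logit choice map)."""
    T, k = payoffs.shape
    y = np.zeros(k)                       # cumulative noisy scores
    earned = 0.0
    for t in range(T):
        x = np.exp(eta * (y - y.max()))   # shift for numerical stability
        x /= x.sum()
        earned += x @ payoffs[t]
        y += payoffs[t] + sigma * rng.standard_normal(k)  # noisy feedback
    best_fixed = payoffs.sum(axis=0).max()
    return best_fixed - earned            # realized regret

T = 5000
payoffs = rng.uniform(0.0, 1.0, size=(T, 3))
payoffs[:, 0] += 0.5                      # strategy 0 is best on average
regret = exp_weights_noisy(payoffs)
```

Because the noise is zero-mean, it washes out of the cumulative scores, so the average regret stays small even though every individual observation is corrupted.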
A stochastic approximation algorithm for stochastic semidefinite programming
Motivated by applications to multi-antenna wireless networks, we propose a
distributed and asynchronous algorithm for stochastic semidefinite programming.
This algorithm is a stochastic approximation of a continuous-time matrix
exponential scheme regularized by the addition of an entropy-like term to the
problem's objective function. We show that the resulting algorithm converges
almost surely to an ε-approximation of the optimal solution,
requiring only an unbiased estimate of the gradient of the problem's stochastic
objective. When applied to throughput maximization in wireless multiple-input
and multiple-output (MIMO) systems, the proposed algorithm retains its
convergence properties under a wide array of mobility impediments such as user
update asynchronicities, random delays and/or ergodically changing channels.
Our theoretical analysis is complemented by extensive numerical simulations
which illustrate the robustness and scalability of the proposed method in
realistic network conditions.
Comment: 25 pages, 4 figures
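A minimal sketch of the matrix exponential idea on the spectrahedron {X ⪰ 0, tr X = 1} (a toy linear objective and assumed step size; the entropic regularization appears implicitly through the Gibbs/softmax map):

```python
import numpy as np

rng = np.random.default_rng(2)

def gibbs_state(Y):
    """exp(Y)/tr exp(Y) for symmetric Y, shifted by the top eigenvalue
    for numerical stability; the result is PSD with unit trace by
    construction."""
    w, V = np.linalg.eigh(Y)
    E = (V * np.exp(w - w.max())) @ V.T
    return E / np.trace(E)

def matrix_exp_learning(grad_est, dim, step=0.5, iters=400):
    """Matrix exponential learning sketch: dual scores accumulate
    unbiased noisy gradient estimates, and the primal iterate is the
    Gibbs state of the scores (the matrix analogue of softmax)."""
    Y = np.zeros((dim, dim))
    for _ in range(iters):
        X = gibbs_state(Y)
        G = grad_est(X)
        Y += step * 0.5 * (G + G.T)       # keep the scores symmetric
    return gibbs_state(Y)

# Toy linear objective f(X) = <A, X>: its gradient is the constant A,
# observed through zero-mean noise; the maximizer over the unit-trace
# PSD cone puts all mass on A's top eigenvector e_0.
A = np.diag([1.0, 0.2, 0.1])
X_hat = matrix_exp_learning(lambda X: A + 0.1 * rng.standard_normal((3, 3)), 3)
```

The appeal for distributed settings is visible here: the iterate is feasible at every step without projections, so asynchronous or delayed gradient estimates cannot push it off the constraint set.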
Riemannian game dynamics
We study a class of evolutionary game dynamics defined by balancing a gain
determined by the game's payoffs against a cost of motion that captures the
difficulty with which the population moves between states. Costs of motion are
represented by a Riemannian metric, i.e., a state-dependent inner product on
the set of population states. The replicator dynamics and the (Euclidean)
projection dynamics are the archetypal examples of the class we study. Like
these representative dynamics, all Riemannian game dynamics satisfy certain
basic desiderata, including positive correlation and global convergence in
potential games. Moreover, when the underlying Riemannian metric satisfies a
Hessian integrability condition, the resulting dynamics preserve many further
properties of the replicator and projection dynamics. We examine the close
connections between Hessian game dynamics and reinforcement learning in normal
form games, extending and elucidating a well-known link between the replicator
dynamics and exponential reinforcement learning.
Comment: 47 pages, 12 figures; added figures and further simplified the derivation of the dynamics
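The replicator dynamics, the archetypal member of this class, can be simulated in a few lines (toy 2-strategy game and step size chosen for illustration):

```python
import numpy as np

def replicator_step(x, A, dt=0.01):
    """One Euler step of the replicator dynamics
    x_i' = x_i * ((A x)_i - x . A x), i.e., gradient-like motion whose
    cost of movement is measured by the Shahshahani metric."""
    u = A @ x
    return x + dt * x * (u - x @ u)

# A symmetric payoff matrix makes this a potential game (potential
# 0.5 * x . A x), so the dynamics converge; from this initial state the
# population settles on the payoff-dominant strategy 0.
A = np.array([[2.0, 0.0],
              [0.0, 1.0]])
x = np.array([0.6, 0.4])
for _ in range(5000):
    x = replicator_step(x, A)
```

Note that the update preserves the simplex exactly: the drift sums to x . A x - (x . A x) * sum(x) = 0, which is the discrete shadow of the positive-correlation property.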
Transmit without regrets: Online optimization in MIMO-OFDM cognitive radio systems
In this paper, we examine cognitive radio systems that evolve dynamically
over time due to changing user and environmental conditions. To combine the
advantages of orthogonal frequency division multiplexing (OFDM) and
multiple-input, multiple-output (MIMO) technologies, we consider a MIMO-OFDM
cognitive radio network where wireless users with multiple antennas communicate
over several non-interfering frequency bands. As the network's primary users
(PUs) come and go in the system, the communication environment changes
constantly (and, in many cases, randomly). Accordingly, the network's
unlicensed, secondary users (SUs) must adapt their transmit profiles "on the
fly" in order to maximize their data rate in a rapidly evolving environment
over which they have no control. In this dynamic setting, static solution
concepts (such as Nash equilibrium) are no longer relevant, so we focus on
dynamic transmit policies that lead to no regret: specifically, we consider
policies that perform at least as well as (and typically outperform) even the
best fixed transmit profile in hindsight. Drawing on the method of matrix
exponential learning and online mirror descent techniques, we derive a
no-regret transmit policy for the system's SUs which relies only on local
channel state information (CSI). Using this method, the system's SUs are able
to track their individually evolving optimum transmit profiles remarkably well,
even under rapidly (and randomly) changing conditions. Importantly, the
proposed augmented exponential learning (AXL) policy leads to no regret even if
the SUs' channel measurements are subject to arbitrarily large observation
errors (the imperfect CSI case), thus ensuring the method's robustness in the
presence of uncertainties.
Comment: 25 pages, 3 figures, to appear in the IEEE Journal on Selected Areas in Communications
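An online variant of matrix exponential learning can be sketched for a single user tracking a fluctuating channel (a real-valued toy with a unit power budget and an assumed channel model, not the paper's MIMO-OFDM system or its AXL policy):

```python
import numpy as np

rng = np.random.default_rng(3)

def gibbs(Y):
    """Unit-trace PSD covariance exp(Y)/tr exp(Y), shifted by the top
    eigenvalue for numerical stability."""
    w, V = np.linalg.eigh(Y)
    E = (V * np.exp(w - w.max())) @ V.T
    return E / np.trace(E)

def rate(Q, H):
    """Achievable rate log det(I + H Q H^T) for transmit covariance Q."""
    return np.linalg.slogdet(np.eye(H.shape[0]) + H @ Q @ H.T)[1]

# Each round the channel H_t fluctuates around a fixed dominant mode;
# dual scores Y accumulate the rate gradient observed from current CSI
# only, and the covariance Q_t = exp(Y)/tr exp(Y) adapts "on the fly".
d, T, step = 2, 1000, 0.5
H_base = np.diag([2.0, 0.5])
Y = np.zeros((d, d))
learned = uniform = 0.0
Q_unif = np.eye(d) / d                    # static uniform-power baseline
for t in range(T):
    H = H_base + 0.1 * rng.standard_normal((d, d))
    Q = gibbs(Y)
    learned += rate(Q, H)
    uniform += rate(Q_unif, H)
    G = H.T @ np.linalg.inv(np.eye(d) + H @ Q @ H.T) @ H  # grad of log det
    Y += step * 0.5 * (G + G.T)
```

Even in this crude form the adaptive covariance accumulates a higher total rate than the static uniform allocation, the hindsight-comparison flavor of the no-regret guarantee.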