Asymptotic Bias of Stochastic Gradient Search
The asymptotic behavior of the stochastic gradient algorithm with a biased
gradient estimator is analyzed. Relying on arguments from dynamical systems
theory (chain recurrence) and differential geometry (the Yomdin theorem and
the Lojasiewicz inequality), tight bounds on the asymptotic bias of the
iterates generated by such an algorithm are derived. The obtained results hold
under mild conditions and cover a broad class of high-dimensional nonlinear
algorithms. Using these results, the asymptotic properties of the
policy-gradient (reinforcement) learning and adaptive population Monte Carlo
sampling are studied. Relying on the same results, the asymptotic behavior of
the recursive maximum split-likelihood estimation in hidden Markov models is
analyzed, too.
Comment: arXiv admin note: text overlap with arXiv:0907.102
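
As a rough illustration of what "asymptotic bias" means here, the sketch below runs stochastic gradient search on the toy objective f(x) = 0.5 * x^2 with a gradient estimator that carries a constant offset. The objective, the step sizes 1/n, the noise model, and the name `biased_sgd` are all assumptions made for this example; the paper itself treats a far more general nonlinear, high-dimensional setting.

```python
import random


def biased_sgd(bias, steps=20000, seed=0):
    """Stochastic gradient search on f(x) = 0.5 * x**2 with a biased estimator.

    Hypothetical toy setup: the true gradient is x, but the estimator
    returns x + bias + noise, so the iterates settle near -bias
    instead of the true minimizer 0.
    """
    rng = random.Random(seed)
    x = 5.0
    for n in range(1, steps + 1):
        noise = rng.gauss(0.0, 1.0)
        grad_estimate = x + bias + noise  # biased, noisy gradient
        x -= grad_estimate / n            # decreasing step size 1/n
    return x


unbiased_limit = biased_sgd(bias=0.0)
biased_limit = biased_sgd(bias=0.5)
print(unbiased_limit, biased_limit)
```

In this linear toy case the limit shifts by exactly the estimator bias; the abstract's bounds address how large the shift can be when no such closed form is available.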
Distributed Learning Policies for Power Allocation in Multiple Access Channels
We analyze the problem of distributed power allocation for orthogonal
multiple access channels by considering a continuous non-cooperative game whose
strategy space represents the users' distribution of transmission power over
the network's channels. When the channels are static, we find that this game
admits an exact potential function and this allows us to show that it has a
unique equilibrium almost surely. Furthermore, using the game's potential
property, we derive a modified version of the replicator dynamics of
evolutionary game theory which applies to this continuous game, and we show
that if the network's users employ a distributed learning scheme based on these
dynamics, then they converge to equilibrium exponentially quickly. On the other
hand, a major challenge occurs if the channels do not remain static but
fluctuate stochastically over time, following a stationary ergodic process. In
that case, the associated ergodic game still admits a unique equilibrium, but
the learning analysis becomes much more complicated because the replicator
dynamics are no longer deterministic. Nonetheless, by employing results from
the theory of stochastic approximation, we show that users still converge to
the game's unique equilibrium.
Our analysis hinges on a game-theoretical result which is of independent
interest: in finite player games which admit a (possibly nonlinear) convex
potential function, the replicator dynamics (suitably modified to account for
nonlinear payoffs) converge to an eps-neighborhood of an equilibrium in time of
order O(log(1/eps)).
Comment: 11 pages, 8 figures. Revised manuscript structure and added more
material and figures for the case of stochastically fluctuating channels.
This version will appear in the IEEE Journal on Selected Areas in
Communications, Special Issue on Game Theory in Wireless Communication.
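
The replicator-based learning scheme for static channels can be sketched as follows. This is a minimal illustration, not the authors' algorithm: it assumes a standard Gaussian rate model with unit noise, for which the sum of log(1 + total received power) over channels is an exact potential, and it uses a plain Euler discretization. The names `potential`, `replicator_step`, and `run`, and the gain values, are hypothetical.

```python
import math


def potential(p, g):
    """Assumed potential: sum over channels of log(1 + total received power)."""
    n_users, n_channels = len(p), len(p[0])
    return sum(
        math.log(1.0 + sum(g[i][k] * p[i][k] for i in range(n_users)))
        for k in range(n_channels)
    )


def replicator_step(p, g, dt):
    """One Euler step of replicator dynamics on each user's power split."""
    n_users, n_channels = len(p), len(p[0])
    # Noise (normalized to 1) plus total received power on each channel.
    load = [
        1.0 + sum(g[i][k] * p[i][k] for i in range(n_users))
        for k in range(n_channels)
    ]
    new_p = []
    for i in range(n_users):
        # Marginal payoff of user i on channel k (gradient of the potential).
        v = [g[i][k] / load[k] for k in range(n_channels)]
        total = sum(p[i])
        avg = sum(p[i][k] * v[k] for k in range(n_channels)) / total
        # Channels with above-average marginal payoff gain power share.
        new_p.append(
            [p[i][k] * (1.0 + dt * (v[k] - avg)) for k in range(n_channels)]
        )
    return new_p


def run(p, g, dt=0.1, steps=2000):
    for _ in range(steps):
        p = replicator_step(p, g, dt)
    return p


# Hypothetical gains for 2 users and 3 channels; each user has unit power.
gains = [[1.0, 0.5, 2.0], [0.8, 1.5, 0.3]]
start = [[1 / 3, 1 / 3, 1 / 3], [1 / 3, 1 / 3, 1 / 3]]
final = run(start, gains)
print(potential(start, gains), potential(final, gains))
```

Each update needs only the user's own gains and the aggregate load per channel, which is what makes a scheme of this kind distributed. The update also keeps each user's total power fixed, so the iterates stay on the strategy space.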
Convergence and Convergence Rate of Stochastic Gradient Search in the Case of Multiple and Non-Isolated Extrema
The asymptotic behavior of stochastic gradient algorithms is studied. Relying
on results from differential geometry (Lojasiewicz gradient inequality), the
single limit-point convergence of the algorithm iterates is demonstrated and
relatively tight bounds on the convergence rate are derived. In sharp contrast
to the existing asymptotic results, the new results presented here allow the
objective function to have multiple and non-isolated minima. The new results
also offer new insights into the asymptotic properties of several classes of
recursive algorithms which are routinely used in engineering, statistics,
machine learning and operations research.
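
To make the "multiple and non-isolated minima" setting concrete, the sketch below applies stochastic gradient search to f(x, y) = 0.25 * (x^2 + y^2 - 1)^2, whose minimizers form the whole unit circle. The objective, step-size schedule, noise level, and the name `sgd_on_ring` are assumptions chosen for this example and are not taken from the paper.

```python
import math
import random


def sgd_on_ring(steps=20000, seed=0, noise_std=0.5):
    """Stochastic gradient search on f(x, y) = 0.25 * (x**2 + y**2 - 1)**2.

    Every point of the unit circle is a minimizer, so the minima are
    non-isolated. f is a polynomial, hence analytic, which is the kind
    of function for which a Lojasiewicz gradient inequality holds.
    """
    rng = random.Random(seed)
    x, y = 2.0, 0.5
    for n in range(1, steps + 1):
        step = 0.1 / n ** 0.75          # assumed decreasing step size
        scale = x * x + y * y - 1.0     # gradient of f is scale * (x, y)
        gx = scale * x + rng.gauss(0.0, noise_std)
        gy = scale * y + rng.gauss(0.0, noise_std)
        x -= step * gx
        y -= step * gy
    return x, y


x_final, y_final = sgd_on_ring()
print(x_final, y_final, math.hypot(x_final, y_final))
```

The final iterate lies close to the circle, and which point on it is reached depends on the noise realization. Single limit-point convergence is the statement that the iterates settle at one such point instead of drifting along the set of minima indefinitely.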