Efficient Optimization of Loops and Limits with Randomized Telescoping Sums
We consider optimization problems in which the objective requires an inner
loop with many steps or is the limit of a sequence of increasingly costly
approximations. Meta-learning, training recurrent neural networks, and
optimization of the solutions to differential equations are all examples of
optimization problems with this character. In such problems, it can be
expensive to compute the objective function value and its gradient, but
truncating the loop or using less accurate approximations can induce biases
that damage the overall solution. We propose randomized telescope (RT) gradient
estimators, which represent the objective as the sum of a telescoping series
and sample linear combinations of terms to provide cheap unbiased gradient
estimates. We identify conditions under which RT estimators achieve
optimization convergence rates independent of the length of the loop or the
required accuracy of the approximation. We also derive a method for tuning RT
estimators online to maximize a lower bound on the expected decrease in loss
per unit of computation. We evaluate our adaptive RT estimators on a range of
applications including meta-optimization of learning rates, variational
inference of ODE parameters, and training an LSTM to model long sequences.
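The core idea is that an objective expressible as a limit L = lim_n L_n can be rewritten as a telescoping series with terms Delta_n = L_n - L_{n-1}, and an unbiased estimate of the full sum can be formed from a randomly truncated, reweighted partial sum. Below is a minimal single-sample sketch of one such estimator (the Russian-roulette weighting), assuming a truncated geometric sampling distribution; the names grad_delta, n_max, and p are illustrative choices, not the paper's API, and the paper's adaptive online tuning of the sampling distribution is not reproduced here.

```python
import numpy as np

def rt_gradient_estimate(grad_delta, n_max, p=0.6, rng=None):
    """One randomized-telescope sample of sum_{n=1}^{n_max} Delta_n,
    where Delta_n = grad_delta(n) is the telescoping difference between
    successive (increasingly accurate, increasingly costly) gradients.

    Uses a geometric sampling distribution truncated at n_max, with tail
    probabilities Q(n) = P(N >= n) = p**(n - 1).
    """
    rng = np.random.default_rng() if rng is None else rng
    q_tail = p ** np.arange(n_max)               # Q(1), ..., Q(n_max); Q(1) = 1
    pmf = q_tail - np.append(q_tail[1:], 0.0)    # P(N = n); mass at n_max absorbs the tail
    N = rng.choice(np.arange(1, n_max + 1), p=pmf)
    # Inverse-tail-probability weights restore unbiasedness:
    # E[sum_{n<=N} Delta_n / Q(n)] = sum_n Delta_n * P(N >= n) / Q(n) = sum_n Delta_n.
    return sum(grad_delta(n) / q_tail[n - 1] for n in range(1, N + 1))
```

In the settings the abstract lists, grad_delta(n) would be, for example, the difference between gradients computed with n and n-1 inner-loop steps, so most samples pay only for a short loop while the estimator remains unbiased for the full-length objective.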
Distributed Learning for Stochastic Generalized Nash Equilibrium Problems
This work examines a stochastic formulation of the generalized Nash
equilibrium problem (GNEP) where agents are subject to randomness in the
environment of unknown statistical distribution. We focus on fully-distributed
online learning by agents and employ penalized individual cost functions to
deal with coupled constraints. Three stochastic gradient strategies are
developed with constant step-sizes. We allow the agents to use heterogeneous
step-sizes and show that the penalty solution is able to approach the Nash
equilibrium in a stable manner within $O(\mu_{\max})$, for a small step-size
value $\mu_{\max}$ and sufficiently large penalty parameters. The operation
of the algorithm is illustrated by considering the network Cournot competition
problem.
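The abstract names three stochastic-gradient strategies but not their update equations, so the following is only a minimal penalized constant-step-size stochastic-gradient sketch on a toy network Cournot market; the demand model, quadratic penalty, and all parameter names and values (cap, rho, mu) are hypothetical illustrations, not the paper's algorithms.

```python
import numpy as np

def cournot_penalized_sgd(n_agents=5, cap=10.0, rho=50.0, mu=1e-3,
                          n_iter=20000, seed=0):
    """Illustrative penalized stochastic-gradient play for a Cournot game.

    Agent k chooses a quantity x[k] >= 0; the market price is
    p = a - b * sum(x) with a random demand intercept a whose distribution
    is unknown to the agents. Agent k's cost is J_k(x) = c_k * x_k - p * x_k,
    and the coupled capacity constraint sum(x) <= cap is handled by adding
    the quadratic penalty rho * max(0, sum(x) - cap)**2 to each agent's cost.
    """
    rng = np.random.default_rng(seed)
    b, c = 1.0, np.linspace(1.0, 2.0, n_agents)   # demand slope, unit costs
    x = np.zeros(n_agents)
    for _ in range(n_iter):
        a = 20.0 + rng.normal(0.0, 2.0)           # stochastic demand realization
        total = x.sum()
        price = a - b * total
        # Gradient of agent k's penalized cost w.r.t. its own action x_k:
        #   dJ_k/dx_k = c_k - price + b * x_k   (Cournot marginal-profit term)
        # plus the penalty gradient 2 * rho * max(0, total - cap).
        grad = c - price + b * x + 2.0 * rho * max(0.0, total - cap)
        x = np.maximum(0.0, x - mu * grad)        # projected constant-step update
    return x
```

Each agent here uses only its own gradient and the common price signal, mirroring the fully-distributed setting; the heterogeneous step-sizes of the paper could be modeled by making mu a per-agent vector.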