A stochastic approximation scheme with accelerated convergence properties
Accelerated stochastic approximation with state-dependent noise
We consider a class of stochastic smooth convex optimization problems under
rather general assumptions on the noise in the stochastic gradient observation.
As opposed to the classical problem setting in which the variance of noise is
assumed to be uniformly bounded, herein we assume that the variance of
stochastic gradients is related to the "sub-optimality" of the approximate
solutions delivered by the algorithm. Such problems naturally arise in a
variety of applications, in particular, in the well-known generalized linear
regression problem in statistics. However, to the best of our knowledge, none
of the existing stochastic approximation algorithms for solving this class of
problems attain optimality in terms of the dependence on accuracy, problem
parameters, and mini-batch size.
We discuss two non-Euclidean accelerated stochastic approximation
routines--stochastic accelerated gradient descent (SAGD) and stochastic
gradient extrapolation (SGE)--which carry a particular duality relationship. We
show that both SAGD and SGE, under appropriate conditions, achieve the optimal
convergence rate, attaining the optimal iteration and sample complexities
simultaneously. However, the corresponding assumptions for the SGE algorithm are
more general; they allow, for instance, for efficient application of the SGE to
statistical estimation problems under heavy-tailed noise and discontinuous score
functions. We also discuss the application of the SGE to problems satisfying
quadratic growth conditions and show how it can be used to recover sparse
solutions. Finally, we report on simulation experiments illustrating the
numerical performance of our proposed algorithms in high-dimensional settings.
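The accelerated stochastic approximation scheme described above can be illustrated with a minimal sketch. This is not the paper's exact SAGD or SGE routine; it is a generic Nesterov-accelerated mini-batch stochastic gradient method, with the function name `sagd`, the toy objective, and the step-size choice all being illustrative assumptions. Mini-batching here plays the role of reducing the state-dependent gradient noise the abstract discusses.

```python
import random

def sagd(grad, x0, L, steps=200, batch=32, seed=0):
    """Illustrative sketch (not the paper's algorithm): Nesterov-style
    accelerated SGD. `grad(y, rng)` returns one stochastic gradient at y;
    averaging a mini-batch reduces its variance before the step."""
    rng = random.Random(seed)
    x = y = x0
    for t in range(1, steps + 1):
        # mini-batch average of noisy gradients at the extrapolated point y
        g = sum(grad(y, rng) for _ in range(batch)) / batch
        x_new = y - g / L                              # gradient step, step size 1/L
        y = x_new + (t - 1) / (t + 2) * (x_new - x)    # Nesterov extrapolation
        x = x_new
    return x

# Toy problem: minimize f(x) = 0.5 * (x - 3)^2 with additive gradient noise.
noisy_grad = lambda x, rng: (x - 3.0) + rng.gauss(0.0, 1.0)
x_star = sagd(noisy_grad, x0=0.0, L=1.0)
print(round(x_star, 2))
```

On this one-dimensional quadratic the iterates settle near the minimizer x = 3, up to the residual noise of the final mini-batch.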
Lazier Than Lazy Greedy
Is it possible to maximize a monotone submodular function faster than the
widely used lazy greedy algorithm (also known as accelerated greedy), both in
theory and practice? In this paper, we develop the first linear-time algorithm
for maximizing a general monotone submodular function subject to a cardinality
constraint. We show that our randomized algorithm, STOCHASTIC-GREEDY, can
achieve a (1 - 1/e - eps) approximation guarantee, in expectation, to the
optimum solution in time linear in the size of the data and independent of the
cardinality constraint. We empirically demonstrate the effectiveness of our
algorithm on submodular functions arising in data summarization, including
training large-scale kernel methods, exemplar-based clustering, and sensor
placement. We observe that STOCHASTIC-GREEDY practically achieves the same
utility value as lazy greedy but runs much faster. More surprisingly, we
observe that in many practical scenarios STOCHASTIC-GREEDY does not even
evaluate every data point once and still achieves results indistinguishable
from those of lazy greedy.
Comment: In Proc. Conference on Artificial Intelligence (AAAI), 201