The Limitation and Practical Acceleration of Stochastic Gradient Algorithms in Inverse Problems
In this work we investigate the practicability of stochastic gradient descent and recently introduced variants with variance-reduction techniques in imaging inverse problems, such as space-varying image deblurring. Such algorithms have been shown in the machine learning literature to have optimal complexities in theory, and to provide significant empirical improvements over full-gradient methods. Surprisingly, in some tasks such as image deblurring, many such methods fail to converge faster than the accelerated full-gradient method (FISTA), even in terms of epoch counts. We investigate this phenomenon and propose a theory-inspired mechanism to characterize whether a given inverse problem is better solved by a stochastic optimization technique with a known sampling pattern. Furthermore, to overcome another key bottleneck of stochastic optimization, the heavy computation of proximal operators, while maintaining fast convergence, we propose an accelerated primal-dual SGD algorithm and demonstrate the effectiveness of our approach in image deblurring experiments.
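For context, the following is a minimal sketch of the FISTA baseline referenced above, i.e., the standard accelerated proximal gradient iteration; the function names and interface here are illustrative, not taken from the paper.

    import numpy as np

    def fista(grad_f, prox_g, x0, step, n_iter=100):
        # Minimize f(x) + g(x): grad_f is the gradient of the smooth
        # data-fit term, prox_g the proximal operator of the regularizer,
        # and step = 1/L for L a Lipschitz constant of grad_f.
        x_prev, y, t = x0.copy(), x0.copy(), 1.0
        for _ in range(n_iter):
            x = prox_g(y - step * grad_f(y), step)             # proximal gradient step
            t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0  # momentum schedule
            y = x + ((t - 1.0) / t_next) * (x - x_prev)        # Nesterov extrapolation
            x_prev, t = x, t_next
        return x_prev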
From Proximal Point Method to Nesterov's Acceleration
The proximal point method (PPM) is a fundamental method in optimization that
is often used as a building block for fast optimization algorithms. In this
work, building on a recent work by Defazio (2019), we provide a complete
understanding of Nesterov's accelerated gradient method (AGM) by establishing
quantitative and analytical connections between PPM and AGM. The main
observation in this paper is that AGM is in fact equal to a simple
approximation of PPM, which results in an elementary derivation of the
mysterious updates of AGM as well as its step sizes. This connection also leads
to a conceptually simple analysis of AGM based on the standard analysis of PPM.
This view naturally extends to the strongly convex case and also motivates
other accelerated methods for practically relevant settings.
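For reference, the two updates being connected can be written in standard notation as follows; this is a sketch only, and the precise approximation and step-size correspondence established in the paper are not reproduced here.

    % Proximal point method (PPM): implicit update with step size \lambda_k
    x_{k+1} = \operatorname*{arg\,min}_{x} \Big\{ f(x) + \frac{1}{2\lambda_k} \|x - x_k\|^2 \Big\}

    % Nesterov's AGM: gradient step taken at an extrapolated point y_k
    y_k = x_k + \frac{t_{k-1} - 1}{t_k} \, (x_k - x_{k-1}), \qquad
    x_{k+1} = y_k - \frac{1}{L} \nabla f(y_k)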
The Practicality of Stochastic Optimization in Imaging Inverse Problems
In this work we investigate the practicality of stochastic gradient descent and recently introduced variants with variance-reduction techniques in imaging inverse problems. Such algorithms have been shown in the machine learning literature to have optimal complexities in theory, and to provide significant empirical improvements over deterministic gradient methods. Surprisingly, in some tasks such as image deblurring, many such methods fail to converge faster than accelerated deterministic gradient methods, even in terms of epoch counts. We investigate this phenomenon and propose a theory-inspired mechanism that lets practitioners efficiently characterize whether an inverse problem benefits from being solved by stochastic optimization techniques. Using standard tools in numerical linear algebra, we derive conditions on the spectral structure of the inverse problem under which it is a suitable application for stochastic gradient methods. In particular, we show that stochastic gradient methods are more advantageous than deterministic methods for solving an imaging inverse problem if and only if its Hessian matrix has a fast-decaying eigenspectrum. Our results also provide guidance on choosing appropriate partition minibatch schemes, showing that a good minibatch scheme typically has relatively low correlation within each of the minibatches. Finally, we propose an accelerated primal-dual SGD algorithm to tackle another key bottleneck of stochastic optimization: the heavy computation of proximal operators. The proposed method converges fast in practice, and is able to efficiently handle non-smooth regularization terms that are coupled with linear operators.
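As an illustration of the spectral criterion above, one might inspect the eigenvalue decay of the Hessian numerically. The sketch below assumes a linear inverse problem min_x ||Ax - b||^2 + g(x), so the Hessian of the data-fit term is A^T A; the function name and setup are illustrative, not taken from the paper.

    import numpy as np

    def hessian_eigen_decay(A, k=50):
        # For the data-fit term 0.5 * ||Ax - b||^2 the Hessian is A^T A,
        # whose eigenvalues are the squared singular values of A.
        s = np.linalg.svd(A, compute_uv=False)
        eigs = (s ** 2)[:k]
        return eigs / eigs[0]  # leading eigenvalues, normalized

    # A fast-decaying profile suggests the problem is a good candidate for
    # stochastic gradient methods; a flat profile suggests sticking with
    # accelerated deterministic methods such as FISTA.
    rng = np.random.default_rng(0)
    A = rng.standard_normal((256, 256))
    print(hessian_eigen_decay(A, k=10))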