Stochastic Composition Optimization of Functions without Lipschitz Continuous Gradient

Abstract

In this paper, we study the stochastic optimization of a two-level composition of functions without Lipschitz continuous gradients. The smoothness property is generalized by the notion of relative smoothness, which motivates the Bregman gradient method. We propose three Stochastic Compositional Bregman Gradient algorithms for the three possible nonsmooth compositional scenarios and provide their sample complexities to achieve an ϵ-approximate stationary point. For the composition of a smooth function with a relatively smooth function, the first algorithm requires O(ϵ^{-2}) calls to the stochastic oracles of the inner function value and gradient as well as the outer function gradient. When both functions are relatively smooth, the second algorithm requires O(ϵ^{-3}) calls to the inner function stochastic oracle and O(ϵ^{-2}) calls to the inner and outer function stochastic gradient oracles. We further improve the second algorithm by variance reduction for the setting where just the inner function is smooth. The resulting algorithm requires O(ϵ^{-5/2}) calls to the stochastic inner function value, O(ϵ^{-3/2}) calls to the stochastic inner gradient, and O(ϵ^{-2}) calls to the stochastic outer gradient. Finally, we numerically evaluate the performance of these algorithms on two examples.
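To make the algorithm family concrete, below is a minimal sketch of one stochastic compositional Bregman gradient iteration. All specifics are assumptions for illustration, not the paper's method: the toy objective F(x) = f(g(x)) with linear inner map g(x) = Ax and quadratic outer f, the running average y_k that tracks the inner function value from noisy samples, and the Bregman step taken with the entropy kernel h(x) = Σ x_i log x_i over the probability simplex (which yields the classic multiplicative mirror-descent update).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy problem (an assumption, not from the paper): minimize
# F(x) = 0.5 * ||A x - b||^2 = f(g(x)) over the probability simplex,
# with inner g(x) = A x and outer f(y) = 0.5 * ||y - b||^2.
A = rng.standard_normal((3, 4))
b = np.array([0.5, -0.2, 0.1])

def g_sample(x):
    # Stochastic oracle for the inner function value: g(x) plus small noise.
    return A @ x + 0.01 * rng.standard_normal(3)

def g_jac_sample(x):
    # Stochastic oracle for the inner Jacobian (noiseless here for simplicity).
    return A

def f_grad(y):
    # Gradient of the outer function f(y) = 0.5 * ||y - b||^2.
    return y - b

def bregman_step_entropy(x, grad, gamma):
    # Bregman (mirror) step with kernel h(x) = sum_i x_i log x_i:
    # the update is multiplicative, followed by renormalization onto the simplex.
    z = x * np.exp(-gamma * grad)
    return z / z.sum()

def scbg(x0, gamma=0.1, beta=0.5, iters=500):
    """Sketch of a stochastic compositional Bregman gradient loop."""
    x, y = x0.copy(), g_sample(x0)
    for _ in range(iters):
        # Running estimate of the inner function value g(x).
        y = (1 - beta) * y + beta * g_sample(x)
        # Compositional gradient estimate: Jacobian(g)^T * grad f(y).
        grad = g_jac_sample(x).T @ f_grad(y)
        x = bregman_step_entropy(x, grad, gamma)
    return x

x0 = np.full(4, 0.25)          # uniform starting point on the simplex
x_star = scbg(x0)              # stays on the simplex; F typically decreases
```

The running average y_k is the standard device for the bias introduced by nesting: plugging a single noisy sample of g(x) into the nonlinear outer gradient would give a biased estimate of the true compositional gradient.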
