
    Inertial Stochastic PALM (iSPALM) and Applications in Machine Learning

    Inertial algorithms for minimizing nonsmooth and nonconvex functions, such as the inertial proximal alternating linearized minimization algorithm (iPALM), have demonstrated their superiority in computation time over their non-inertial variants. In many imaging and machine learning problems, the objective function has a special form involving huge amounts of data, which encourages the use of stochastic algorithms. While algorithms based on stochastic gradient descent are still used in the majority of applications, stochastic algorithms for minimizing nonsmooth and nonconvex functions have also been proposed recently. In this paper, we derive an inertial variant of a stochastic PALM algorithm with a variance-reduced gradient estimator, called iSPALM, and prove linear convergence of the algorithm under certain assumptions. Our inertial approach can be seen as a generalization, to nonsmooth problems, of the momentum methods widely used to speed up and stabilize optimization algorithms, in particular in machine learning. Numerical experiments on learning the weights of a so-called proximal neural network and the parameters of Student-t mixture models show that our new algorithm outperforms both stochastic PALM and its deterministic counterparts.
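
    To make the scheme concrete, here is a minimal NumPy sketch of an iSPALM-style iteration on a toy nonnegative matrix factorization problem: an inertial extrapolation plus a SAGA variance-reduced estimator on the X-block and, for brevity, a deterministic proximal gradient step on the Y-block. The problem, the inertial weight alpha, and the step sizes are illustrative assumptions, not the paper's setting.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy problem: min 0.5*||M - X @ Y||_F^2  s.t. X >= 0, Y >= 0,
# viewed as a finite sum over the n columns of M (so SAGA applies to the X-block).
m, n, r = 30, 200, 5
M = np.abs(rng.normal(size=(m, n)))

def grad_x(X, Y, j):          # per-column partial gradient in X, scaled so that
    return n * np.outer(X @ Y[:, j] - M[:, j], Y[:, j])  # its mean is the full gradient

X = np.abs(rng.normal(size=(m, r))); X_old = X.copy()
Y = np.abs(rng.normal(size=(r, n)))

# SAGA memory: one stored gradient per column, plus their running average.
table = np.stack([grad_x(X, Y, j) for j in range(n)])
avg = table.mean(axis=0)

alpha = 0.4                                  # inertial weight: placeholder value
for k in range(3000):
    j = rng.integers(n)
    Z = X + alpha * (X - X_old)              # inertial extrapolation point
    g_new = grad_x(Z, Y, j)
    v = g_new - table[j] + avg               # SAGA variance-reduced estimator
    avg += (g_new - table[j]) / n
    table[j] = g_new
    tau = 1.0 / max(n * np.linalg.norm(Y @ Y.T, 2), 1e-8)  # conservative step size
    X_old, X = X, np.maximum(Z - tau * v, 0.0)             # prox = projection onto X >= 0
    # deterministic proximal gradient step on the Y-block, for brevity
    sigma = 1.0 / max(np.linalg.norm(X.T @ X, 2), 1e-8)
    Y = np.maximum(Y - sigma * (X.T @ (X @ Y - M)), 0.0)

print("residual:", np.linalg.norm(M - X @ Y))
```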

    Two-step inertial Bregman proximal alternating linearized minimization algorithm for nonconvex and nonsmooth problems

    In this paper, we study an algorithm for solving a class of nonconvex, nonsmooth, nonseparable optimization problems. Based on proximal alternating linearized minimization (PALM), we propose a new iterative algorithm that combines two-step inertial extrapolation with a Bregman distance. By constructing an appropriate benefit function and invoking the Kurdyka-Łojasiewicz property, we establish convergence of the whole sequence generated by the proposed algorithm. We apply the algorithm to signal recovery and a quadratic fractional programming problem, and show its effectiveness.
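
    The two ingredients can be illustrated with a NumPy sketch: a two-step inertial extrapolation combined with an entropy-kernel Bregman proximal step, on a toy nonnegative sparse recovery problem. A single block is shown for brevity (the paper's algorithm alternates over blocks), and the kernel, the inertial weights a1 and a2, and the step size tau are assumptions, not the paper's choices.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy nonnegative sparse recovery: min_{x >= 0} 0.5*||A x - b||^2 + lam * sum(x)
n, d = 80, 120
A = rng.normal(size=(n, d)) / np.sqrt(n)
x_true = np.zeros(d)
x_true[rng.choice(d, 8, replace=False)] = rng.uniform(1.0, 2.0, size=8)
b = A @ x_true
lam = 0.05

grad = lambda x: A.T @ (A @ x - b)
tau = 0.5 / np.linalg.norm(A, 2) ** 2      # illustrative step size

# Bregman kernel: entropy phi(x) = sum(x*log(x) - x). For f(x) = lam*sum(x) on
# x >= 0, the Bregman proximal step at the extrapolated point z is closed-form:
#   x+ = z * exp(-tau * (grad F(z) + lam))
x = np.ones(d); x1 = x.copy(); x2 = x.copy()
a1, a2 = 0.4, 0.1                          # two-step inertial weights: placeholders
for k in range(2000):
    z = x + a1 * (x - x1) + a2 * (x1 - x2) # two-step inertial extrapolation
    z = np.maximum(z, 1e-12)               # stay inside the entropy kernel's domain
    x2, x1 = x1, x
    x = z * np.exp(-tau * (grad(z) + lam)) # Bregman proximal (mirror) step

print("recovery error:", np.linalg.norm(x - x_true))
```

    With the Euclidean kernel in place of the entropy, the same step reduces to the usual projected/soft-thresholded proximal gradient update.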

    A stochastic two-step inertial Bregman proximal alternating linearized minimization algorithm for nonconvex and nonsmooth problems

    In this paper, we propose a stochastic two-step inertial Bregman proximal alternating linearized minimization (STiBPALM) algorithm with variance-reduced stochastic gradient estimators for solving a broad class of large-scale nonconvex and nonsmooth optimization problems, and we show that SAGA and SARAH are admissible variance-reduced gradient estimators in this framework. Under the Kurdyka-Łojasiewicz property, together with expectation conditions and suitable assumptions on the parameters, we prove that the sequence generated by the proposed algorithm converges to a critical point; a general convergence rate is also provided. Numerical experiments on sparse nonnegative matrix factorization and blind image deblurring demonstrate the performance of the proposed algorithm.
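
    Since the abstract names SAGA and SARAH, here is a minimal NumPy sketch of the SARAH recursive estimator inside a two-step inertial proximal step, on a toy finite-sum lasso-type problem with the Euclidean kernel (the simplest Bregman choice) and a single block for brevity; the inertial weights and step size are placeholders, not the paper's parameters.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy finite-sum problem: min (1/n) sum_i 0.5*(a_i^T x - b_i)^2 + lam*||x||_1
n, d = 400, 60
A = rng.normal(size=(n, d))
x_true = np.zeros(d); x_true[rng.choice(d, 6, replace=False)] = 1.0
b = A @ x_true + 0.01 * rng.normal(size=n)
lam = 0.1

soft = lambda v, t: np.sign(v) * np.maximum(np.abs(v) - t, 0.0)  # prox of t*||.||_1
grad_i = lambda x, i: (A[i] @ x - b[i]) * A[i]                   # per-sample gradient
full_grad = lambda x: A.T @ (A @ x - b) / n

x = np.zeros(d); x1 = x.copy(); x2 = x.copy()
a1, a2 = 0.3, 0.1                           # two-step inertial weights: placeholders
tau = 0.5 / np.max(np.sum(A**2, axis=1))    # based on the largest per-sample Lipschitz constant

for epoch in range(30):
    v = full_grad(x)                        # SARAH restart with a full gradient
    prev = x.copy()
    for t in range(n):
        z = x + a1 * (x - x1) + a2 * (x1 - x2)   # two-step inertial extrapolation
        i = rng.integers(n)
        v = grad_i(z, i) - grad_i(prev, i) + v   # SARAH recursive estimator
        prev = z
        x2, x1 = x1, x
        x = soft(z - tau * v, tau * lam)         # (Euclidean) proximal step

print("error:", np.linalg.norm(x - x_true))
```

    SAGA would instead keep a table of per-sample gradients and their running average, as in the iSPALM sketch above.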

    An Accelerated Block Proximal Framework with Adaptive Momentum for Nonconvex and Nonsmooth Optimization

    We propose an accelerated block proximal linear framework with adaptive momentum (ABPL+) for nonconvex and nonsmooth optimization. We analyze the potential causes of the extrapolation step failing in some algorithms, and we resolve this issue by enhancing the comparison step that weighs the proximal gradient step against the linear extrapolation step in our algorithm. Furthermore, we extend the algorithm to any positive integer number of variable blocks, allowing each cycle to randomly shuffle the update order of the blocks. Under mild assumptions, we prove that ABPL+ monotonically decreases the function value without strict restrictions on the extrapolation parameters and step size, demonstrate the viability and effectiveness of updating the blocks in random order, and show directly and intuitively that the accumulation points of the sequence generated by our algorithm are critical points. Moreover, we establish global convergence as well as linear and sublinear convergence rates by means of the Kurdyka-Łojasiewicz (KŁ) condition. To enhance the effectiveness and flexibility of the algorithm, we also study an inexact version and construct an adaptive extrapolation-parameter strategy, which improves overall performance. We apply the algorithm to nonnegative matrix factorization and nonnegative tensor decomposition, both with the ℓ0 norm, and perform extensive numerical experiments to validate its effectiveness and efficiency.
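
    As one plausible reading of the comparison and adaptive-momentum steps, the following NumPy sketch alternates over randomly shuffled blocks of a toy nonnegative factorization problem, accepts the extrapolated candidate only when it wins on objective value, and grows or shrinks the momentum weight accordingly. The update rule for beta and the nonnegativity projection (standing in for an ℓ0 proximal map, which would be a hard-thresholding step) are illustrative assumptions, not the authors' exact procedure.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy two-block problem: min 0.5*||M - X @ Y||_F^2  s.t. X >= 0, Y >= 0
m, n, r = 40, 60, 4
M = np.abs(rng.normal(size=(m, n)))
X, Y = np.abs(rng.normal(size=(m, r))), np.abs(rng.normal(size=(r, n)))
X1, Y1 = X.copy(), Y.copy()                  # previous iterates for extrapolation
f = lambda X, Y: 0.5 * np.linalg.norm(M - X @ Y) ** 2

def block_update(V, V1, grad, lip, obj, beta):
    t = 1.0 / max(lip, 1e-8)                      # 1/Lipschitz step size for this block
    Z = V + beta * (V - V1)                       # linear extrapolation candidate
    cand_e = np.maximum(Z - t * grad(Z), 0.0)     # prox-gradient step from Z
    cand_p = np.maximum(V - t * grad(V), 0.0)     # plain proximal gradient step
    # Comparison step: keep whichever candidate has the lower objective, and
    # adapt the momentum weight (grow on success, shrink on failure).
    if obj(cand_e) <= obj(cand_p):
        return cand_e, V, min(1.1 * beta, 0.9)
    return cand_p, V, 0.5 * beta

beta = 0.5                                        # initial momentum weight: placeholder
for k in range(300):
    for blk in rng.permutation(2):                # randomly shuffled block order
        if blk == 0:
            X, X1, beta = block_update(X, X1, lambda V: (V @ Y - M) @ Y.T,
                                       np.linalg.norm(Y @ Y.T, 2),
                                       lambda V: f(V, Y), beta)
        else:
            Y, Y1, beta = block_update(Y, Y1, lambda V: X.T @ (X @ V - M),
                                       np.linalg.norm(X.T @ X, 2),
                                       lambda V: f(X, V), beta)

print("final objective:", f(X, Y))
```

    Because the plain proximal gradient candidate with a 1/Lipschitz step always decreases the objective, falling back to it whenever extrapolation loses the comparison is what makes the monotone decrease claim plausible in this sketch.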