
25 research outputs found

Stochastic Methods for Composite and Weakly Convex Optimization Problems

Authors: Duchi John; Ruan Feng
Publication date: 21/09/2018

We consider minimization of stochastic functionals that are compositions of a (potentially) non-smooth convex function $h$ and a smooth function $c$ and, more generally, stochastic weakly-convex functionals. We develop a family of stochastic methods, including a stochastic prox-linear algorithm and a stochastic (generalized) sub-gradient procedure, and prove that, under mild technical conditions, each converges to first-order stationary points of the stochastic objective. We provide experiments further investigating our methods on non-smooth phase retrieval problems; the experiments indicate the practical effectiveness of the procedures.

arXiv.org e-Print Archive
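
To make the stochastic (generalized) sub-gradient procedure named in the abstract above more concrete, here is a minimal sketch on a phase retrieval loss. It is not the authors' implementation. The loss form `|<a, x>^2 - b|` (so `h = |.|` and `c(x) = <a, x>^2 - b`), the `step0 / sqrt(k)` step size, and all function names are assumptions made for illustration.

```python
import random


def inner(a, x):
    """Dot product of two equal-length lists."""
    return sum(ai * xi for ai, xi in zip(a, x))


def phase_loss(x, a, b):
    """Non-smooth loss h(c(x)) with h = |.| and c(x) = <a, x>^2 - b."""
    return abs(inner(a, x) ** 2 - b)


def phase_subgradient(x, a, b):
    """One generalized subgradient of phase_loss at x, via the chain rule."""
    ax = inner(a, x)
    residual = ax * ax - b
    sign = (residual > 0) - (residual < 0)  # 0 is a valid choice at the kink
    return [sign * 2.0 * ax * ai for ai in a]


def stochastic_subgradient(x0, data, steps, step0=0.1, seed=0):
    """Iterate x <- x - alpha_k * g_k using one random measurement per step."""
    rng = random.Random(seed)
    x = list(x0)
    for k in range(1, steps + 1):
        a, b = rng.choice(data)        # sample one measurement (a_i, b_i)
        g = phase_subgradient(x, a, b)
        alpha = step0 / k ** 0.5       # assumed diminishing step size
        x = [xi - alpha * gi for xi, gi in zip(x, g)]
    return x
```

Here `data` is a list of `(a, b)` pairs, for example `[([1.0, 0.0], 1.0), ([0.0, 1.0], 4.0)]`. A prox-linear variant would instead linearize only `c` and solve a small convex subproblem at each step.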

Minimization of nonsmooth nonconvex functions using inexact evaluations and its worst-case complexity

Authors: Gratton S.; Simon E.; Toint Ph. L.
Publication date: 27/02/2019

An adaptive regularization algorithm using inexact function and derivatives evaluations is proposed for the solution of composite nonsmooth nonconvex optimization. It is shown that this algorithm needs at most $O(|\log(\epsilon)|\,\epsilon^{-2})$ evaluations of the problem's functions and their derivatives for finding an $\epsilon$-approximate first-order stationary point. This complexity bound therefore generalizes that provided by [Bellavia, Gurioli, Morini and Toint, 2018] for inexact methods for smooth nonconvex problems, and is within a factor $|\log(\epsilon)|$ of the optimal bound known for smooth and nonsmooth nonconvex minimization with exact evaluations. A practically more restrictive variant of the algorithm with worst-case complexity $O(|\log(\epsilon)|+\epsilon^{-2})$ is also presented.

Comment: 19 pages

arXiv.org e-Print Archive

Convergence of a Stochastic Subgradient Method with Averaging for Nonsmooth Nonconvex Constrained Optimization

Author: Ruszczynski Andrzej
Publication date: 16/12/2019

We prove convergence of a single time-scale stochastic subgradient method with subgradient averaging for constrained problems with a nonsmooth and nonconvex objective function having the property of generalized differentiability. As a tool for our analysis, we also prove a chain rule on a path for such functions.

arXiv.org e-Print Archive

Automatic Registration and Clustering of Time Series

Authors: Michailidis George; Weylandt Michael
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 10/02/2021

Clustering of time series data exhibits a number of challenges not present in other settings, notably the problem of registration (alignment) of observed signals. Typical approaches include pre-registration to a user-specified template or time warping approaches which attempt to optimally align series with a minimum of distortion. For many signals obtained from recording or sensing devices, these methods may be unsuitable as a template signal is not available for pre-registration, while the distortion of warping approaches may obscure meaningful temporal information. We propose a new method for automatic time series alignment within a clustering problem. Our approach, Temporal Registration using Optimal Unitary Transformations (TROUT), is based on a novel dissimilarity measure between time series that is easy to compute and automatically identifies optimal alignment between pairs of time series. By embedding our new measure in an optimization formulation, we retain well-known advantages of computational and statistical performance. We provide an efficient algorithm for TROUT-based clustering and demonstrate its superior performance over a range of competitors.

Comment: To appear in ICASSP 202

arXiv.org e-Print Archive

Zero Order Stochastic Weakly Convex Composite Optimization

Authors: Kungurtsev V.; Rinaldi F.
Publication date: 19/02/2020

In this paper we consider stochastic weakly convex composite problems, but without access to a stochastic subgradient oracle. We present a derivative-free algorithm that uses a two-point approximation to compute a gradient estimate of the smoothed function. We prove convergence at a rate similar to that of state-of-the-art methods, although with a larger constant, and report numerical results showing the effectiveness of the approach.

arXiv.org e-Print Archive
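
The two-point gradient estimate mentioned in the abstract above can be sketched as follows. This is one standard form of the estimator, assuming a Gaussian random direction `u` and a symmetric difference with smoothing radius `mu`. The paper's exact estimator and constants may differ, and the function names are hypothetical.

```python
import random


def two_point_estimate(f, x, u, mu):
    """Symmetric two-point estimate (f(x + mu*u) - f(x - mu*u)) / (2*mu) * u."""
    x_plus = [xi + mu * ui for xi, ui in zip(x, u)]
    x_minus = [xi - mu * ui for xi, ui in zip(x, u)]
    scale = (f(x_plus) - f(x_minus)) / (2.0 * mu)
    return [scale * ui for ui in u]


def zero_order_step(f, x, mu, alpha, rng):
    """One derivative-free step along a random Gaussian direction (assumed)."""
    u = [rng.gauss(0.0, 1.0) for _ in x]
    g = two_point_estimate(f, x, u, mu)
    return [xi - alpha * gi for xi, gi in zip(x, g)]
```

Only function values are queried, so `f` may be non-smooth. With standard Gaussian directions, averaging the estimate over many draws of `u` recovers the gradient exactly when `f` is linear.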

ODE Analysis of Stochastic Gradient Methods with Optimism and Anchoring for Minimax Problems and GANs

Authors: Ryu Ernest K.; Yin Wotao; Yuan Kun
Publication date: 05/06/2019

Despite remarkable empirical success, the training dynamics of generative adversarial networks (GANs), which involve solving a minimax game using stochastic gradients, are still poorly understood. In this work, we analyze last-iterate convergence of simultaneous gradient descent (simGD) and its variants under the assumption of convex-concavity, guided by a continuous-time analysis with differential equations. First, we show that simGD, as is, converges with stochastic subgradients under strict convexity in the primal variable. Second, we generalize optimistic simGD to accommodate an optimism rate separate from the learning rate and show its convergence with full gradients. Finally, we present anchored simGD, a new method, and show convergence with stochastic subgradients.

arXiv.org e-Print Archive
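
As a rough illustration of the abstract above, the sketch below compares plain simGD with an optimistic variant on the bilinear game `L(x, y) = x * y`, using full gradients. The test problem, the update form `alpha * g_k + beta * (g_k - g_{k-1})` with a separate optimism rate `beta`, and the function names are assumptions rather than the paper's exact scheme. Anchored simGD is not shown.

```python
def sim_gd(x, y, alpha, steps):
    """Simultaneous gradient descent-ascent on L(x, y) = x * y."""
    for _ in range(steps):
        gx, gy = y, x                  # dL/dx = y, dL/dy = x
        x, y = x - alpha * gx, y + alpha * gy
    return x, y


def optimistic_sim_gd(x, y, alpha, beta, steps):
    """simGD plus an optimism correction beta * (g_k - g_{k-1})."""
    prev_gx, prev_gy = y, x            # no correction on the first step
    for _ in range(steps):
        gx, gy = y, x
        x, y = (x - alpha * gx - beta * (gx - prev_gx),
                y + alpha * gy + beta * (gy - prev_gy))
        prev_gx, prev_gy = gx, gy
    return x, y
```

On this game each simGD step multiplies the squared distance to the saddle point `(0, 0)` by `1 + alpha**2`, so the iterates spiral outward, while the optimistic variant with small `alpha = beta` contracts toward the saddle point.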

Adaptive First-and Zeroth-order Methods for Weakly Convex Stochastic Optimization Problems

Authors: Michailidis George; Nazari Parvin; Tarzanagh Davoud Ataee
Publication date: 24/05/2020

In this paper, we design and analyze a new family of adaptive subgradient methods for solving an important class of weakly convex (possibly nonsmooth) stochastic optimization problems. Adaptive methods that use exponential moving averages of past gradients to update search directions and learning rates have recently attracted a lot of attention for solving optimization problems that arise in machine learning. Nevertheless, their convergence analysis almost exclusively requires smoothness and/or convexity of the objective function. In contrast, we establish non-asymptotic rates of convergence of first- and zeroth-order adaptive methods and their proximal variants for a reasonably broad class of nonsmooth and nonconvex optimization problems. Experimental results indicate that the proposed algorithms empirically outperform stochastic gradient descent and its zeroth-order variant on such optimization problems.

arXiv.org e-Print Archive
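
The idea of using exponential moving averages of past (sub)gradients for both the search direction and the learning rate, as described in the abstract above, can be sketched with an Adam-style update. This is a generic illustration under stated assumptions, not the paper's algorithm: the test function `f(x) = |x^2 - 1|`, the omission of bias correction, and the default parameters are all choices made here.

```python
import math


def subgradient(x):
    """A subgradient of the weakly convex function f(x) = |x^2 - 1|."""
    residual = x * x - 1.0
    sign = (residual > 0) - (residual < 0)
    return sign * 2.0 * x


def adaptive_subgradient(x, steps, alpha=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    """Adam-style iteration: EMA direction m, EMA scale v (no bias correction)."""
    m, v = 0.0, 0.0
    for _ in range(steps):
        g = subgradient(x)
        m = beta1 * m + (1.0 - beta1) * g        # moving average of subgradients
        v = beta2 * v + (1.0 - beta2) * g * g    # moving average of squared subgradients
        x = x - alpha * m / (math.sqrt(v) + eps)
    return x
```

Starting from `x = 2.0`, the iterates move toward the minimizer at `x = 1` and then oscillate around it, since the step size is held constant.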

Learning Latent Features with Pairwise Penalties in Low-Rank Matrix Completion

Authors: Chi Yuejie; Ji Kaiyi; Tan Jian; Xu Jinfeng
Publication date: 26/01/2020

Low-rank matrix completion has achieved great success in many real-world data applications. A matrix factorization model that learns latent features is usually employed and, to improve prediction performance, the similarities between latent variables can be exploited by pairwise learning using the graph regularized matrix factorization (GRMF) method. However, existing GRMF approaches often use the squared loss to measure the pairwise differences, which may be overly influenced by dissimilar pairs and lead to inferior prediction. To fully empower pairwise learning for matrix completion, we propose a general optimization framework that allows a rich class of (non-)convex pairwise penalty functions. A new and efficient algorithm is developed to solve the proposed optimization problem, with a theoretical convergence guarantee under mild assumptions. In an important situation where the latent variables form a small number of subgroups, its statistical guarantee is also analyzed. In particular, we theoretically characterize the performance of the complexity-regularized maximum likelihood estimator, as a special case of our framework, which is shown to have smaller errors when compared to the standard matrix completion framework without pairwise penalties. We conduct extensive experiments on both synthetic and real datasets to demonstrate the superior performance of this general framework.

arXiv.org e-Print Archive

A Manifold Proximal Linear Method for Sparse Spectral Clustering with Application to Single-Cell RNA Sequencing Data Analysis

Authors: Chen Shixiang; Liu Bingyuan; Ma Shiqian; Wang Zhongruo; Xue Lingzhou; Zhao Hongyu
Publication date: 30/10/2020

Spectral clustering is one of the fundamental unsupervised learning methods widely used in data analysis. Sparse spectral clustering (SSC) imposes sparsity on spectral clustering, which improves the interpretability of the model. This paper considers a widely adopted model for SSC, which can be formulated as an optimization problem over the Stiefel manifold with a nonsmooth and nonconvex objective. Such an optimization problem is very challenging to solve. Existing methods usually solve its convex relaxation or need to smooth its nonsmooth part using certain smoothing techniques. In this paper, we propose a manifold proximal linear method (ManPL) that solves the original SSC formulation. We also extend the algorithm to solve the multiple-kernel SSC problems, for which an alternating ManPL algorithm is proposed. Convergence and iteration complexity results of the proposed methods are established. We demonstrate the advantage of our proposed methods over existing methods through single-cell RNA sequencing data analysis.

arXiv.org e-Print Archive

A Stochastic Subgradient Method for Nonsmooth Nonconvex Multi-Level Composition Optimization

Author: Ruszczynski Andrzej
Publication date: 18/12/2020

We propose a single time-scale stochastic subgradient method for constrained optimization of a composition of several nonsmooth and nonconvex functions. The functions are assumed to be locally Lipschitz and differentiable in a generalized sense. Only stochastic estimates of the values and generalized derivatives of the functions are used. The method is parameter-free. We prove that the method converges with probability one by associating with it a system of differential inclusions and devising a nondifferentiable Lyapunov function for this system. For problems with functions having Lipschitz continuous derivatives, the method finds a point satisfying an optimality measure with error of order $1/\sqrt{N}$ after executing $N$ iterations with constant stepsize.

arXiv.org e-Print Archive