Search CORE

49,545 research outputs found

Successive Convex Approximation Algorithms for Sparse Signal Estimation with Nonconvex Regularizations

Author: Chatzinotas Symeon
Ottersten Björn
Pesavento Marius
Yang Yang
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 28/06/2018
Field of study

In this paper, we propose a successive convex approximation framework for sparse optimization where the nonsmooth regularization function in the objective function is nonconvex and it can be written as the difference of two convex functions. The proposed framework is based on a nontrivial combination of the majorization-minimization framework and the successive convex approximation framework proposed in literature for a convex regularization function. The proposed framework has several attractive features, namely, i) flexibility, as different choices of the approximate function lead to different type of algorithms; ii) fast convergence, as the problem structure can be better exploited by a proper choice of the approximate function and the stepsize is calculated by the line search; iii) low complexity, as the approximate function is convex and the line search scheme is carried out over a differentiable function; iv) guaranteed convergence to a stationary point. We demonstrate these features by two example applications in subspace learning, namely, the network anomaly detection problem and the sparse subspace clustering problem. Customizing the proposed framework by adopting the best-response type approximation, we obtain soft-thresholding with exact line search algorithms for which all elements of the unknown parameter are updated in parallel according to closed-form expressions. The attractive features of the proposed algorithms are illustrated numerically.Comment: submitted to IEEE Journal of Selected Topics in Signal Processing, special issue in Robust Subspace Learnin

arXiv.org e-Print Archive

Stochastic Training of Neural Networks via Successive Convex Approximations

Author: Di Lorenzo Paolo
Scardapane Simone
Publication venue
Publication date: 15/06/2017
Field of study

This paper proposes a new family of algorithms for training neural networks (NNs). These are based on recent developments in the field of non-convex optimization, going under the general name of successive convex approximation (SCA) techniques. The basic idea is to iteratively replace the original (non-convex, highly dimensional) learning problem with a sequence of (strongly convex) approximations, which are both accurate and simple to optimize. Differently from similar ideas (e.g., quasi-Newton algorithms), the approximations can be constructed using only first-order information of the neural network function, in a stochastic fashion, while exploiting the overall structure of the learning problem for a faster convergence. We discuss several use cases, based on different choices for the loss function (e.g., squared loss and cross-entropy loss), and for the regularization of the NN's weights. We experiment on several medium-sized benchmark problems, and on a large-scale dataset involving simulated physical data. The results show how the algorithm outperforms state-of-the-art techniques, providing faster convergence to a better minimum. Additionally, we show how the algorithm can be easily parallelized over multiple computational units without hindering its performance. In particular, each computational unit can optimize a tailored surrogate function defined on a randomly assigned subset of the input variables, whose dimension can be selected depending entirely on the available computational power.Comment: Preprint submitted to IEEE Transactions on Neural Networks and Learning System

arXiv.org e-Print Archive

Archivio della ricerca- Università di Roma La Sapienza

Generalized Separable Nonnegative Matrix Factorization

Author: Gillis Nicolas
Pan Junjun
Publication venue
Publication date: 01/01/2019
Field of study

Nonnegative matrix factorization (NMF) is a linear dimensionality technique for nonnegative data with applications such as image analysis, text mining, audio source separation and hyperspectral unmixing. Given a data matrix

M

and a factorization rank

r

, NMF looks for a nonnegative matrix

W

with

r

columns and a nonnegative matrix

H

with

r

rows such that

M \approx WH

. NMF is NP-hard to solve in general. However, it can be computed efficiently under the separability assumption which requires that the basis vectors appear as data points, that is, that there exists an index set

\mathcal{K}

such that

W = M(:,\mathcal{K})

. In this paper, we generalize the separability assumption: We only require that for each rank-one factor

W(:,k)H(k,:)

for

k=1,2,\dots,r

, either

W(:,k) = M(:,j)

for some

j

H(k,:) = M(i,:)

for some

i

. We refer to the corresponding problem as generalized separable NMF (GS-NMF). We discuss some properties of GS-NMF and propose a convex optimization model which we solve using a fast gradient method. We also propose a heuristic algorithm inspired by the successive projection algorithm. To verify the effectiveness of our methods, we compare them with several state-of-the-art separable NMF algorithms on synthetic, document and image data sets.Comment: 31 pages, 12 figures, 4 tables. We have added discussions about the identifiability of the model, we have modified the first synthetic experiment, we have clarified some aspects of the contributio

arXiv.org e-Print Archive

An Inexact Successive Quadratic Approximation Method for Convex L-1 Regularized Optimization

Author: Byrd Richard H.
Nocedal Jorge
Oztoprak Figen
Publication venue
Publication date: 13/09/2013
Field of study

We study a Newton-like method for the minimization of an objective function that is the sum of a smooth convex function and an l-1 regularization term. This method, which is sometimes referred to in the literature as a proximal Newton method, computes a step by minimizing a piecewise quadratic model of the objective function. In order to make this approach efficient in practice, it is imperative to perform this inner minimization inexactly. In this paper, we give inexactness conditions that guarantee global convergence and that can be used to control the local rate of convergence of the iteration. Our inexactness conditions are based on a semi-smooth function that represents a (continuous) measure of the optimality conditions of the problem, and that embodies the soft-thresholding iteration. We give careful consideration to the algorithm employed for the inner minimization, and report numerical results on two test sets originating in machine learning

arXiv.org e-Print Archive

CiteSeerX

Inexact Block Coordinate Descent Algorithms for Nonsmooth Nonconvex Optimization

Author: Luo Zhi-Quan
Ottersten Björn
Pesavento Marius
Yang Yang
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 11/12/2019
Field of study

In this paper, we propose an inexact block coordinate descent algorithm for large-scale nonsmooth nonconvex optimization problems. At each iteration, a particular block variable is selected and updated by inexactly solving the original optimization problem with respect to that block variable. More precisely, a local approximation of the original optimization problem is solved. The proposed algorithm has several attractive features, namely, i) high flexibility, as the approximation function only needs to be strictly convex and it does not have to be a global upper bound of the original function; ii) fast convergence, as the approximation function can be designed to exploit the problem structure at hand and the stepsize is calculated by the line search; iii) low complexity, as the approximation subproblems are much easier to solve and the line search scheme is carried out over a properly constructed differentiable function; iv) guaranteed convergence of a subsequence to a stationary point, even when the objective function does not have a Lipschitz continuous gradient. Interestingly, when the approximation subproblem is solved by a descent algorithm, convergence of a subsequence to a stationary point is still guaranteed even if the approximation subproblem is solved inexactly by terminating the descent algorithm after a finite number of iterations. These features make the proposed algorithm suitable for large-scale problems where the dimension exceeds the memory and/or the processing capability of the existing hardware. These features are also illustrated by several applications in signal processing and machine learning, for instance, network anomaly detection and phase retrieval

arXiv.org e-Print Archive