Search CORE

948 research outputs found

Optimization with Sparsity-Inducing Penalties

Author: Bach Francis
Jenatton Rodolphe
Mairal Julien
Obozinski Guillaume
Publication venue
Publication date: 01/01/2011
Field of study

Sparse estimation methods are aimed at using or obtaining parsimonious representations of data or models. They were first dedicated to linear variable selection but numerous extensions have now emerged such as structured sparsity or kernel selection. It turns out that many of the related estimation problems can be cast as convex optimization problems by regularizing the empirical risk with appropriate non-smooth norms. The goal of this paper is to present from a general perspective optimization tools and techniques dedicated to such sparsity-inducing penalties. We cover proximal methods, block-coordinate descent, reweighted

\ell_2

-penalized techniques, working-set and homotopy methods, as well as non-convex formulations and extensions, and provide an extensive set of experiments to compare various algorithms from a computational point of view

arXiv.org e-Print Archive

CiteSeerX

INRIA a CCSD electronic archive server

Hybrid Random/Deterministic Parallel Algorithms for Nonconvex Big Data Optimization

Author: Daneshmand Amir
Facchinei Francisco
Kungurtsev Vyacheslav
Scutari Gesualdo
Publication venue
Publication date: 02/09/2014
Field of study

We propose a decomposition framework for the parallel optimization of the sum of a differentiable {(possibly nonconvex)} function and a nonsmooth (possibly nonseparable), convex one. The latter term is usually employed to enforce structure in the solution, typically sparsity. The main contribution of this work is a novel \emph{parallel, hybrid random/deterministic} decomposition scheme wherein, at each iteration, a subset of (block) variables is updated at the same time by minimizing local convex approximations of the original nonconvex function. To tackle with huge-scale problems, the (block) variables to be updated are chosen according to a \emph{mixed random and deterministic} procedure, which captures the advantages of both pure deterministic and random update-based schemes. Almost sure convergence of the proposed scheme is established. Numerical results show that on huge-scale problems the proposed hybrid random/deterministic algorithm outperforms both random and deterministic schemes.Comment: The order of the authors is alphabetica

arXiv.org e-Print Archive

CiteSeerX

Archivio della ricerca- Università di Roma La Sapienza

CoCoA: A General Framework for Communication-Efficient Distributed Optimization

Author: Forte Simone
Jaggi Martin
Jordan Michael I.
Ma Chenxin
Smith Virginia
Takac Martin
Publication venue
Publication date: 21/06/2017
Field of study

The scale of modern datasets necessitates the development of efficient distributed optimization methods for machine learning. We present a general-purpose framework for distributed computing environments, CoCoA, that has an efficient communication scheme and is applicable to a wide variety of problems in machine learning and signal processing. We extend the framework to cover general non-strongly-convex regularizers, including L1-regularized problems like lasso, sparse logistic regression, and elastic net regularization, and show how earlier work can be derived as a special case. We provide convergence guarantees for the class of convex regularized loss minimization objectives, leveraging a novel approach in handling non-strongly-convex regularizers and non-smooth loss functions. The resulting framework has markedly improved performance over state-of-the-art methods, as we illustrate with an extensive set of experiments on real distributed datasets

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

Repository for Publications and Research Data

Practical Inexact Proximal Quasi-Newton Method with Global Complexity Analysis

Author: Scheinberg Katya
Tang Xiaocheng
Publication venue
Publication date: 14/07/2015
Field of study

Recently several methods were proposed for sparse optimization which make careful use of second-order information [10, 28, 16, 3] to improve local convergence rates. These methods construct a composite quadratic approximation using Hessian information, optimize this approximation using a first-order method, such as coordinate descent and employ a line search to ensure sufficient descent. Here we propose a general framework, which includes slightly modified versions of existing algorithms and also a new algorithm, which uses limited memory BFGS Hessian approximations, and provide a novel global convergence rate analysis, which covers methods that solve subproblems via coordinate descent

arXiv.org e-Print Archive

CiteSeerX