Search CORE

3,000 research outputs found

DC Proximal Newton for Non-Convex Optimization Problems

Author: Flamary Remi
Gasso Gilles
Rakotomamonjy Alain
Publication venue
Publication date: 01/01/2015
Field of study

We introduce a novel algorithm for solving learning problems where both the loss function and the regularizer are non-convex but belong to the class of difference of convex (DC) functions. Our contribution is a new general purpose proximal Newton algorithm that is able to deal with such a situation. The algorithm consists in obtaining a descent direction from an approximation of the loss function and then in performing a line search to ensure sufficient descent. A theoretical analysis is provided showing that the iterates of the proposed algorithm {admit} as limit points stationary points of the DC objective function. Numerical experiments show that our approach is more efficient than current state of the art for a problem with a convex loss functions and non-convex regularizer. We have also illustrated the benefit of our algorithm in high-dimensional transductive learning problem where both loss function and regularizers are non-convex

arXiv.org e-Print Archive

HAL - Normandie Université

Optimization with Sparsity-Inducing Penalties

Author: Bach Francis
Jenatton Rodolphe
Mairal Julien
Obozinski Guillaume
Publication venue
Publication date: 01/01/2011
Field of study

Sparse estimation methods are aimed at using or obtaining parsimonious representations of data or models. They were first dedicated to linear variable selection but numerous extensions have now emerged such as structured sparsity or kernel selection. It turns out that many of the related estimation problems can be cast as convex optimization problems by regularizing the empirical risk with appropriate non-smooth norms. The goal of this paper is to present from a general perspective optimization tools and techniques dedicated to such sparsity-inducing penalties. We cover proximal methods, block-coordinate descent, reweighted

\ell_2

-penalized techniques, working-set and homotopy methods, as well as non-convex formulations and extensions, and provide an extensive set of experiments to compare various algorithms from a computational point of view

arXiv.org e-Print Archive

CiteSeerX

INRIA a CCSD electronic archive server

A quasi-Newton proximal splitting method

Author: Becker Stephen
Fadili M. Jalal
Publication venue
Publication date: 06/06/2012
Field of study

A new result in convex analysis on the calculation of proximity operators in certain scaled norms is derived. We describe efficient implementations of the proximity calculation for a useful class of functions; the implementations exploit the piece-wise linear nature of the dual problem. The second part of the paper applies the previous result to acceleration of convex minimization problems, and leads to an elegant quasi-Newton method. The optimization method compares favorably against state-of-the-art alternatives. The algorithm has extensive applications including signal processing, sparse recovery and machine learning and classification

arXiv.org e-Print Archive

HAL - Normandie Université

Hal-Diderot

Smoothing Proximal Gradient Method for General Structured Sparse Learning

Author: Eric
Jaime G. Carbonell
P. Xing
Qihang Lin
Seyoung Kim
Xi Chen
Publication venue
Publication date: 01/01/2010
Field of study

We study the problem of learning high dimensional regression models regularized by a structured-sparsity-inducing penalty that encodes prior structural information on either input or output sides. We consider two widely adopted types of such penalties as our motivating examples: 1) overlapping group lasso penalty, based on the l1/l2 mixed-norm penalty, and 2) graph-guided fusion penalty. For both types of penalties, due to their non-separability, developing an efficient optimization method has remained a challenging problem. In this paper, we propose a general optimization approach, called smoothing proximal gradient method, which can solve the structured sparse regression problems with a smooth convex loss and a wide spectrum of structured-sparsity-inducing penalties. Our approach is based on a general smoothing technique of Nesterov. It achieves a convergence rate faster than the standard first-order method, subgradient method, and is much more scalable than the most widely used interior-point method. Numerical results are reported to demonstrate the efficiency and scalability of the proposed method.Comment: arXiv admin note: substantial text overlap with arXiv:1005.471

arXiv.org e-Print Archive

CiteSeerX