
7 research outputs found

Model Consistency for Learning with Mirror-Stratifiable Regularizers

Author: Fadili Jalal
Garrigos Guillaume
Malick Jérôme
Peyré Gabriel
Publication date: 16/01/2019

Low-complexity non-smooth convex regularizers are routinely used to impose some structure (such as sparsity or low-rank) on the coefficients for linear predictors in supervised learning. Model consistency consists then in selecting the correct structure (for instance support or rank) by regularized empirical risk minimization. It is known that model consistency holds under appropriate non-degeneracy conditions. However, such conditions typically fail for highly correlated designs, and it is observed that regularization methods tend to select larger models. In this work, we provide the theoretical underpinning of this behavior using the notion of mirror-stratifiable regularizers. This class of regularizers encompasses the most well-known in the literature, including the $\ell_1$ or trace norms. It brings into play a pair of primal-dual models, which in turn allows one to locate the structure of the solution using a specific dual certificate. We also show how this analysis is applicable to optimal solutions of the learning problem, and also to the iterates computed by a certain class of stochastic proximal-gradient algorithms. Comment: 14 pages, 4 figures

arXiv.org e-Print Archive

HAL - Normandie Université

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

Hal-Diderot

Model Consistency for Learning with Mirror-Stratifiable Regularizers

Author: Fadili Jalal M.
Garrigos Guillaume
Malick Jérôme
Peyré Gabriel
Publication venue: HAL CCSD
Publication date: 16/04/2019

Low-complexity non-smooth convex regularizers are routinely used to impose some structure (such as sparsity or low-rank) on the coefficients for linear predictors in supervised learning. Model consistency consists then in selecting the correct structure (for instance support or rank) by regularized empirical risk minimization. It is known that model consistency holds under appropriate non-degeneracy conditions. However, such conditions typically fail for highly correlated designs, and it is observed that regularization methods tend to select larger models. In this work, we provide the theoretical underpinning of this behavior using the notion of mirror-stratifiable regularizers. This class of regularizers encompasses the most well-known in the literature, including the $\ell_1$ or trace norms. It brings into play a pair of primal-dual models, which in turn allows one to locate the structure of the solution using a specific dual certificate. We also show how this analysis is applicable to optimal solutions of the learning problem, and also to the iterates computed by a certain class of stochastic proximal-gradient algorithms.

INRIA a CCSD electronic archive server

Nonsmoothness in Machine Learning: specific structure, proximal identification, and applications

Author: Iutzeler Franck
Malick Jérôme
Publication date: 10/11/2020

Nonsmoothness is often a curse for optimization, but it is sometimes a blessing, in particular for applications in machine learning. In this paper, we present the specific structure of nonsmooth optimization problems appearing in machine learning and illustrate how to leverage this structure in practice, for compression, acceleration, or dimension reduction. We pay special attention to the presentation to make it concise and easily accessible, with both simple examples and general results.

arXiv.org e-Print Archive

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

Proximal Gradient methods with Adaptive Subspace Sampling

Author: Grishchenko Dmitry
Iutzeler Franck
Malick Jérôme
Publication date: 28/04/2020

Many applications in machine learning or signal processing involve nonsmooth optimization problems. This nonsmoothness brings a low-dimensional structure to the optimal solutions. In this paper, we propose a randomized proximal gradient method harnessing this underlying structure. We introduce two key components: i) a random subspace proximal gradient algorithm; ii) an identification-based sampling of the subspaces. Their interplay brings a significant performance improvement on typical learning problems in terms of dimensions explored.

arXiv.org e-Print Archive

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server
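
Illustrative sketch (not taken from any of the listed papers): the abstracts above refer to proximal-gradient iterations and to the low-dimensional structure (for instance a sparse support) that nonsmooth regularizers induce. The Python snippet below is a minimal, deterministic proximal-gradient loop for an $\ell_1$-regularized least-squares problem that records the support of each iterate. The function names, the synthetic data, and the choice of step size 1/L are assumptions made for illustration; the stochastic and random-subspace variants described in the abstracts are not reproduced here.

```python
import numpy as np


def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (entrywise soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def lasso_objective(A, b, x, lam):
    """0.5 * ||A x - b||^2 + lam * ||x||_1."""
    return 0.5 * np.sum((A @ x - b) ** 2) + lam * np.sum(np.abs(x))


def proximal_gradient(A, b, lam, n_iter=200):
    """Plain proximal-gradient iterations, started from zero.

    Returns the last iterate, the support (indices of nonzero entries)
    of every iterate, and the objective value at every iterate.
    """
    # Step size 1/L, where L = ||A||_2^2 is the Lipschitz constant
    # of the gradient of the least-squares term.
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    supports, objectives = [], []
    for _ in range(n_iter):
        grad = A.T @ (A @ x - b)
        x = soft_threshold(x - step * grad, step * lam)
        supports.append(tuple(np.flatnonzero(x)))
        objectives.append(lasso_objective(A, b, x, lam))
    return x, supports, objectives


# Synthetic data (hypothetical): a 2-sparse ground truth in dimension 10.
rng = np.random.default_rng(0)
A = rng.standard_normal((40, 10))
x_true = np.zeros(10)
x_true[[1, 4]] = [2.0, -3.0]
b = A @ x_true

x_hat, supports, objectives = proximal_gradient(A, b, lam=1.0)
print("support of first iterate:", supports[0])
print("support of last iterate: ", supports[-1])
```

Comparing the recorded supports across iterations gives an informal view of whether, and when, the iterates settle on a fixed sparsity pattern for a given design matrix and regularization level.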