Model Consistency for Learning with Mirror-Stratifiable Regularizers
Low-complexity non-smooth convex regularizers are routinely used to impose
some structure (such as sparsity or low-rank) on the coefficients for linear
predictors in supervised learning. Model consistency then consists in selecting
the correct structure (for instance support or rank) by regularized empirical
risk minimization.
It is known that model consistency holds under appropriate non-degeneracy
conditions. However, such conditions typically fail for highly correlated
designs and it is observed that regularization methods tend to select larger
models.
In this work, we provide the theoretical underpinning of this behavior using
the notion of mirror-stratifiable regularizers. This class of regularizers
encompasses the best-known ones in the literature, including the ℓ1 or
trace norms. It brings into play a pair of primal-dual models, which in turn
allows one to locate the structure of the solution using a specific dual
certificate.
We also show how this analysis is applicable to optimal solutions of the
learning problem, and also to the iterates computed by a certain class of
stochastic proximal-gradient algorithms.
Comment: 14 pages, 4 figures
Model Consistency for Learning with Mirror-Stratifiable Regularizers
Low-complexity non-smooth convex regularizers are routinely used to impose some structure (such as sparsity or low-rank) on the coefficients for linear predictors in supervised learning. Model consistency then consists in selecting the correct structure (for instance support or rank) by regularized empirical risk minimization. It is known that model consistency holds under appropriate non-degeneracy conditions. However, such conditions typically fail for highly correlated designs and it is observed that regularization methods tend to select larger models. In this work, we provide the theoretical underpinning of this behavior using the notion of mirror-stratifiable regularizers. This class of regularizers encompasses the best-known ones in the literature, including the ℓ1 or trace norms. It brings into play a pair of primal-dual models, which in turn allows one to locate the structure of the solution using a specific dual certificate. We also show how this analysis is applicable to optimal solutions of the learning problem, and also to the iterates computed by a certain class of stochastic proximal-gradient algorithms.
Nonsmoothness in Machine Learning: specific structure, proximal identification, and applications
Nonsmoothness is often a curse for optimization; but it is sometimes a
blessing, in particular for applications in machine learning. In this paper, we
present the specific structure of nonsmooth optimization problems appearing in
machine learning and illustrate how to leverage this structure in practice, for
compression, acceleration, or dimension reduction. We pay special attention
to the presentation to make it concise and easily accessible, with both simple
examples and general results.
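As a rough illustration of the kind of structure identification this abstract refers to, the sketch below runs a plain proximal gradient method on a tiny ℓ1-regularized least-squares problem and reads off the support of the result. It is not taken from the paper: the function names, the 3-by-3 matrix A, the vector b, and the weight lam are all made up for the example, and the data were chosen by hand so that the minimizer is (2, 0, -1).

```python
def soft_threshold(v, t):
    """Proximal operator of t * |.|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def lasso_proximal_gradient(A, b, lam, n_iter=500):
    """Minimize 0.5 * ||A x - b||^2 + lam * ||x||_1 by proximal gradient.

    The step size uses the squared Frobenius norm of A, which upper-bounds
    the Lipschitz constant of the gradient of the smooth part (a safe but
    conservative choice made for this sketch).
    """
    n_rows, n_cols = len(A), len(A[0])
    step = 1.0 / sum(a * a for row in A for a in row)
    x = [0.0] * n_cols
    for _ in range(n_iter):
        # Gradient of the smooth part: A^T (A x - b).
        residual = [
            sum(A[i][j] * x[j] for j in range(n_cols)) - b[i]
            for i in range(n_rows)
        ]
        grad = [
            sum(A[i][j] * residual[i] for i in range(n_rows))
            for j in range(n_cols)
        ]
        # Proximal step: coordinate-wise soft-thresholding.
        x = [
            soft_threshold(x[j] - step * grad[j], step * lam)
            for j in range(n_cols)
        ]
    return x


# Hypothetical data, built so that the minimizer is (2, 0, -1).
A = [[1.0, 0.5, 0.0],
     [0.0, 1.0, 0.5],
     [0.0, 0.0, 1.0]]
b = [3.0, -0.8, -1.85]

x = lasso_proximal_gradient(A, b, lam=1.0)
support = [j for j, xj in enumerate(x) if xj != 0.0]
print("solution:", x)
print("support:", support)
```

In this toy problem the middle coordinate is set exactly to zero by the soft-thresholding step, rather than merely becoming small, so the support can be read directly from the iterate. This is the sort of low-dimensional structure that could then be exploited for compression or dimension reduction.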
Proximal Gradient methods with Adaptive Subspace Sampling
Many applications in machine learning or signal processing involve nonsmooth
optimization problems. This nonsmoothness brings a low-dimensional structure to
the optimal solutions. In this paper, we propose a randomized proximal gradient
method harnessing this underlying structure. We introduce two key components:
i) a random subspace proximal gradient algorithm; ii) an identification-based
sampling of the subspaces. Their interplay brings a significant performance
improvement on typical learning problems in terms of dimensions explored.