Iterative Log Thresholding
Sparse reconstruction approaches using the re-weighted l1-penalty have been
shown, both empirically and theoretically, to provide a significant improvement
in recovering sparse signals in comparison to the l1-relaxation. However,
numerical optimization of such penalties requires solving problems with
l1-norms in the objective many times. Using the direct link between re-weighted
l1-penalties and the concave log-regularizer for sparsity, we derive a simple
prox-like algorithm for the log-regularized formulation. The proximal splitting
step of the algorithm has a closed-form solution, and we call the algorithm
'log-thresholding' by analogy with soft thresholding for the l1-penalty.
We establish convergence results, and demonstrate that log-thresholding
provides more accurate sparse reconstructions compared to both soft and hard
thresholding. Furthermore, the approach extends directly to optimization over
matrices with a rank-promoting penalty (i.e., the nuclear-norm penalty and its
re-weighted version), where we suggest a singular-value log-thresholding
approach.

Comment: 5 pages, 4 figures
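
To make the closed-form proximal step concrete, here is a minimal Python sketch of a scalar log-thresholding operator together with a proximal-gradient loop built on it. It assumes the penalty has the standard form lam * sum_i log(|x_i| + eps); the names log_threshold and ista_log, the default values of eps and lam, and the tie-break against y = 0 are illustrative choices, not taken from the paper's code.

    import numpy as np

    def log_threshold(x, lam, eps):
        # Closed-form prox of the (nonconvex) penalty lam * log(|y| + eps):
        #   argmin_y 0.5 * (y - x)**2 + lam * log(|y| + eps)
        # Vectorized over x. Because the penalty is nonconvex, the positive
        # stationary point must be compared against y = 0 before acceptance.
        ax = np.abs(x)
        disc = (ax + eps) ** 2 - 4.0 * lam   # discriminant of the stationarity equation
        y = np.zeros_like(x, dtype=float)
        ok = disc >= 0                       # no stationary point otherwise: prox is 0
        cand = np.maximum(0.5 * ((ax[ok] - eps) + np.sqrt(disc[ok])), 0.0)
        f_cand = 0.5 * (cand - ax[ok]) ** 2 + lam * np.log(cand + eps)
        f_zero = 0.5 * ax[ok] ** 2 + lam * np.log(eps)
        y[ok] = np.sign(x[ok]) * np.where(f_cand < f_zero, cand, 0.0)
        return y

    def ista_log(A, b, lam=0.1, eps=0.1, n_iter=200):
        # Proximal-gradient iteration for
        #   min_x 0.5 * ||A x - b||^2 + lam * sum_i log(|x_i| + eps)
        L = np.linalg.norm(A, 2) ** 2        # Lipschitz constant of the smooth part
        x = np.zeros(A.shape[1])
        for _ in range(n_iter):
            x = log_threshold(x - A.T @ (A @ x - b) / L, lam / L, eps)
        return x

For large |x| the shrinkage applied by this operator vanishes, so it approaches the identity, unlike soft thresholding, which shifts every surviving coefficient by a constant; this is consistent with the abstract's placement of log-thresholding between soft and hard thresholding.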
Model Consistency for Learning with Mirror-Stratifiable Regularizers
Low-complexity non-smooth convex regularizers are routinely used to impose
some structure (such as sparsity or low-rank) on the coefficients for linear
predictors in supervised learning. Model consistency then consists in
selecting the correct structure (for instance, the support or the rank) by
regularized empirical risk minimization.
It is known that model consistency holds under appropriate non-degeneracy
conditions. However, such conditions typically fail for highly correlated
designs, and regularization methods are then observed to select larger models.
In this work, we provide the theoretical underpinning of this behavior using
the notion of mirror-stratifiable regularizers. This class of regularizers
encompasses the most well-known in the literature, including the l1 or
trace norms. It brings into play a pair of primal-dual models, which in turn
allows one to locate the structure of the solution using a specific dual
certificate.
We also show how this analysis applies to optimal solutions of the learning
problem, and to the iterates computed by a certain class of stochastic
proximal-gradient algorithms.

Comment: 14 pages, 4 figures
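
As a small concrete illustration of the phenomenon discussed above, the following Python sketch runs a deterministic proximal-gradient (ISTA) loop on an l1-regularized least-squares problem, the simplest mirror-stratifiable case, and tracks the support of the iterates. The names lasso_pg and soft_threshold and all problem sizes are illustrative; the paper's analysis concerns general mirror-stratifiable regularizers and stochastic proximal-gradient algorithms, which this toy deterministic loop only gestures at.

    import numpy as np

    def soft_threshold(x, t):
        # prox of t * ||.||_1
        return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

    def lasso_pg(A, b, lam, n_iter=500):
        # ISTA for 0.5 * ||A x - b||^2 + lam * ||x||_1, recording the
        # support of each iterate to watch model identification happen.
        L = np.linalg.norm(A, 2) ** 2
        x = np.zeros(A.shape[1])
        supports = []
        for _ in range(n_iter):
            x = soft_threshold(x - A.T @ (A @ x - b) / L, lam / L)
            supports.append(tuple(np.flatnonzero(np.abs(x) > 1e-10)))
        return x, supports

    rng = np.random.default_rng(0)
    n, p, s = 50, 100, 5
    A = rng.standard_normal((n, p))
    x_true = np.zeros(p)
    x_true[:s] = rng.standard_normal(s) + 2.0   # well-separated nonzeros
    b = A @ x_true + 0.01 * rng.standard_normal(n)
    x_hat, supports = lasso_pg(A, b, lam=0.5)
    last_change = max((k for k in range(1, len(supports))
                       if supports[k] != supports[k - 1]), default=0)
    print("support last changed at iteration:", last_change)
    print("final support:", supports[-1])

On a well-conditioned random design such as this one, the iterates typically identify the true support after finitely many steps; replacing A with a highly correlated design is a quick way to reproduce the larger selected models described in the abstract.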