
    The Degrees of Freedom of the Group Lasso

    This paper studies the sensitivity of the block/group Lasso solution to the observations in an overdetermined linear regression model. Such a regularization is known to promote sparsity patterns structured as non-overlapping groups of coefficients. Our main contribution provides a local parameterization of the solution with respect to the observations. As a byproduct, we give an unbiased estimate of the degrees of freedom of the group Lasso. Among other applications of these results, one can choose the regularization parameter of the group Lasso in a principled and objective way through model selection criteria.
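
    For intuition on the group-sparsity structure this regularization promotes, here is a minimal proximal-gradient (ISTA) solver for the non-overlapping group Lasso; the solver choice, function names, and iteration budget are illustrative assumptions, not the paper's construction.

```python
import numpy as np

def block_soft_threshold(v, t):
    """Proximal operator of t * ||.||_2 applied to one coefficient block."""
    norm = np.linalg.norm(v)
    if norm <= t:
        return np.zeros_like(v)          # the whole group is switched off
    return (1.0 - t / norm) * v

def group_lasso_ista(A, y, groups, lam, n_iter=500):
    """Proximal-gradient sketch for
    min_x 0.5 * ||y - A x||^2 + lam * sum_g ||x_g||_2,
    where `groups` is a partition of the columns into index lists."""
    x = np.zeros(A.shape[1])
    L = np.linalg.norm(A, 2) ** 2        # Lipschitz constant of the gradient
    for _ in range(n_iter):
        z = x - A.T @ (A @ x - y) / L    # gradient step on the data fit
        for g in groups:
            x[g] = block_soft_threshold(z[g], lam / L)  # prox per group
    return x
```

    The set of active (nonzero) groups of such a solution is the structured sparsity pattern the abstract refers to.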

    The degrees of freedom of the Lasso for general design matrix

    In this paper, we investigate the degrees of freedom (DOF) of penalized ℓ1 minimization (also known as the Lasso) for linear regression models. We give a closed-form expression of the DOF of the Lasso response. Namely, we show that for any given Lasso regularization parameter λ and any observed data y belonging to a set of full (Lebesgue) measure, the cardinality of the support of a particular solution of the Lasso problem is an unbiased estimator of the degrees of freedom. This is achieved without requiring uniqueness of the Lasso solution. Thus, our result holds true for both the underdetermined and the overdetermined case, where the latter was originally studied by Zou et al. We also show, by providing a simple counterexample, that although the DOF theorem of Zou et al. is correct, their proof contains a flaw, since their divergence formula holds on a different set of full measure than the one they claim. An effective estimator of the number of degrees of freedom may have several applications, including an objectively guided choice of the regularization parameter in the Lasso through the SURE framework. Our theoretical findings are illustrated through several numerical simulations. Comment: A short version appeared in SPARS'11, June 2011. Previously entitled "The degrees of freedom of penalized l1 minimization".
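
    The recipe in this abstract is directly implementable: use the support size of a Lasso solution as the DOF estimate inside a SURE criterion to select λ. A minimal sketch, assuming scikit-learn's Lasso (whose objective rescales the data-fit term by 1/(2n), so `alpha` is a rescaled λ) and a known noise variance `sigma2`:

```python
import numpy as np
from sklearn.linear_model import Lasso

def sure_select_lambda(X, y, sigma2, alphas):
    """Pick the Lasso regularization level minimizing a SURE-type criterion,
    using the support size of the solution as the unbiased DOF estimate."""
    n = len(y)
    best = None
    for alpha in alphas:
        coef = Lasso(alpha=alpha, fit_intercept=False).fit(X, y).coef_
        dof = np.count_nonzero(coef)                # unbiased DOF estimate
        rss = np.sum((y - X @ coef) ** 2)
        sure = rss - n * sigma2 + 2 * sigma2 * dof  # Stein unbiased risk estimate
        if best is None or sure < best[0]:
            best = (sure, alpha, coef)
    return best[1], best[2]
```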

    Heavy Ball Momentum for Non-Strongly Convex Optimization

    When considering the minimization of a quadratic or strongly convex function, it is well known that first-order methods involving an inertial term weighted by a constant-in-time parameter are particularly efficient (see Polyak [32], Nesterov [28], and references therein). By setting the inertial parameter according to the condition number of the objective function, these methods guarantee a fast exponential decay of the error. We prove that this type of scheme (referred to below as Heavy Ball schemes) remains relevant in a relaxed setting, i.e. for composite functions satisfying a quadratic growth condition. In particular, we adapt V-FISTA, introduced by Beck in [10] for strongly convex functions, to this broader class of functions. To the authors' knowledge, the resulting worst-case convergence rates are faster than any other in the literature, including those of FISTA restart schemes. No assumption on the set of minimizers is required, and guarantees are also given in the non-optimal case, i.e. when the condition number is not exactly known. This analysis follows the study of the corresponding continuous-time dynamical system (the Heavy Ball with friction system), for which new convergence results for the trajectory are shown.
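
    For reference, a minimal sketch of the classical constant-parameter heavy-ball iteration, with the textbook parameter choice for strongly convex quadratics driven by μ and L; the paper's actual contribution (the V-FISTA adaptation under a quadratic growth condition) is not reproduced here.

```python
import numpy as np

def heavy_ball(grad, x0, mu, L, n_iter=1000):
    """Polyak heavy-ball iteration with constant-in-time parameters set from
    the growth parameter mu and the Lipschitz constant L of the gradient."""
    alpha = 4.0 / (np.sqrt(L) + np.sqrt(mu)) ** 2   # step size
    beta = ((np.sqrt(L) - np.sqrt(mu)) /
            (np.sqrt(L) + np.sqrt(mu))) ** 2        # inertial weight
    x_prev, x = x0.copy(), x0.copy()
    for _ in range(n_iter):
        # simultaneous update: gradient step plus momentum on the last move
        x, x_prev = x - alpha * grad(x) + beta * (x - x_prev), x
    return x
```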

    Risk estimation for matrix recovery with spectral regularization

    In this paper, we develop an approach to recursively estimate the quadratic risk for matrix recovery problems regularized with spectral functions. Toward this end, in the spirit of the SURE theory, a key step is to compute the (weak) derivative and divergence of a solution with respect to the observations. As such a solution is not available in closed form, but rather through a proximal splitting algorithm, we propose to recursively compute the divergence from the sequence of iterates. A second challenge that we overcome is the computation of the (weak) derivative of the proximity operator of a spectral function. To show the potential applicability of our approach, we exemplify it on a matrix completion problem to objectively and automatically select the regularization parameter. Comment: This version is an update of our original paper presented at the ICML'2012 workshop on Sparsity, Dictionaries and Projections in Machine Learning and Signal Processing.
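
    For intuition, the sketch below shows the proximity operator of the nuclear norm (singular value thresholding, a standard spectral prox) together with a crude Monte Carlo finite-difference divergence estimate; the paper instead computes the (weak) derivative recursively along the iterates, which this stand-in does not implement.

```python
import numpy as np

def svt(Y, tau):
    """Proximal operator of tau * ||.||_* : singular value thresholding."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def mc_divergence(recover, y, eps=1e-4, seed=0):
    """Monte Carlo finite-difference estimate of the divergence of
    y -> recover(y), usable inside a SURE-type risk estimate."""
    delta = np.random.default_rng(seed).standard_normal(y.shape)
    return np.vdot(delta, recover(y + eps * delta) - recover(y)) / eps
```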

    An evaluation of the sparsity degree for sparse recovery with deterministic measurement matrices

    This paper deals with the estimation of the maximal sparsity degree for which a given measurement matrix allows sparse reconstruction through ℓ1-minimization. This problem is a key issue in various applications featuring particular types of measurement matrices, for instance tomography with a low number of views. While the exact bound is NP-hard to compute, most classical criteria guarantee lower bounds that are numerically too pessimistic. In order to achieve an accurate estimation, we propose an efficient greedy algorithm that provides an upper bound for this maximal sparsity. Based on polytope theory, the algorithm consists in finding sparse vectors that cannot be recovered by ℓ1-minimization. Moreover, in order to deal with noisy measurements, theoretical conditions leading to more restrictive but reasonable bounds are investigated. Numerical results are presented for discrete versions of tomography measurement matrices, which are stacked Radon transforms corresponding to different tomograph views.
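
    As a rough illustration of certifying an upper bound by exhibiting sparse vectors that ℓ1-minimization fails to recover, here is a naive random-search stand-in (the paper's algorithm is greedy and polytope-based, so treat this purely as a sketch); the LP below is the standard split formulation of basis pursuit.

```python
import numpy as np
from scipy.optimize import linprog

def l1_recovers(A, x, tol=1e-6):
    """Check whether basis pursuit min ||z||_1 s.t. Az = Ax returns x,
    via the LP over the split z = z_plus - z_minus, both nonnegative."""
    n = A.shape[1]
    res = linprog(np.ones(2 * n), A_eq=np.hstack([A, -A]), b_eq=A @ x,
                  bounds=[(0, None)] * (2 * n))
    z = res.x[:n] - res.x[n:]
    return np.linalg.norm(z - x) <= tol * max(1.0, np.linalg.norm(x))

def sparsity_upper_bound(A, k_max, trials=200, seed=0):
    """Return the smallest k for which some random k-sparse sign vector
    fails l1-recovery: an upper bound on the guaranteed sparsity degree."""
    rng = np.random.default_rng(seed)
    n = A.shape[1]
    for k in range(1, k_max + 1):
        for _ in range(trials):
            x = np.zeros(n)
            x[rng.choice(n, size=k, replace=False)] = rng.choice([-1.0, 1.0], k)
            if not l1_recovers(A, x):
                return k
    return None                           # no failure found up to k_max
```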

    Parameter-Free FISTA by Adaptive Restart and Backtracking

    We consider a combined restarting and adaptive backtracking strategy for the popular Fast Iterative Shrinkage-Thresholding Algorithm (FISTA), frequently employed for accelerating the convergence speed of large-scale structured convex optimization problems. Several variants of FISTA enjoy a provable linear convergence rate of the form O(e^{-K√(μ/L) n}) for the function values F(x_n), under prior knowledge of the problem conditioning, i.e. of the ratio between the (Łojasiewicz) parameter μ determining the growth of the objective function and the Lipschitz constant L of its smooth component. These parameters are nonetheless hard to estimate in many practical cases. Recent works address the problem by estimating either parameter via suitable adaptive strategies. In our work, both parameters can be estimated at the same time by means of an algorithmic restarting scheme where, at each restart, a non-monotone estimation of L is performed. For this scheme, theoretical convergence results are proved, showing that an O(e^{-K√(μ/L) n}) convergence speed can still be achieved, along with quantitative estimates of the conditioning. The resulting Free-FISTA algorithm is therefore parameter-free. Several numerical results are reported to confirm the practical interest of its use in many exemplar problems.
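
    A simplified sketch of the two ingredients combined here, backtracking on L and function-value adaptive restart; this uses plain monotone backtracking rather than the paper's non-monotone estimation, so it is not Free-FISTA itself, and all names and defaults are illustrative.

```python
import numpy as np

def fista_restart_backtrack(f, grad_f, prox_g, F, x0, L0=1.0, eta=2.0, n_iter=500):
    """FISTA with backtracking on the Lipschitz estimate and function-value
    restart. f/grad_f: smooth part and its gradient; prox_g: prox of the
    nonsmooth part; F: full objective f + g used in the restart test."""
    x, z, t, L = x0.copy(), x0.copy(), 1.0, L0
    F_prev = F(x0)
    for _ in range(n_iter):
        while True:  # backtracking: grow L until the upper bound holds at z
            x_new = prox_g(z - grad_f(z) / L, 1.0 / L)
            d = x_new - z
            if f(x_new) <= f(z) + grad_f(z) @ d + 0.5 * L * (d @ d):
                break
            L *= eta
        t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        z = x_new + ((t - 1.0) / t_new) * (x_new - x)  # momentum step
        x, t = x_new, t_new
        if F(x) > F_prev:       # objective went up: restart, drop momentum
            t, z = 1.0, x.copy()
        F_prev = F(x)
    return x
```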

    A greedy algorithm to extract sparsity degree for l1/l0-equivalence in a deterministic context

    This paper investigates the problem of designing a deterministic system matrix, that is, a measurement matrix, for sparse recovery. An efficient greedy algorithm is proposed in order to extract the class of sparse signals/images which cannot be reconstructed by ℓ1-minimization for a fixed system matrix. Based on polytope theory, the algorithm provides a geometric interpretation of the recovery condition, following the seminal work of Donoho. The paper presents an additional condition, extending the Fuchs/Tropp results, in order to deal with noisy measurements. Simulations are conducted for a tomography-like imaging system, in which the design of the system matrix is a difficult task consisting in selecting the number of views according to the sparsity degree.
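
    The Fuchs/Tropp condition that the paper extends can be evaluated numerically for a candidate support; below is a short sketch of the classical (noiseless) Exact Recovery Coefficient check, with the noisy-measurement extension left to the paper.

```python
import numpy as np

def erc(A, support):
    """Exact Recovery Coefficient of Fuchs/Tropp for a candidate support S:
    l1-recovery of any signal supported on S is guaranteed when erc(A, S) < 1."""
    S = np.asarray(support)
    off = np.setdiff1d(np.arange(A.shape[1]), S)
    pinv_AS = np.linalg.pinv(A[:, S])   # Moore-Penrose pseudoinverse of A_S
    return max(np.linalg.norm(pinv_AS @ A[:, j], 1) for j in off)
```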