20 research outputs found

    Learning Sparsity-Promoting Regularizers using Bilevel Optimization

    We present a method for supervised learning of sparsity-promoting regularizers for denoising signals and images. Sparsity-promoting regularization is a key ingredient in solving modern signal reconstruction problems; however, the operators underlying these regularizers are usually either designed by hand or learned from data in an unsupervised way. The recent success of supervised learning (mainly convolutional neural networks) in solving image reconstruction problems suggests that it could be a fruitful approach to designing regularizers. Towards this end, we propose to denoise signals using a variational formulation with a parametric, sparsity-promoting regularizer, where the parameters of the regularizer are learned to minimize the mean squared error of reconstructions on a training set of ground-truth image and measurement pairs. Training involves solving a challenging bilevel optimization problem; we derive an expression for the gradient of the training loss using the closed-form solution of the denoising problem and provide an accompanying gradient descent algorithm to minimize it. Our experiments with structured 1D signals and natural images show that the proposed method can learn an operator that outperforms well-known regularizers (total variation, DCT-sparsity, and unsupervised dictionary learning) and collaborative filtering for denoising. While the approach we present is specific to denoising, we believe that it could be adapted to the larger class of inverse problems with linear measurement models, giving it applicability in a wide range of signal reconstruction settings.
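
    As a concrete illustration of the bilevel setup described above, the sketch below learns an analysis operator W for 1D denoising: the lower level solves a variational denoising problem with a smoothed (Huber) sparsity penalty, and the upper level reduces the mean squared reconstruction error on a small training set by gradient descent on W. This is only a minimal toy under assumptions of my own (the Huber smoothing, the finite-difference hypergradient, and names such as denoise and fd_hypergradient); the paper instead derives a closed-form expression for the gradient of the training loss.

```python
import numpy as np

def denoise(y, W, lam=0.1, delta=0.05, steps=200):
    # Lower-level problem: argmin_x 0.5*||x - y||^2 + lam * sum(huber_delta(W @ x)),
    # solved by gradient descent; the step size comes from the Lipschitz constant
    # 1 + lam * ||W||_2^2 / delta of the objective's gradient.
    lr = 1.0 / (1.0 + lam * np.linalg.norm(W, 2) ** 2 / delta)
    x = y.copy()
    for _ in range(steps):
        # clip(t / delta, -1, 1) is the derivative of the Huber sparsity surrogate
        x -= lr * ((x - y) + lam * W.T @ np.clip(W @ x / delta, -1.0, 1.0))
    return x

def training_loss(W, clean, noisy, lam=0.1):
    # Upper-level objective: squared reconstruction error over the training pairs.
    return np.mean([np.sum((denoise(y, W, lam) - x) ** 2) for x, y in zip(clean, noisy)])

def fd_hypergradient(W, clean, noisy, eps=1e-4):
    # Finite-difference stand-in for the paper's closed-form gradient of the
    # training loss with respect to the regularizer parameters W.
    g = np.zeros_like(W)
    base = training_loss(W, clean, noisy)
    for idx in np.ndindex(*W.shape):
        Wp = W.copy()
        Wp[idx] += eps
        g[idx] = (training_loss(Wp, clean, noisy) - base) / eps
    return g

rng = np.random.default_rng(0)
n, k, n_train = 16, 8, 5
# Piecewise-constant 1D training signals with additive Gaussian noise.
clean = [np.cumsum(rng.standard_normal(n) * (rng.random(n) < 0.2)) for _ in range(n_train)]
noisy = [x + 0.1 * rng.standard_normal(n) for x in clean]

W = 0.1 * rng.standard_normal((k, n))        # learnable analysis operator
step = 0.1
for it in range(10):                         # outer (bilevel) gradient descent on W
    g = fd_hypergradient(W, clean, noisy)
    loss0 = training_loss(W, clean, noisy)
    while training_loss(W - step * g, clean, noisy) > loss0 and step > 1e-6:
        step *= 0.5                          # crude backtracking on the outer step
    W = W - step * g
    print(f"outer iteration {it}: training loss {training_loss(W, clean, noisy):.4f}")
```

    Replacing the finite-difference loop with a closed-form or implicit-differentiation hypergradient, as the paper does, is what makes the outer iteration practical for operators far larger than this toy.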

    A Riemannian approach to large-scale constrained least-squares with symmetries

    This thesis deals with least-squares optimization on a manifold of equivalence classes, e.g., in the presence of symmetries, which arise frequently in many applications. While least-squares cost functions remain a popular way to model large-scale problems, the additional symmetry constraint should be interpreted as a way to make the modeling robust. Two fundamental examples are the matrix completion problem, a least-squares problem with rank constraints, and the generalized eigenvalue problem, a least-squares problem with orthogonality constraints. The potentially large scale of these problems demands that the problem structure be exploited as much as possible in order to design numerically efficient algorithms. The constrained least-squares problems are tackled in the framework of Riemannian optimization, which has gained much popularity in recent years because orthogonality and rank constraints possess particular symmetries. Previous work on Riemannian optimization has mostly focused on the search space, exploiting the differential geometry of the constraint while disregarding the role of the cost function. We, on the other hand, take both the cost and the constraints into account to design a tailored Riemannian geometry. This is achieved by proposing novel Riemannian metrics. To this end, we show a basic connection between sequential quadratic programming and Riemannian gradient optimization and address the general question of selecting a metric in Riemannian optimization. We revisit quadratic optimization problems with orthogonality and rank constraints by generalizing various existing methods, such as power, inverse, and Rayleigh quotient iterations, and by proposing novel ones that empirically compete with state-of-the-art algorithms. Overall, this thesis deals with exploiting two fundamental structures, least-squares and symmetry, in nonlinear optimization.
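
    As a concrete instance of a quadratic optimization problem with orthogonality constraints, the sketch below runs Riemannian gradient ascent on the Stiefel manifold to maximize trace(X^T A X), a Rayleigh-quotient-type problem whose maximizers span the dominant eigenspace of a symmetric matrix A. It uses the generic embedded (Euclidean) metric with a QR retraction, not the tailored metrics the thesis advocates, and all function names are my own.

```python
import numpy as np

def tangent_project(X, G):
    # Orthogonal projection of an ambient gradient G onto the tangent space of
    # the Stiefel manifold {X : X^T X = I} at X (embedded Euclidean metric).
    S = (X.T @ G + G.T @ X) / 2.0
    return G - X @ S

def qr_retract(Y):
    # Map an ambient matrix back onto the Stiefel manifold via QR factorization.
    Q, R = np.linalg.qr(Y)
    return Q * np.sign(np.diag(R))            # fix column signs for a unique factor

def riemannian_gradient_ascent(A, p, steps=1000, lr=0.2, seed=0):
    # Maximize trace(X^T A X) over n-by-p matrices with orthonormal columns; the
    # maximizer spans the dominant p-dimensional eigenspace of the symmetric A.
    rng = np.random.default_rng(seed)
    X = qr_retract(rng.standard_normal((A.shape[0], p)))
    for _ in range(steps):
        G = 2.0 * A @ X                       # Euclidean gradient of trace(X^T A X)
        xi = tangent_project(X, G)            # Riemannian gradient (embedded metric)
        X = qr_retract(X + lr * xi)           # ascent step followed by retraction
    return X

# Toy check against a dense eigensolver.
rng = np.random.default_rng(1)
M = rng.standard_normal((50, 50))
A = (M + M.T) / 2.0
A /= np.linalg.norm(A, 2)                     # unit spectral norm, so lr=0.2 is safe
X = riemannian_gradient_ascent(A, p=3)
print("Ritz values:    ", np.sort(np.linalg.eigvalsh(X.T @ A @ X))[::-1])
print("top eigenvalues:", np.sort(np.linalg.eigvalsh(A))[-3:][::-1])
```

    The thesis's contribution is to adapt the metric itself to the cost function; the generic embedded metric and retraction above are only the baseline that such tailored geometries aim to improve on.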

    Regularization Methods for High-Dimensional Inference

    High dimensionality is a common problem in statistical inference, and it is becoming more prevalent in modern data analysis settings. While the data of interest often have a large, sometimes unmanageable, dimension, modifications to various well-known techniques can improve performance and aid interpretation. We typically assume that although the predictors lie in a high-dimensional ambient space, they have a lower-dimensional structure that can be exploited through either prior knowledge or estimation. In regression, the structure in the predictors can be taken into account implicitly through regularization. When the underlying structure of the predictors is known, using this knowledge can yield improvements in prediction. We approach this problem through regularization with a known projection based on the structure of the Grassmannian. Using this projection, we obtain improvements over many classical and recent techniques in both regression and classification with only minor modification of a typical least-squares problem. The structure of the predictors can also be taken into account explicitly through dimension reduction. We often wish to have a lower-dimensional representation of the data in order to build more interpretable models or to explore possible connections between predictors. In many problems, we face data whose distribution differs between the stage where the model parameters are estimated and the stage where predictions are made. This causes difficulties when estimating a lower-dimensional structure for the predictors, since that structure may change. We propose methods for estimating a linear dimension reduction that accounts for these discrepancies between data distributions, while incorporating as much of the information in the data as possible into the construction of the predictor structure. These methods are built on regularized maximum likelihood and yield improvements in many regression and classification settings, including those in which the predictor dimension changes between training and testing.
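
    As a small illustration of the first idea, regularizing a regression toward known predictor structure, the sketch below penalizes the part of the coefficient vector that lies outside a given subspace span(U), solving argmin_b ||y - Xb||^2 + lam*||(I - UU^T)b||^2 in closed form. It is a minimal stand-in for the Grassmannian-projection regularizer described above, with the specific penalty, toy data, and names such as projection_regularized_ls being assumptions of my own.

```python
import numpy as np

def projection_regularized_ls(X, y, U, lam=50.0):
    # Least squares with a penalty on the component of b outside span(U):
    #   argmin_b ||y - X b||^2 + lam * ||(I - U U^T) b||^2,
    # where U has orthonormal columns spanning the known predictor subspace.
    p = X.shape[1]
    P = U @ U.T                                  # projector onto the known subspace
    A = X.T @ X + lam * (np.eye(p) - P)          # regularized normal equations
    return np.linalg.solve(A, X.T @ y)

# Toy experiment: true coefficients lie in a 3-dimensional subspace of R^50,
# and there are fewer samples than predictors.
rng = np.random.default_rng(0)
n, p, d = 40, 50, 3
U, _ = np.linalg.qr(rng.standard_normal((p, d)))  # known subspace basis
beta_true = U @ rng.standard_normal(d)
X = rng.standard_normal((n, p))
y = X @ beta_true + 0.1 * rng.standard_normal(n)

beta_proj = projection_regularized_ls(X, y, U)
beta_ridge = np.linalg.solve(X.T @ X + np.eye(p), X.T @ y)   # isotropic ridge baseline
print("projection-regularized error:", np.linalg.norm(beta_proj - beta_true))
print("ridge error:                 ", np.linalg.norm(beta_ridge - beta_true))
```

    Because the penalty shrinks only the directions orthogonal to the known subspace, it leaves the relevant coefficients essentially untouched, which is why it can beat an isotropic ridge penalty when the prior structure is accurate.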
