48 research outputs found

    A hierarchical sparsity-smoothness Bayesian model for ℓ0 + ℓ1 + ℓ2 regularization

    Sparse signal/image recovery is a challenging topic that has attracted great interest over the last decades. To address the ill-posedness of the related inverse problem, regularization is often essential, using appropriate priors that promote the sparsity of the target signal/image. In this context, ℓ0 + ℓ1 regularization has been widely investigated. In this paper, we introduce a new prior accounting simultaneously for both sparsity and smoothness of restored signals. We use a Bernoulli-generalized Gauss-Laplace distribution to perform ℓ0 + ℓ1 + ℓ2 regularization in a Bayesian framework. Our results show the potential of the proposed approach, especially in restoring the non-zero coefficients of the signal/image of interest.

    Optimization with Sparsity-Inducing Penalties

    Sparse estimation methods are aimed at using or obtaining parsimonious representations of data or models. They were first dedicated to linear variable selection but numerous extensions have now emerged such as structured sparsity or kernel selection. It turns out that many of the related estimation problems can be cast as convex optimization problems by regularizing the empirical risk with appropriate non-smooth norms. The goal of this paper is to present from a general perspective optimization tools and techniques dedicated to such sparsity-inducing penalties. We cover proximal methods, block-coordinate descent, reweighted ℓ2-penalized techniques, working-set and homotopy methods, as well as non-convex formulations and extensions, and provide an extensive set of experiments to compare various algorithms from a computational point of view.
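
To make the "proximal methods" named in the abstract above concrete, here is a minimal sketch of proximal gradient descent (often called ISTA) for the ℓ1-regularized least-squares problem min_x 0.5·||Ax − y||² + λ·||x||_1. It is an illustration only, not code from the paper: the function names `soft_threshold` and `ista`, the pure-Python list representation, and the fixed step size are all assumptions made for this example. The step should not exceed 1/L, where L is the largest eigenvalue of AᵀA.

```python
import math


def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1: shrink each entry toward zero by t."""
    return [math.copysign(max(abs(x) - t, 0.0), x) for x in v]


def ista(A, y, lam, step, iters=100):
    """Proximal gradient sketch for 0.5 * ||A x - y||^2 + lam * ||x||_1.

    A is a list of rows, y a list of observations (hypothetical interface).
    """
    n = len(A[0])
    x = [0.0] * n
    for _ in range(iters):
        # Residual r = A x - y
        r = [sum(a * xj for a, xj in zip(row, x)) - yi for row, yi in zip(A, y)]
        # Gradient of the smooth term: g = A^T r
        g = [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(n)]
        # Gradient step followed by the l1 proximal step
        x = soft_threshold([xj - step * gj for xj, gj in zip(x, g)], step * lam)
    return x
```

When A is the identity and the step is 1, a single iteration already returns `soft_threshold(y, lam)`, which is a convenient sanity check for the sketch.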

    Proximal Methods for Hierarchical Sparse Coding

    Sparse coding consists in representing signals as sparse linear combinations of atoms selected from a dictionary. We consider an extension of this framework where the atoms are further assumed to be embedded in a tree. This is achieved using a recently introduced tree-structured sparse regularization norm, which has proven useful in several applications. This norm leads to regularized problems that are difficult to optimize, and we propose in this paper efficient algorithms for solving them. More precisely, we show that the proximal operator associated with this norm is computable exactly via a dual approach that can be viewed as the composition of elementary proximal operators. Our procedure has a complexity linear, or close to linear, in the number of atoms, and allows the use of accelerated gradient techniques to solve the tree-structured sparse approximation problem at the same computational cost as traditional ones using the ℓ1-norm. Our method is efficient and scales gracefully to millions of variables, which we illustrate in two types of applications. First, we consider fixed hierarchical dictionaries of wavelets to denoise natural images. Then, we apply our optimization tools in the context of dictionary learning, where learned dictionary elements naturally organize in a prespecified arborescent structure, leading to a better performance in reconstruction of natural image patches. When applied to text documents, our method learns hierarchies of topics, thus providing a competitive alternative to probabilistic topic models.
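
The statement above that the proximal operator is a "composition of elementary proximal operators" can be sketched as follows for a sum of ℓ2-norms over tree-structured groups. This is a simplified illustration under stated assumptions, not the authors' implementation: each group is assumed to hold the indices of one tree node together with all of its descendants, the groups are assumed to be listed from leaves to root, every group shares the same weight `t`, and the names `group_shrink` and `tree_prox` are hypothetical.

```python
import math


def group_shrink(v, group, t):
    """Elementary prox of t * ||v_group||_2: scale the group's entries toward zero."""
    norm = math.sqrt(sum(v[i] ** 2 for i in group))
    scale = max(1.0 - t / norm, 0.0) if norm > 0.0 else 0.0
    out = list(v)
    for i in group:
        out[i] = scale * v[i]
    return out


def tree_prox(v, groups_leaves_to_root, t):
    """Compose the elementary group proxes, visiting leaf groups before their ancestors."""
    out = list(v)
    for group in groups_leaves_to_root:
        out = group_shrink(out, group, t)
    return out
```

For a three-node tree whose root is variable 0 and whose leaves are variables 1 and 2, the call would be `tree_prox(v, [[1], [2], [0, 1, 2]], t)`. The ordering matters: applying the root group first generally gives a different result.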

    Low Complexity Regularization of Linear Inverse Problems

    Inverse problems and regularization theory is a central theme in contemporary signal processing, where the goal is to reconstruct an unknown signal from partial, indirect, and possibly noisy measurements of it. A now standard method for recovering the unknown signal is to solve a convex optimization problem that enforces some prior knowledge about its structure. This has proved efficient in many problems routinely encountered in imaging sciences, statistics and machine learning. This chapter delivers a review of recent advances in the field where the regularization prior promotes solutions conforming to some notion of simplicity/low-complexity. These priors encompass as popular examples sparsity and group sparsity (to capture the compressibility of natural signals and images), total variation and analysis sparsity (to promote piecewise regularity), and low-rank (as natural extension of sparsity to matrix-valued data). Our aim is to provide a unified treatment of all these regularizations under a single umbrella, namely the theory of partial smoothness. This framework is very general and accommodates all low-complexity regularizers just mentioned, as well as many others. Partial smoothness turns out to be the canonical way to encode low-dimensional models that can be linear spaces or more general smooth manifolds. This review is intended to serve as a one-stop shop toward the understanding of the theoretical properties of the so-regularized solutions. It covers a large spectrum including: (i) recovery guarantees and stability to noise, both in terms of ℓ2-stability and model (manifold) identification; (ii) sensitivity analysis to perturbations of the parameters involved (in particular the observations), with applications to unbiased risk estimation; (iii) convergence properties of the forward-backward proximal splitting scheme, which is particularly well suited to solve the corresponding large-scale regularized optimization problem.

    Convex and Network Flow Optimization for Structured Sparsity

    We consider a class of learning problems regularized by a structured sparsity-inducing norm defined as the sum of ℓ2- or ℓ∞-norms over groups of variables. Whereas much effort has been put in developing fast optimization techniques when the groups are disjoint or embedded in a hierarchy, we address here the case of general overlapping groups. To this end, we present two different strategies: On the one hand, we show that the proximal operator associated with a sum of ℓ∞-norms can be computed exactly in polynomial time by solving a quadratic min-cost flow problem, allowing the use of accelerated proximal gradient methods. On the other hand, we use proximal splitting techniques, and address an equivalent formulation with non-overlapping groups, but in higher dimension and with additional constraints. We propose efficient and scalable algorithms exploiting these two strategies, which are significantly faster than alternative approaches. We illustrate these methods with several problems such as CUR matrix factorization, multi-task learning of tree-structured dictionaries, background subtraction in video sequences, image denoising with wavelets, and topographic dictionary learning of natural image patches. Comment: to appear in the Journal of Machine Learning Research (JMLR)

    Adaptive Image Denoising by Targeted Databases

    We propose a data-dependent denoising procedure to restore noisy images. Unlike existing denoising algorithms, which search for patches from either the noisy image or a generic database, the new algorithm finds patches from a database that contains only relevant patches. We formulate the denoising problem as an optimal filter design problem and make two contributions. First, we determine the basis function of the denoising filter by solving a group sparsity minimization problem. The optimization formulation generalizes existing denoising algorithms and offers systematic analysis of the performance. Improvement methods are proposed to enhance the patch search process. Second, we determine the spectral coefficients of the denoising filter by considering a localized Bayesian prior. The localized prior leverages the similarity of the targeted database, alleviates the intensive Bayesian computation, and links the new method to the classical linear minimum mean squared error estimation. We demonstrate applications of the proposed method in a variety of scenarios, including text images, multiview images and face images. Experimental results show the superiority of the new algorithm over existing methods. Comment: 15 pages, 13 figures, 2 tables, journal

    Tensor Regression

    Regression analysis is a key area of interest in the field of data analysis and machine learning which is devoted to exploring the dependencies between variables, often using vectors. The emergence of high dimensional data in technologies such as neuroimaging, computer vision, climatology and social networks has brought challenges to traditional data representation methods. Tensors, as high dimensional extensions of vectors, are considered as natural representations of high dimensional data. In this book, the authors provide a systematic study and analysis of tensor-based regression models and their applications in recent years. It groups and illustrates the existing tensor-based regression methods and covers the basics, core ideas, and theoretical characteristics of most tensor-based regression methods. In addition, readers can learn how to use existing tensor-based regression methods to solve specific regression tasks with multiway data, what datasets can be selected, and what software packages are available to start related work as soon as possible. Tensor Regression is the first thorough overview of the fundamentals, motivations, popular algorithms, strategies for efficient implementation, related applications, available datasets, and software resources for tensor-based regression analysis. It is essential reading for all students, researchers and practitioners working on high dimensional data. Comment: 187 pages, 32 figures, 10 tables

    Imaging and uncertainty quantification in radio astronomy via convex optimization : when precision meets scalability

    Upcoming radio telescopes such as the Square Kilometre Array (SKA) will provide vast amounts of data, allowing large images of the sky to be reconstructed at an unprecedented resolution and sensitivity over thousands of frequency channels. In this regard, wideband radio-interferometric imaging consists in recovering a 3D image of the sky from incomplete and noisy Fourier data, which is a highly ill-posed inverse problem. To regularize the inverse problem, advanced prior image models need to be tailored. Moreover, the underlying algorithms should be highly parallelized to scale with the vast data volumes provided and the Petabyte image cubes to be reconstructed for SKA. The research developed in this thesis leverages convex optimization techniques to achieve precise and scalable imaging for wideband radio interferometry and further assess the degree of confidence in particular 3D structures present in the reconstructed cube. In the context of image reconstruction, we propose a new approach that decomposes the image cube into regular spatio-spectral facets, each associated with a sophisticated hybrid prior image model. The approach is formulated as an optimization problem with a multitude of facet-based regularization terms and block-specific data-fidelity terms. The underpinning algorithmic structure benefits from well-established convergence guarantees and exhibits interesting functionalities such as preconditioning to accelerate the convergence speed. Furthermore, it allows for parallel processing of all data blocks and image facets over a multiplicity of CPU cores, allowing the bottleneck induced by the size of the image and data cubes to be efficiently addressed via parallelization. The precision and scalability potential of the proposed approach are confirmed through the reconstruction of a 15 GB image cube of the Cyg A radio galaxy.
    In addition, we propose a new method that enables analyzing the degree of confidence in particular 3D structures appearing in the reconstructed cube. This analysis is crucial due to the high ill-posedness of the inverse problem. Besides, it can help in making scientific decisions on the structures under scrutiny (e.g., confirming the existence of a second black hole in the Cyg A galaxy). The proposed method is posed as an optimization problem and solved efficiently with a modern convex optimization algorithm with preconditioning and splitting functionalities. The simulation results showcase the potential of the proposed method to scale to big data regimes.