13 research outputs found

    Deep Learning Meets Sparse Regularization: A Signal Processing Perspective

    Deep learning has been wildly successful in practice, and most state-of-the-art machine learning methods are based on neural networks. Lacking, however, is a rigorous mathematical theory that adequately explains the amazing performance of deep neural networks. In this article, we present a relatively new mathematical framework that provides the beginning of a deeper understanding of deep learning. This framework precisely characterizes the functional properties of neural networks that are trained to fit data. The key mathematical tools that support this framework include transform-domain sparse regularization, the Radon transform of computed tomography, and approximation theory, all techniques deeply rooted in signal processing. This framework explains the effect of weight decay regularization in neural network training, the use of skip connections and low-rank weight matrices in network architectures, the role of sparsity in neural networks, and why neural networks can perform well in high-dimensional problems.
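    As a concrete illustration of the training objective the article analyzes, the sketch below fits a small two-layer ReLU network by full-batch gradient descent with weight decay, that is, squared loss plus an ℓ2 penalty on the weights. The network width, learning rate, penalty strength, and toy data are illustrative assumptions, not choices taken from the article.

```python
import numpy as np

# Minimal sketch (not taken from the article): full-batch gradient descent on a
# two-layer ReLU network with weight decay, i.e. the objective
#   0.5 * mean squared error + 0.5 * lam * (||W1||_F^2 + ||w2||^2),
# whose trained solutions the framework above characterizes at the function level.
# Width, learning rate, lam, and the toy data are illustrative assumptions.

rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 50)[:, None]        # 1-D inputs
y = np.sin(np.pi * x)                          # targets to fit

width, lam, lr = 64, 1e-3, 1e-2
W1 = rng.normal(size=(1, width))               # input-to-hidden weights
b1 = np.zeros(width)
w2 = rng.normal(size=(width, 1)) / np.sqrt(width)

for step in range(5000):
    h = np.maximum(x @ W1 + b1, 0.0)           # ReLU hidden layer
    pred = h @ w2
    err = pred - y
    # Gradients of 0.5 * mean squared error
    g_w2 = h.T @ err / len(x)
    g_h = (err @ w2.T) * (h > 0)
    g_W1 = x.T @ g_h / len(x)
    g_b1 = g_h.mean(axis=0)
    # Weight decay contributes lam * weight to each weight gradient (biases are not decayed)
    W1 -= lr * (g_W1 + lam * W1)
    b1 -= lr * g_b1
    w2 -= lr * (g_w2 + lam * w2)

print("final training MSE:", float((err ** 2).mean()))
```

    At the level of the learned function, the article's framework interprets this kind of weight penalty as a transform-domain sparsity regularizer, which is what links weight decay to the sparsity phenomena mentioned in the abstract.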

    On the Uniqueness of Inverse Problems with Fourier-domain Measurements and Generalized TV Regularization

    We study the super-resolution problem of recovering a periodic continuous-domain function from its low-frequency information. This means that we only have access to possibly corrupted versions of its Fourier samples up to a maximum cut-off frequency. The reconstruction task is specified as an optimization problem with generalized total-variation regularization involving a pseudo-differential operator. Our special emphasis is on the uniqueness of solutions. We show that, for elliptic regularization operators (e.g., derivatives of any order), uniqueness is always guaranteed. To achieve this goal, we provide a new analysis of constrained optimization problems over Radon measures. We demonstrate that either all solutions are Radon measures of constant sign, or the solution is unique. In doing so, we identify a general sufficient condition, expressed in terms of the Fourier samples, for the uniqueness of the solution of a constrained optimization problem with TV regularization.
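    For intuition, here is a coarse, fully discretized stand-in for the problem (not the paper's continuous-domain formulation): a sparse spike train on a grid is recovered from its Fourier samples up to a cut-off frequency by ℓ1-regularized least squares, the grid analogue of TV regularization over Radon measures with the identity operator, solved with a plain proximal-gradient (ISTA) loop. The grid size, cut-off, spike positions, regularization weight, and iteration count are all assumptions chosen for illustration.

```python
import numpy as np

# Coarse, fully discretized stand-in for the paper's continuous-domain problem:
# recover a sparse spike train on a grid of n points from its Fourier samples up
# to cut-off frequency K, via l1-regularized least squares (the grid analogue of
# TV regularization over Radon measures, with the identity operator), solved by a
# plain proximal-gradient (ISTA) loop. Grid size, cut-off, spike locations, lam,
# and the iteration count are all assumptions made for illustration.

n, K, lam = 256, 20, 1.0

x_true = np.zeros(n)
x_true[[40, 100, 180]] = [1.0, -0.7, 0.5]      # well-separated spikes of both signs

freqs = np.arange(-K, K + 1)
t = np.arange(n) / n
A = np.exp(-2j * np.pi * np.outer(freqs, t))   # low-pass Fourier measurement matrix, (2K+1) x n
y = A @ x_true                                  # noiseless Fourier samples with |f| <= K

tau = 1.0 / np.linalg.norm(A, 2) ** 2           # step size from the operator norm
x = np.zeros(n)
for _ in range(20000):
    grad = (A.conj().T @ (A @ x - y)).real      # gradient of 0.5 * ||A x - y||^2 for real x
    z = x - tau * grad
    x = np.sign(z) * np.maximum(np.abs(z) - tau * lam, 0.0)  # soft-thresholding prox

print("recovered amplitudes at the true spike locations:", np.round(x[[40, 100, 180]], 2))
```

    The uniqueness question studied in the paper does not appear in this discretization; it concerns the continuous-domain problem over Radon measures, where the abstract's result is that elliptic regularization operators always guarantee a unique solution.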

    On the Prediction Performance of the Lasso

    Although the Lasso has been extensively studied, the relationship between its prediction performance and the correlations of the covariates is not fully understood. In this paper, we give new insights into this relationship in the context of multiple linear regression. We show, in particular, that incorporating a simple correlation measure into the tuning parameter can lead to nearly optimal prediction performance of the Lasso even for highly correlated covariates. However, we also reveal that for moderately correlated covariates, the prediction performance of the Lasso can be mediocre irrespective of the choice of the tuning parameter. Finally, we show that our results also lead to near-optimal rates for the least-squares estimator with a total variation penalty.
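    The practical idea in the abstract is that the tuning parameter can be adjusted using a simple correlation measure of the design. The exact measure and its scaling are defined in the paper; the snippet below only illustrates the idea with assumed stand-ins: the maximum absolute off-diagonal empirical correlation, used to inflate a universal-type level sigma * sqrt(2 log(p) / n).

```python
import numpy as np
from sklearn.linear_model import Lasso

# Illustration only: the exact correlation measure and its scaling are defined in
# the paper. Here the maximum absolute off-diagonal empirical correlation is used
# as a stand-in, inflating a universal-type level sigma * sqrt(2 log(p) / n);
# sigma is treated as known because the data are simulated.

rng = np.random.default_rng(2)
n, p, sigma, rho = 200, 50, 1.0, 0.8

# Equicorrelated Gaussian design: unit variances, pairwise correlation rho
cov = rho * np.ones((p, p)) + (1.0 - rho) * np.eye(p)
X = rng.multivariate_normal(np.zeros(p), cov, size=n)
beta = np.zeros(p)
beta[:5] = 1.0
y = X @ beta + sigma * rng.normal(size=n)

corr = np.corrcoef(X, rowvar=False)
max_corr = np.max(np.abs(corr - np.eye(p)))     # simple empirical correlation measure (assumed choice)

base = sigma * np.sqrt(2.0 * np.log(p) / n)     # universal-type tuning level
alpha = base * (1.0 + max_corr)                 # correlation-adjusted tuning parameter (assumed form)

fit = Lasso(alpha=alpha, max_iter=50000).fit(X, y)
pred_mse = np.mean((X @ (fit.coef_ - beta)) ** 2)
print(f"max |corr| = {max_corr:.2f}, alpha = {alpha:.3f}, prediction MSE = {pred_mse:.3f}")
```

    The abstract's contrast is then between regimes: highly correlated designs like this one, where a correlation-aware tuning parameter can restore near-optimal prediction, and moderately correlated designs, where no choice of the tuning parameter helps much.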

    Structured sparsity with convex penalty functions

    We study the problem of learning a sparse linear regression vector under additional conditions on the structure of its sparsity pattern. This problem is relevant in machine learning, statistics, and signal processing. It is well known that a linear regression can benefit from knowledge that the underlying regression vector is sparse. The combinatorial problem of selecting the nonzero components of this vector can be “relaxed” by regularising the squared error with a convex penalty function like the ℓ1 norm. However, in many applications, additional conditions on the structure of the regression vector and its sparsity pattern are available. Incorporating this information into the learning method may lead to a significant decrease of the estimation error. In this thesis, we present a family of convex penalty functions which encode prior knowledge on the structure of the vector formed by the absolute values of the regression coefficients. This family subsumes the ℓ1 norm and is flexible enough to include different models of sparsity patterns that are of practical and theoretical importance. We establish several properties of these penalty functions and discuss some examples where they can be computed explicitly. Moreover, for solving the regularised least squares problem with these penalty functions, we present a convergent optimisation algorithm and a proximal method; both are useful numerical techniques tailored to different kinds of penalties. Extensive numerical simulations highlight the benefit of structured sparsity and the advantage offered by our approach over the Lasso method and other related methods, such as those using other convex penalties or greedy methods.
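    To make the proximal approach concrete, the sketch below runs proximal-gradient (ISTA-style) steps on a regularised least squares problem with a familiar structured convex penalty, the group lasso penalty, whose proximity operator is block soft-thresholding. The group structure, data, and regularization weight are illustrative assumptions, and the thesis's penalty family is more general; swapping in a different proximity operator gives the corresponding algorithm for another convex penalty.

```python
import numpy as np

# Sketch of a proximal-gradient (ISTA-style) loop for structured sparsity. As a
# familiar stand-in for a structured convex penalty, it uses the group lasso
# penalty lam * sum_g ||beta_g||_2, whose proximity operator is block
# soft-thresholding; only the prox step below would change for another penalty.
# Groups, data, and lam are illustrative assumptions.

rng = np.random.default_rng(3)
n, p, group_size, lam = 100, 40, 5, 5.0
groups = [np.arange(i, i + group_size) for i in range(0, p, group_size)]

X = rng.normal(size=(n, p))
beta_true = np.zeros(p)
beta_true[groups[0]] = 1.0                      # only the first group is truly active
y = X @ beta_true + 0.1 * rng.normal(size=n)

step = 1.0 / np.linalg.norm(X, 2) ** 2          # step size from the operator norm
beta = np.zeros(p)
for _ in range(2000):
    z = beta - step * X.T @ (X @ beta - y)      # gradient step on 0.5 * ||X beta - y||^2
    for g in groups:                            # prox: block soft-thresholding, group by group
        norm_g = np.linalg.norm(z[g])
        beta[g] = max(0.0, 1.0 - step * lam / norm_g) * z[g] if norm_g > 0 else 0.0
    # Swapping this prox for the proximity operator of another convex penalty
    # yields the corresponding proximal-gradient algorithm for that penalty.

selected = [i for i, g in enumerate(groups) if np.linalg.norm(beta[g]) > 1e-8]
print("groups with nonzero coefficients:", selected)
```

    Prior knowledge about which coefficients tend to be active together is exactly the kind of structural information the abstract argues should be encoded in the penalty.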