30 research outputs found

    Universality Laws and Performance Analysis of the Generalized Linear Models

    Get PDF
    In the past couple of decades, non-smooth convex optimization has emerged as a powerful tool for the recovery of structured signals (sparse, low rank, etc.) from noisy linear or non-linear measurements in a variety of applications in genomics, signal processing, wireless communications, machine learning, etc.. Taking advantage of the particular structure of the unknown signal of interest is critical since in most of these applications, the dimension p of the signal to be estimated is comparable, or even larger than the number of observations n. With the advent of Compressive Sensing there has been a very large number of theoretical results that study the estimation performance of non-smooth convex optimization in such a high-dimensional setting. A popular approach for estimating an unknown signal β₀ ϵ ℝᵖ in a generalized linear model, with observations y = g(Xβ₀) ϵ ℝⁿ, is via solving the estimator β&#x0302; = arg minβ L(y, Xβ + λf(β). Here, L(•,•) is a loss function which is convex with respect to its second argument, and f(•) is a regularizer that enforces the structure of the unknown β₀. We first analyze the generalization error performance of this estimator, for the case where the entries of X are drawn independently from real standard Gaussian distribution. The precise nature of our analysis permits an accurate performance comparison between different instances of these estimators, and allows to optimally tune the hyperparameters based on the model parameters. We apply our result to some of the most popular cases of generalized linear models, such as M-estimators in linear regression, logistic regression and generalized margin maximizers in binary classification problems, and Poisson regression in count data models. The key ingredient of our proof is the Convex Gaussian Min-max Theorem (CGMT), which is a tight version of the Gaussian comparison inequality proved by Gordon in 1988. Unfortunately, having real iid entries in the features matrix X is crucial in this theorem, and it cannot be naturally extended to other cases. But for some special cases, we prove some universality properties and indirectly extend these results to more general designs of the features matrix X, where the entries are not necessarily real, independent, or identically distributed. This extension, enables us to analyze problems that CGMT was incapable of, such as models with quadratic measurements, phase-lift in phase retrieval, and data recovery in massive MIMO, and help us settle a few long standing open problems in these areas.</p
    corecore