13 research outputs found

    A Universal Analysis of Large-Scale Regularized Least Squares Solutions

    A problem that has been of recent interest in statistical inference, machine learning, and signal processing is that of understanding the asymptotic behavior of regularized least squares solutions under random measurement matrices (or dictionaries). The Least Absolute Shrinkage and Selection Operator (LASSO, or least squares with ℓ_1 regularization) is perhaps one of the most interesting examples. Precise expressions for the asymptotic performance of LASSO have been obtained for a number of different cases, in particular when the elements of the dictionary matrix are sampled independently from a Gaussian distribution. It has also been empirically observed that the resulting expressions remain valid when the entries of the dictionary matrix are independently sampled from certain non-Gaussian distributions. In this paper, we confirm these observations theoretically when the distribution is sub-Gaussian. We further generalize the previous expressions to a broader family of regularization functions and under milder conditions on the underlying random, possibly non-Gaussian, dictionary matrix. In particular, we establish the universality of the asymptotic statistics (e.g., the average quadratic risk) of LASSO with non-Gaussian dictionaries.
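
    A minimal numerical sketch of the setting (not the paper's analysis), assuming numpy and scikit-learn are available: it solves a LASSO problem whose dictionary has i.i.d. Rademacher (sub-Gaussian, non-Gaussian) entries and reports the average quadratic risk; the dimensions and regularization weight below are arbitrary choices.

    import numpy as np
    from sklearn.linear_model import Lasso

    rng = np.random.default_rng(0)
    n, p, k = 200, 400, 20                      # measurements, dimension, sparsity

    # Sub-Gaussian (Rademacher) dictionary, scaled so columns have roughly unit norm.
    A = rng.choice([-1.0, 1.0], size=(n, p)) / np.sqrt(n)

    # k-sparse ground truth and noisy measurements.
    x0 = np.zeros(p)
    x0[rng.choice(p, k, replace=False)] = rng.standard_normal(k)
    y = A @ x0 + 0.05 * rng.standard_normal(n)

    # LASSO: least squares with l1 regularization.
    lasso = Lasso(alpha=0.01, fit_intercept=False, max_iter=10000)
    lasso.fit(A, y)

    risk = np.mean((lasso.coef_ - x0) ** 2)     # average quadratic risk
    print(f"average quadratic risk: {risk:.4f}")

    Repeating the experiment with Gaussian entries in A and comparing the two risks is the kind of empirical observation whose universality the paper establishes.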

    Deep Dictionary Learning: A PARametric NETwork Approach

    Deep dictionary learning seeks multiple dictionaries at different image scales to capture complementary coherent characteristics. We propose a method for learning a hierarchy of synthesis dictionaries with an image classification goal. The dictionaries and classification parameters are trained by a classification objective, and the sparse features are extracted by reducing a reconstruction loss in each layer. The reconstruction objectives in some sense regularize the classification problem and inject source signal information into the extracted features. The performance of the proposed hierarchical method increases as more layers are added, which also makes the model easier to tune and adapt. Furthermore, the proposed algorithm shows a remarkably lower fooling rate in the presence of adversarial perturbations. The proposed approach is validated by its classification performance on four benchmark datasets and is compared to a CNN of similar size.
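
    As a rough illustration of layer-wise sparse feature extraction (a sketch only, assuming numpy: the paper learns the dictionaries and classifier jointly, whereas the dictionaries below are random and fixed):

    import numpy as np

    def ista(D, x, lam=0.1, n_iter=50):
        """Sparse code z minimizing 0.5*||x - D z||^2 + lam*||z||_1 via ISTA."""
        L = np.linalg.norm(D, 2) ** 2           # Lipschitz constant of the smooth part
        z = np.zeros(D.shape[1])
        for _ in range(n_iter):
            z = z - D.T @ (D @ z - x) / L       # gradient step on the reconstruction loss
            z = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)   # soft threshold
        return z

    rng = np.random.default_rng(0)
    dims = [64, 128, 96]                        # input size, then code sizes per layer
    dicts = [rng.standard_normal((dims[i], dims[i + 1])) / np.sqrt(dims[i])
             for i in range(len(dims) - 1)]

    feature = rng.standard_normal(dims[0])      # stand-in for an image patch
    for D in dicts:                             # extract sparse features layer by layer
        feature = ista(D, feature)
    print("feature size:", feature.size, "nonzeros:", np.count_nonzero(feature))

    In the proposed method, the sparse codes from the final layer feed a classifier, and the dictionaries themselves are updated with respect to the classification objective.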

    Asymptotic Analysis of ADMM for Compressed Sensing

    In this paper, we analyze the asymptotic behavior of the alternating direction method of multipliers (ADMM) for compressed sensing, where we reconstruct an unknown structured signal from its underdetermined linear measurements. The analytical tool used in this paper is the recently developed convex Gaussian min-max theorem (CGMT), which can be applied to various convex optimization problems to obtain their asymptotic error performance. In our analysis, we study the convex subproblem in each ADMM update and characterize the asymptotic distribution of the tentative estimate obtained at each iteration. The result shows that the update equations in ADMM can be decoupled into a scalar-valued stochastic process in the large system limit. From this asymptotic result, we can predict the evolution of the error (e.g., the mean-square error (MSE) and symbol error rate (SER)) in ADMM for large-scale compressed sensing problems. Simulation results show that the empirical performance of ADMM and its theoretical prediction are close to each other in sparse vector reconstruction and binary vector reconstruction. (Comment: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.)
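
    For concreteness, a generic ADMM iteration for the LASSO formulation of compressed sensing (a numpy sketch; the parameter values are arbitrary and this is not necessarily the exact splitting analyzed in the paper):

    import numpy as np

    def soft(v, t):
        return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

    def admm_lasso(A, y, lam=0.05, rho=1.0, n_iter=100):
        """ADMM for min_x 0.5*||y - A x||^2 + lam*||x||_1 with the splitting x = z."""
        n = A.shape[1]
        x, z, u = np.zeros(n), np.zeros(n), np.zeros(n)
        Aty = A.T @ y
        chol = np.linalg.cholesky(A.T @ A + rho * np.eye(n))          # factor once, reuse
        for _ in range(n_iter):
            rhs = Aty + rho * (z - u)
            x = np.linalg.solve(chol.T, np.linalg.solve(chol, rhs))   # x-update (least squares)
            z = soft(x + u, lam / rho)                                # z-update (soft threshold)
            u = u + x - z                                             # dual update
        return z

    rng = np.random.default_rng(1)
    m, n = 100, 256
    A = rng.standard_normal((m, n)) / np.sqrt(m)
    x0 = np.zeros(n); x0[rng.choice(n, 10, replace=False)] = 1.0
    y = A @ x0 + 0.01 * rng.standard_normal(m)
    print("MSE:", np.mean((admm_lasso(A, y) - x0) ** 2))

    The asymptotic analysis tracks quantities such as the MSE above across iterations by reducing each update to a scalar-valued stochastic process.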

    Double Descent in Random Feature Models: Precise Asymptotic Analysis for General Convex Regularization

    We prove rigorous results on the double descent phenomenon in the random features (RF) model by employing the powerful Convex Gaussian Min-Max Theorem (CGMT) in a novel multi-level manner. Using this technique, we provide precise asymptotic expressions for the generalization error of RF regression under a broad class of convex regularization terms, including arbitrary separable functions. We further specialize our results to the combination of ℓ_1 and ℓ_2 regularization, known as the elastic net, and present numerical studies for it. We numerically demonstrate the predictive capacity of our framework, and show experimentally that the predicted test error is accurate even in the non-asymptotic regime. (Comment: 22 pages, 6 figures.)
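
    An empirical counterpart of the setting (a sketch assuming numpy and scikit-learn; the feature map, dimensions, and regularization weights are arbitrary choices, not the paper's): random-feature regression with elastic-net regularization.

    import numpy as np
    from sklearn.linear_model import ElasticNet

    rng = np.random.default_rng(0)
    d, N, n_train, n_test = 50, 300, 400, 200   # input dim, random features, sample sizes

    def features(X, W):
        return np.maximum(X @ W, 0.0) / np.sqrt(W.shape[1])   # ReLU random features

    W = rng.standard_normal((d, N))
    beta = rng.standard_normal(d) / np.sqrt(d)                # linear teacher
    X_tr = rng.standard_normal((n_train, d)); y_tr = X_tr @ beta + 0.1 * rng.standard_normal(n_train)
    X_te = rng.standard_normal((n_test, d));  y_te = X_te @ beta + 0.1 * rng.standard_normal(n_test)

    # Elastic net: combined l1 and l2 regularization on the random-feature coefficients.
    model = ElasticNet(alpha=0.01, l1_ratio=0.5, fit_intercept=False, max_iter=10000)
    model.fit(features(X_tr, W), y_tr)
    test_err = np.mean((model.predict(features(X_te, W)) - y_te) ** 2)
    print(f"test error at N={N}: {test_err:.4f}")

    Sweeping N through the interpolation threshold (around n_train) traces the empirical double descent curve that the paper characterizes analytically.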

    Noise Variance Estimation Using Asymptotic Residual in Compressed Sensing

    In compressed sensing, the measurement is usually contaminated by additive noise, and hence information about the noise variance is often required to design algorithms. In this paper, we propose an estimation method for the unknown noise variance in compressed sensing problems. The proposed method, called asymptotic residual matching (ARM), estimates the noise variance from a single measurement vector on the basis of the asymptotic result for the ℓ_1 optimization problem. Specifically, we derive the asymptotic residual corresponding to the ℓ_1 optimization and show that it depends on the noise variance. The proposed ARM approach obtains the estimate by comparing the asymptotic residual with the actual one, which can be obtained from the empirical reconstruction without knowledge of the noise variance. Simulation results show that the proposed noise variance estimation outperforms a conventional method based on the analysis of ridge-regularized least squares. We also show that, by using the proposed method, we can achieve good reconstruction performance in compressed sensing even when the noise variance is unknown. (Comment: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.)
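
    A rough sketch of the matching idea (assuming numpy and scikit-learn): the paper uses a closed-form asymptotic residual derived from the CGMT, which is replaced below by a Monte Carlo surrogate on synthetic data whose signal model and sparsity are assumed known; dimensions and the regularization weight are arbitrary choices.

    import numpy as np
    from sklearn.linear_model import Lasso

    def l1_residual(A, y, lam=0.05):
        """Empirical residual ||y - A x_hat||^2 / m of the l1-regularized fit."""
        x_hat = Lasso(alpha=lam, fit_intercept=False, max_iter=10000).fit(A, y).coef_
        return np.mean((y - A @ x_hat) ** 2)

    def surrogate_residual(sigma2, m, n, k, n_rep=5, seed=0):
        """Monte Carlo stand-in for the asymptotic residual at noise variance sigma2."""
        rng = np.random.default_rng(seed)
        vals = []
        for _ in range(n_rep):
            A = rng.standard_normal((m, n)) / np.sqrt(m)
            x0 = np.zeros(n); x0[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
            vals.append(l1_residual(A, A @ x0 + np.sqrt(sigma2) * rng.standard_normal(m)))
        return np.mean(vals)

    # Measurement whose noise variance we pretend not to know.
    rng = np.random.default_rng(1)
    m, n, k, true_sigma2 = 150, 300, 15, 0.04
    A = rng.standard_normal((m, n)) / np.sqrt(m)
    x0 = np.zeros(n); x0[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
    y = A @ x0 + np.sqrt(true_sigma2) * rng.standard_normal(m)

    target = l1_residual(A, y)
    grid = np.linspace(0.005, 0.2, 40)          # candidate noise variances
    est = grid[np.argmin([abs(surrogate_residual(s, m, n, k) - target) for s in grid])]
    print(f"true sigma^2 = {true_sigma2}, estimated sigma^2 = {est:.3f}")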

    One-Bit Quantization and Sparsification for Multiclass Linear Classification via Regularized Regression

    We study the use of linear regression for multiclass classification in the over-parametrized regime where some of the training data is mislabeled. In such scenarios it is necessary to add an explicit regularization term, λf(w), for some convex function f(·), to avoid overfitting the mislabeled data. In our analysis, we assume that the data is sampled from a Gaussian Mixture Model with equal class sizes, and that a proportion c of the training labels is corrupted for each class. Under these assumptions, we prove that the best classification performance is achieved when f(·) = ‖·‖_2^2 and λ → ∞. We then proceed to analyze the classification errors for f(·) = ‖·‖_1 and f(·) = ‖·‖_∞ in the large-λ regime and notice that it is often possible to find sparse and one-bit solutions, respectively, that perform almost as well as the one corresponding to f(·) = ‖·‖_2^2.
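
    A toy version of the setup (a numpy sketch; the mixture, corruption rate, and λ below are arbitrary, and taking sign(W) is only a crude stand-in for the one-bit solutions associated with f(·) = ‖·‖_∞ in the paper):

    import numpy as np

    rng = np.random.default_rng(0)
    k, d, n = 3, 40, 600                        # classes, dimension, training samples
    means = rng.standard_normal((k, d))         # Gaussian mixture class means

    labels = rng.integers(0, k, size=n)
    X = means[labels] + rng.standard_normal((n, d))
    flip = rng.random(n) < 0.1                  # a proportion c = 0.1 of labels is corrupted
    noisy = np.where(flip, (labels + rng.integers(1, k, size=n)) % k, labels)
    Y = np.eye(k)[noisy]                        # one-hot regression targets

    lam = 10.0                                  # the paper studies the limit lam -> infinity
    W = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)   # min ||XW - Y||_F^2 + lam*||W||_F^2

    for name, M in [("ridge", W), ("one-bit", np.sign(W))]:
        acc = np.mean(np.argmax(X @ M, axis=1) == labels)
        print(f"{name} train accuracy: {acc:.3f}")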

    Universality in Learning from Linear Measurements

    We study the problem of recovering a structured signal from independently and identically drawn linear measurements. A convex penalty function f(·) is considered, which penalizes deviations from the desired structure, and signal recovery is performed by minimizing f(·) subject to the linear measurement constraints. The main question of interest is to determine the minimum number of measurements that is necessary and sufficient for the perfect recovery of the unknown signal with high probability. Our main result states that, under some mild conditions on f(·) and on the distribution from which the linear measurements are drawn, the minimum number of measurements required for perfect recovery depends only on the first and second order statistics of the measurement vectors. As a result, the required number of measurements can be determined by studying measurement vectors that are Gaussian (and have the same mean vector and covariance matrix), for which a rich literature and comprehensive theory exist. As an application, we show that the minimum number of random quadratic measurements (also known as rank-one projections) required to recover a low-rank positive semi-definite matrix is 3nr, where n is the dimension of the matrix and r is its rank. As a consequence, we settle the long-standing open question of determining the minimum number of measurements required for perfect signal recovery in phase retrieval using the celebrated PhaseLift algorithm, and show it to be 3n.
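
    The rank-one projection result can be checked numerically with a small convex program (a sketch assuming cvxpy is installed; the recovery here is PhaseLift-style trace minimization over the PSD cone subject to m = 3nr quadratic measurements, with arbitrary dimensions):

    import numpy as np
    import cvxpy as cp

    rng = np.random.default_rng(0)
    n, r = 8, 2
    U = rng.standard_normal((n, r))
    X0 = U @ U.T                                # rank-r positive semi-definite ground truth

    m = 3 * n * r                               # number of measurements suggested by the result
    a = rng.standard_normal((m, n))
    y = np.array([ai @ X0 @ ai for ai in a])    # rank-one projection (quadratic) measurements

    X = cp.Variable((n, n), PSD=True)
    constraints = [cp.trace(np.outer(ai, ai) @ X) == yi for ai, yi in zip(a, y)]
    cp.Problem(cp.Minimize(cp.trace(X)), constraints).solve()
    print("relative recovery error:", np.linalg.norm(X.value - X0) / np.linalg.norm(X0))

    Setting r = 1 turns the same program into a (real-valued) PhaseLift recovery from 3n intensity measurements, matching the phase retrieval result in the abstract.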