22 research outputs found

    Sparse Matrix Inversion with Scaled Lasso

    We propose a new method of learning a sparse nonnegative-definite target matrix. Our primary example of the target matrix is the inverse of a population covariance or correlation matrix. The algorithm first estimates each column of the target matrix by the scaled Lasso and then adjusts the matrix estimator to be symmetric. The penalty level of the scaled Lasso for each column is completely determined by the data via convex minimization, without using cross-validation. We prove that this scaled Lasso method guarantees the fastest proven rate of convergence in the spectrum norm under conditions of weaker form than those in the existing analyses of other ℓ1-regularized algorithms, and has a faster guaranteed rate of convergence when the ratio of the ℓ1 and spectrum norms of the target inverse matrix diverges to infinity. A simulation study demonstrates the computational feasibility and superb performance of the proposed method. Our analysis also provides new performance bounds for the Lasso and scaled Lasso that guarantee higher concentration of the error at a smaller threshold level than previous analyses, and that allow the use of the union bound in column-by-column applications of the scaled Lasso without an adjustment of the penalty level. In addition, the least squares estimation after the scaled Lasso selection is considered and proven to guarantee performance bounds similar to those of the scaled Lasso.
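    The column-by-column scheme described above can be sketched in a few lines. The following is our own illustration, not the authors' code: each variable is regressed on the others, the residual variance and coefficients of each regression are converted into one column of the precision-matrix estimate, and the result is symmetrized by keeping the smaller-magnitude entry of each off-diagonal pair (one common symmetrization choice). For simplicity the sketch uses an ordinary lasso with a fixed, hypothetical penalty `alpha` where the method above uses the scaled Lasso, whose penalty is chosen from the data.

```python
import numpy as np
from sklearn.linear_model import Lasso

def precision_columns(X, alpha=0.1):
    """Column-by-column precision-matrix estimate plus a symmetrization
    step, in the spirit of the method above.  NOTE: uses a plain lasso
    with fixed penalty `alpha` in place of the scaled Lasso."""
    n, p = X.shape
    Omega = np.zeros((p, p))
    for j in range(p):
        idx = [k for k in range(p) if k != j]
        model = Lasso(alpha=alpha, fit_intercept=False).fit(X[:, idx], X[:, j])
        resid = X[:, j] - X[:, idx] @ model.coef_
        tau2 = np.mean(resid ** 2)          # residual variance of column j
        Omega[j, j] = 1.0 / tau2            # diagonal entry
        Omega[idx, j] = -model.coef_ / tau2 # off-diagonal entries of column j
    # symmetrize: for each (i, j) keep the entry of smaller magnitude
    return np.where(np.abs(Omega) <= np.abs(Omega.T), Omega, Omega.T)
```

    The conversion uses the standard identity that, for Gaussian data, regressing variable j on the rest gives coefficients -Ω(−j,j)/Ω(j,j) and residual variance 1/Ω(j,j).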

    Scaled Sparse Linear Regression

    Scaled sparse linear regression jointly estimates the regression coefficients and noise level in a linear model. It chooses an equilibrium with a sparse regression method by iteratively estimating the noise level via the mean residual square and scaling the penalty in proportion to the estimated noise level. The iterative algorithm costs little beyond the computation of a path or grid of the sparse regression estimator for penalty levels above a proper threshold. For the scaled lasso, the algorithm is a gradient descent in a convex minimization of a penalized joint loss function for the regression coefficients and noise level. Under mild regularity conditions, we prove that the scaled lasso simultaneously yields an estimator for the noise level and an estimated coefficient vector satisfying certain oracle inequalities for prediction, the estimation of the noise level and the regression coefficients. These inequalities provide sufficient conditions for the consistency and asymptotic normality of the noise level estimator, including certain cases where the number of variables is of greater order than the sample size. Parallel results are provided for the least squares estimation after model selection by the scaled lasso. Numerical results demonstrate the superior performance of the proposed methods over an earlier proposal of joint convex minimization.
    Comment: 20 pages
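    The iteration described above, alternating between a noise-level estimate from the mean residual square and a lasso whose penalty is scaled by that estimate, can be sketched as follows. This is an illustrative sketch under our own assumptions (names like `lam0` are hypothetical, and `sklearn`'s lasso is used as the inner sparse-regression solver), not the paper's implementation.

```python
import numpy as np
from sklearn.linear_model import Lasso

def scaled_lasso(X, y, lam0=0.5, n_iter=20, tol=1e-6):
    """Jointly estimate coefficients and noise level: alternate between
    sigma <- sqrt(mean residual square) and a lasso with penalty
    proportional to the current sigma."""
    n, p = X.shape
    sigma = np.std(y)                      # initial noise-level guess
    beta = np.zeros(p)
    for _ in range(n_iter):
        # penalty scaled in proportion to the estimated noise level
        model = Lasso(alpha=lam0 * sigma, fit_intercept=False)
        model.fit(X, y)
        beta = model.coef_
        new_sigma = np.sqrt(np.mean((y - X @ beta) ** 2))
        if abs(new_sigma - sigma) < tol:   # equilibrium reached
            sigma = new_sigma
            break
        sigma = new_sigma
    return beta, sigma
```

    In practice the inner solve reuses a precomputed path of lasso solutions over a grid of penalties, which is why the iteration costs little beyond computing that path.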

    Calibrated Elastic Regularization in Matrix Completion

    This paper concerns the problem of matrix completion, which is to estimate a matrix from observations in a small subset of indices. We propose a calibrated spectrum elastic net method with a sum of the nuclear and Frobenius penalties and develop an iterative algorithm to solve the convex minimization problem. The iterative algorithm alternates between imputing the missing entries in the incomplete matrix by the current guess and estimating the matrix by a scaled soft-thresholding singular value decomposition of the imputed matrix, until the resulting matrix converges. A calibration step follows to correct the bias caused by the Frobenius penalty. Under proper coherence conditions and for suitable penalty levels, we prove that the proposed estimator achieves an error bound of nearly optimal order and in proportion to the noise level. This provides a unified analysis of the noisy and noiseless matrix completion problems. Simulation results are presented to compare our proposal with previous ones.
    Comment: 9 pages; Advances in Neural Information Processing Systems, NIPS 201
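    The impute-then-shrink loop described above can be sketched as follows. This is a minimal illustration under our own assumptions (the parameter names `lam_nuc` and `lam_fro` are hypothetical, and the final calibration step that corrects the Frobenius-penalty bias is omitted): the nuclear penalty soft-thresholds the singular values, and the Frobenius penalty appears as a uniform rescaling.

```python
import numpy as np

def complete_matrix(Y, mask, lam_nuc=1.0, lam_fro=0.1, n_iter=100):
    """Alternate between imputing missing entries with the current guess
    and a scaled soft-thresholded SVD of the imputed matrix.  `mask` is
    True where Y is observed.  Calibration step omitted in this sketch."""
    M = np.where(mask, Y, 0.0)             # initial guess: zeros off-support
    for _ in range(n_iter):
        Z = np.where(mask, Y, M)           # impute missing entries
        U, s, Vt = np.linalg.svd(Z, full_matrices=False)
        # soft-threshold singular values (nuclear penalty), then rescale
        # (Frobenius penalty)
        s = np.maximum(s - lam_nuc, 0.0) / (1.0 + lam_fro)
        M_new = (U * s) @ Vt
        if np.linalg.norm(M_new - M) < 1e-6 * (1.0 + np.linalg.norm(M)):
            M = M_new
            break
        M = M_new
    return M
```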

    A statistical analysis of vaccine-adverse event data

    Vaccination has been one of the most successful public health interventions to date, and the U.S. FDA/CDC Vaccine Adverse Event Reporting System (VAERS) currently contains more than 500,000 reports of post-vaccination adverse events occurring after the administration of vaccines licensed in the United States. The VAERS dataset is huge, contains nominal variables of very high dimension, and is complex due to the multiple listing of vaccines and adverse symptoms in a single report. So far no statistical analysis has been conducted that attempts to identify across-the-board patterns in how all reported adverse symptoms are related to the vaccines.
    https://doi.org/10.1186/s12911-019-0818-

    Statistical methods for high-dimensional data and continuous glucose monitoring

    This thesis contains two parts. The first part concerns three connected problems with high-dimensional data in Chapters 2-4. The second part, Chapter 5, provides dynamic Bayes models to improve continuous glucose monitoring. In the first part, we propose a unified scale-invariant method for the estimation of parameters in linear regression, the precision matrix and partial correlation. In Chapter 2, the scaled Lasso is introduced to jointly estimate regression coefficients and noise level with a gradient descent algorithm. Under mild regularity conditions, we derive oracle inequalities for prediction and for the estimation of the noise level and regression coefficients. These oracle inequalities provide sufficient conditions for the consistency and asymptotic normality of the noise level estimator, including certain cases where the number of variables is of greater order than the sample size. Chapter 3 considers the estimation of the precision matrix, which is closely related to linear regression. The proposed estimator is constructed via the scaled Lasso, and guarantees the fastest convergence rate under the spectrum norm. Besides the estimation of high-dimensional objects, the estimation of low-dimensional functionals of high-dimensional objects is also of great interest: a rate-minimax estimator of a high-dimensional parameter does not automatically yield rate-minimax estimates of its low-dimensional functionals. We consider efficient estimation of the partial correlation between individual pairs of variables in Chapter 4. Numerical results demonstrate the superior performance of the proposed methods. In the second part, we develop statistical methods to produce more accurate and precise estimates for continuous glucose monitoring. The continuous glucose monitor measures the glucose level via an electrochemical glucose biosensor inserted into subcutaneous fat tissue, where it reads the so-called interstitial space.
We use dynamic Bayes models to incorporate the linear relationship between the blood glucose level and the interstitial signal, the time-series aspects of the data, and the variability depending on sensor age. The Bayes method has been tested and evaluated on a large and important dataset, called ``Star I'', from Medtronic, Inc., composed of continuous monitoring of glucose and other measurements. The results show that the Bayesian blood glucose prediction outperforms the output of the continuous glucose monitor in the Star I trial.
    Ph.D. Includes bibliographical references. Includes vita. By Tingni Su
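    The three ingredients named above, a linear glucose-to-signal relationship, time-series dynamics, and sensor-age-dependent variability, fit naturally into a Kalman-filter-style dynamic Bayes model. The following is a minimal sketch under our own, entirely hypothetical parameterization (the thesis's actual model is richer): blood glucose follows a random walk with variance `q`, the sensor reads `z = a*g + b` plus noise, and the observation-noise variance `r0 + r_age*t` grows with sensor age.

```python
import numpy as np

def kalman_glucose(z, a=1.0, b=0.0, q=1.0, r0=4.0, r_age=0.01):
    """One-dimensional Kalman filter for blood glucose g_t given sensor
    readings z_t = a*g_t + b + noise, with noise variance growing with
    sensor age.  All parameters are illustrative assumptions."""
    g, P = (z[0] - b) / a, 10.0            # initial state and its variance
    out = []
    for t, zt in enumerate(z):
        P = P + q                          # predict (random-walk dynamics)
        r = r0 + r_age * t                 # sensor-age-dependent noise
        K = P * a / (a * a * P + r)        # Kalman gain
        g = g + K * (zt - (a * g + b))     # update with the sensor reading
        P = (1.0 - K * a) * P
        out.append(g)
    return np.array(out)
```

    Because the filter pools each new reading with the history of the series, its estimates are smoother and, when the model is roughly right, closer to the true glucose level than the raw sensor output.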

    A graphical approach to the analysis of matrix completion


    Comments on: ℓ1-penalization for mixture regression models
