370 research outputs found

    Ridge Regularized Estimation of VAR Models for Inference

    Ridge regression is a popular regularization method that has wide applicability, as many regression problems can be cast in this form. However, ridge is only seldom applied in the estimation of vector autoregressive models, even though it naturally arises in Bayesian time series modeling. In this work, ridge regression is studied in the context of process estimation and inference of VARs. The effects of shrinkage are analyzed and asymptotic theory is derived enabling inference. Frequentist and Bayesian ridge approaches are compared. Finally, the estimation of impulse response functions is evaluated with Monte Carlo simulations and ridge regression is compared with a number of similar and competing methods. Comment: Streamlined exposition from previous draft. Moved proofs to Appendi
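
As a rough illustration of what ridge estimation of a VAR involves, the sketch below applies the closed-form ridge solution to the lagged regressors of a VAR(p) without intercept. The function name `ridge_var`, the single scalar penalty `lam`, and the absence of an intercept are assumptions made for this example; they are not taken from the paper.

```python
import numpy as np

def ridge_var(Y, p=1, lam=1.0):
    """Ridge estimate of VAR(p) coefficients (no intercept) for a T x K array Y.

    Illustrative sketch only: one scalar penalty `lam` is applied to all coefficients.
    """
    T, K = Y.shape
    # Row t of X holds [y_{t-1}, ..., y_{t-p}], aligned with the target row y_t in Z
    X = np.hstack([Y[p - j - 1 : T - j - 1] for j in range(p)])
    Z = Y[p:]
    # Closed-form ridge solution: (X'X + lam * I)^{-1} X'Z
    B = np.linalg.solve(X.T @ X + lam * np.eye(K * p), X.T @ Z)
    return B  # shape (K * p, K); B[:K].T is the lag-1 coefficient matrix
```

With `lam=0.0` this reduces to ordinary least squares (provided `X.T @ X` is invertible), and larger values of `lam` shrink the coefficient matrix towards zero.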

    l0 Sparse signal processing and model selection with applications

    Sparse signal processing has far-reaching applications including compressed sensing, media compression/denoising/deblurring, microarray analysis and medical imaging. The main reason for its popularity is that many signals have a sparse representation given that the basis is suitably selected. However, the difficulty lies in developing an efficient method of recovering such a representation. To this end, two efficient sparse signal recovery algorithms are developed in the first part of this thesis. The first method is based on direct minimization of the l0 norm via cyclic descent, which is called the L0LS-CD (l0 penalized least squares via cyclic descent) algorithm. The other method minimizes smooth approximations of sparsity measures including those of the l0 norm via the majorization minimization (MM) technique, which is called the QC (quadratic concave) algorithm. The L0LS-CD algorithm is developed further by extending it to its multivariate (V-L0LS-CD (vector L0LS-CD)) and group (gL0LS-CD (group L0LS-CD)) regression variants. Computational speed-ups to the basic cyclic descent algorithm are discussed and a greedy version of L0LS-CD is developed. Stability of these algorithms is analyzed and the impact of the penalty parameter and proper initialization on the algorithm performance is highlighted. A suitable method for performance comparison of sparse approximating algorithms in the presence of noise is established. Simulations compare L0LS-CD and V-L0LS-CD with a range of alternatives on under-determined as well as over-determined systems. The QC algorithm is applicable to a class of penalties that are neither convex nor concave but have what we call the quadratic concave property. Convergence proofs of this algorithm are presented and it is compared with the Newton algorithm, concave convex (CC) procedure, as well as with the class of proximity algorithms.
Simulations focus on the smooth approximations of the l0 norm and compare them with other l0 denoising algorithms. Next, two applications of sparse modeling are considered. In the first application the L0LS-CD algorithm is extended to recover a sparse transfer function in the presence of coloured noise. The second uses gL0LS-CD to recover the topology of a sparsely connected network of dynamic systems. Both applications use Laguerre basis functions for model expansion. The role of model selection in sparse signal processing is widely neglected in the literature. The tuning/penalty parameter of a sparse approximating problem should be selected using a model selection criterion which minimizes a desired discrepancy measure. Compared to the commonly used model selection methods, SURE (Stein's unbiased risk estimator) stands out as one which does not suffer from the limitations of other methods. Most model selection criteria are developed based on signal or prediction mean squared error. The last section of this thesis instead develops a SURE criterion for parameter mean squared error and applies this result to the l1 penalized least squares problem with grouped variables. Simulations based on topology identification of a sparse network are presented to illustrate the criterion and compare it with alternative model selection criteria.
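
To make the idea of l0 penalized least squares via cyclic descent more concrete, here is a minimal generic sketch. It assumes the objective 0.5 * ||y - X b||^2 + lam * ||b||_0 and updates one coordinate at a time, which amounts to a hard-thresholding rule. The function name, the objective scaling, and the stopping rule are assumptions for this example. It is not the thesis's L0LS-CD implementation and omits the speed-ups, greedy variant, and initialization strategies mentioned in the abstract.

```python
import numpy as np

def l0_cyclic_descent(X, y, lam, n_sweeps=50):
    """Coordinate-wise minimization of 0.5 * ||y - X b||^2 + lam * ||b||_0.

    Illustrative sketch; assumes X has no all-zero columns and starts from b = 0.
    """
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)   # squared column norms
    r = y.astype(float).copy()      # residual y - X b for b = 0
    for _ in range(n_sweeps):
        b_old = b.copy()
        for j in range(p):
            r += X[:, j] * b[j]     # remove coordinate j's contribution
            rho = X[:, j] @ r
            # Keep the least squares value only if it lowers the loss by more than lam
            b[j] = rho / col_sq[j] if 0.5 * rho ** 2 / col_sq[j] > lam else 0.0
            r -= X[:, j] * b[j]     # restore the residual with the updated b[j]
        if np.array_equal(b, b_old):
            break                   # a full sweep changed nothing
    return b
```

Because the objective is non-convex, the result generally depends on the starting point, which is consistent with the abstract's remark about the importance of proper initialization.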

    Scalable Sparse Cox's Regression for Large-Scale Survival Data via Broken Adaptive Ridge

    This paper develops a new scalable sparse Cox regression tool for sparse high-dimensional massive sample size (sHDMSS) survival data. The method is a local L0-penalized Cox regression via repeatedly performing reweighted L2-penalized Cox regression. We show that the resulting estimator enjoys the best of L0- and L2-penalized Cox regressions while overcoming their limitations. Specifically, the estimator is selection consistent, oracle for parameter estimation, and possesses a grouping property for highly correlated covariates. Simulation results suggest that when the sample size is large, the proposed method with pre-specified tuning parameters has a comparable or better performance than some popular penalized regression methods. More importantly, because the method naturally enables adaptation of efficient algorithms for massive L2-penalized optimization and does not require costly data driven tuning parameter selection, it has a significant computational advantage for sHDMSS data, offering an average of 5-fold speedup over its closest competitor in empirical studies.
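
The "repeatedly reweighted L2 penalty" idea can be illustrated in a simpler setting. The sketch below swaps the Cox partial likelihood for ordinary least squares, so it is not the paper's estimator. Starting from a ridge fit, each step solves a ridge problem whose penalty on coefficient j is weighted by 1 / b_j^2 from the previous iterate. The function name and the parameters `lam` (reweighted penalty) and `xi` (initial ridge penalty) are assumptions for this example.

```python
import numpy as np

def broken_adaptive_ridge(X, y, lam=1.0, xi=1.0, n_iter=100, tol=1e-8):
    """Iteratively reweighted ridge for least squares (illustrative sketch only)."""
    p = X.shape[1]
    G, c = X.T @ X, X.T @ y
    b = np.linalg.solve(G + xi * np.eye(p), c)  # initial ridge estimate
    for _ in range(n_iter):
        D = np.diag(b)
        # Equivalent to (G + lam * diag(1 / b^2))^{-1} c, rewritten so that
        # coefficients that have reached zero cause no division by zero
        b_new = D @ np.linalg.solve(D @ G @ D + lam * np.eye(p), D @ c)
        converged = np.max(np.abs(b_new - b)) < tol
        b = b_new
        if converged:
            break
    return b
```

In this toy version, small coefficients are driven to zero across iterations while large ones settle at a nonzero value, which is the L0-like behaviour the abstract attributes to the reweighting.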

    Time-Varying Parameters as Ridge Regressions

    Time-varying parameter (TVP) models are frequently used in economics to model structural change. I show that they are in fact ridge regressions. Instantly, this makes computations, tuning, and implementation much easier than in the state-space paradigm. Among other things, solving the equivalent dual ridge problem is computationally very fast even in high dimensions, and the crucial "amount of time variation" is tuned by cross-validation. Evolving volatility is dealt with using a two-step ridge regression. I consider extensions that incorporate sparsity (the algorithm selects which parameters vary and which do not) and reduced-rank restrictions (variation is tied to a factor model). To demonstrate the usefulness of the approach, I use it to study the evolution of monetary policy in Canada. The application requires the estimation of about 4600 TVPs, a task well within the reach of the new method.
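
One way to see why a random-walk coefficient can be estimated by ridge regression is to write the coefficient as a cumulative sum of increments and penalize those increments. The sketch below does this for a single regressor. It is a simplified primal version, assuming a scalar `x_t`, a single penalty `lam`, and an unpenalized starting level. The function name and these choices are assumptions for this example, and the sketch covers neither the dual solution nor the two-step volatility adjustment mentioned in the abstract.

```python
import numpy as np

def tvp_ridge(x, y, lam):
    """Coefficient path for y_t = x_t * beta_t + e_t with random-walk beta_t.

    Illustrative sketch: beta_t = u_1 + ... + u_t, with a ridge penalty on u_2..u_T.
    Assumes x is not identically zero (and has no zero entries if lam == 0).
    """
    T = len(y)
    # y = Z u, where Z[t, s] = x_t for s <= t and 0 otherwise
    Z = x[:, None] * np.tril(np.ones((T, T)))
    # Penalize the increments but leave the initial level u_1 free
    P = np.eye(T)
    P[0, 0] = 0.0
    u = np.linalg.solve(Z.T @ Z + lam * P, Z.T @ y)
    return np.cumsum(u)  # beta_1, ..., beta_T
```

A very large `lam` forces the increments towards zero and recovers a constant coefficient, while a small `lam` lets the path follow the data closely. Choosing `lam` (for example by cross-validation, as the abstract describes) therefore sets the amount of time variation.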