    Scalable Sparse Cox's Regression for Large-Scale Survival Data via Broken Adaptive Ridge

    This paper develops a new scalable sparse Cox regression tool for sparse, high-dimensional, massive sample size (sHDMSS) survival data. The method is a local L0-penalized Cox regression carried out by repeatedly performing reweighted L2-penalized Cox regressions. We show that the resulting estimator enjoys the best of L0- and L2-penalized Cox regression while overcoming their limitations. Specifically, the estimator is selection consistent, oracle for parameter estimation, and possesses a grouping property for highly correlated covariates. Simulation results suggest that when the sample size is large, the proposed method with pre-specified tuning parameters performs comparably to or better than some popular penalized regression methods. More importantly, because the method naturally enables adaptation of efficient algorithms for massive L2-penalized optimization and does not require costly data-driven tuning parameter selection, it has a significant computational advantage for sHDMSS data, offering an average 5-fold speedup over its closest competitor in empirical studies.
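
    As a rough illustration of the reweighting idea, the sketch below applies a broken-adaptive-ridge-style iteration to an ordinary linear model instead of the Cox partial likelihood; the penalty constants, convergence tolerance, and final hard threshold are illustrative choices, not the paper's.

```python
import numpy as np

def broken_adaptive_ridge(X, y, lam=1.0, xi=1.0, n_iter=50, tol=1e-8):
    """Sketch: approximate L0-penalized estimation by repeatedly solving
    reweighted L2 (ridge) problems, with per-coefficient weights equal to the
    inverse squared coefficients from the previous iteration."""
    n, p = X.shape
    # Initial ridge estimate with a fixed penalty xi.
    beta = np.linalg.solve(X.T @ X + xi * np.eye(p), X.T @ y)
    for _ in range(n_iter):
        # Small coefficients get huge weights and are driven toward zero.
        D = np.diag(lam / np.maximum(beta ** 2, 1e-12))
        beta_new = np.linalg.solve(X.T @ X + D, X.T @ y)
        if np.max(np.abs(beta_new - beta)) < tol:
            beta = beta_new
            break
        beta = beta_new
    beta[np.abs(beta) < 1e-6] = 0.0  # report numerically tiny coefficients as zero
    return beta

# Toy usage: 3 informative covariates out of 10.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))
beta_true = np.array([2.0, -1.5, 1.0] + [0.0] * 7)
y = X @ beta_true + 0.5 * rng.standard_normal(200)
print(np.round(broken_adaptive_ridge(X, y), 2))
```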

    Jeffreys-prior penalty, finiteness and shrinkage in binomial-response generalized linear models

    Penalization of the likelihood by Jeffreys' invariant prior, or by a positive power thereof, is shown to produce finite-valued maximum penalized likelihood estimates in a broad class of binomial generalized linear models. The class of models includes logistic regression, where the Jeffreys-prior penalty is known additionally to reduce the asymptotic bias of the maximum likelihood estimator; and also models with other commonly used link functions such as probit and log-log. Shrinkage towards equiprobability across observations, relative to the maximum likelihood estimator, is established theoretically and is studied through illustrative examples. Some implications of finiteness and shrinkage for inference are discussed, particularly when inference is based on Wald-type procedures. A widely applicable procedure is developed for computation of maximum penalized likelihood estimates, by using repeated maximum likelihood fits with iteratively adjusted binomial responses and totals. These theoretical results and methods underpin the increasingly widespread use of reduced-bias and similarly penalized binomial regression models in many applied fields.
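
    A minimal numerical sketch of the adjusted-responses recipe described above, for the logistic (Bernoulli) case: each Fisher-scoring step uses pseudo-responses y + h/2 out of m + h trials, where h are the hat values of the weighted fit. Variable names, the toy separated data set, and the stopping rule are illustrative, and this is not the authors' implementation.

```python
import numpy as np

def jeffreys_penalized_logistic(X, y, m, max_iter=100, tol=1e-8):
    """Sketch: maximum penalized likelihood for binomial logistic regression with
    the Jeffreys-prior penalty, via Fisher scoring on the adjusted score
    X'(y - m*mu + h*(1/2 - mu)), which corresponds to refitting with responses
    y + h/2 and totals m + h (h = hat values)."""
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(max_iter):
        eta = X @ beta
        mu = 1.0 / (1.0 + np.exp(-eta))
        w = m * mu * (1.0 - mu)                      # binomial working weights
        XtWX = X.T @ (w[:, None] * X)                # Fisher information
        # Hat values h_i = w_i * x_i' (X'WX)^{-1} x_i.
        h = np.einsum("ij,ji->i", X, np.linalg.solve(XtWX, (w[:, None] * X).T))
        step = np.linalg.solve(XtWX, X.T @ (y - m * mu + h * (0.5 - mu)))
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Toy data with complete separation: ordinary ML diverges, the penalized fit stays finite.
X = np.column_stack([np.ones(6), np.array([-2.0, -1.0, -0.5, 0.5, 1.0, 2.0])])
y = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])  # successes
m = np.ones(6)                                # binomial totals (Bernoulli case)
print(np.round(jeffreys_penalized_logistic(X, y, m), 3))
```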

    Robust PCA as Bilinear Decomposition with Outlier-Sparsity Regularization

    Principal component analysis (PCA) is widely used for dimensionality reduction, with well-documented merits in various applications involving high-dimensional data, including computer vision, preference measurement, and bioinformatics. In this context, the fresh look advocated here draws on benefits from variable selection and compressive sampling to robustify PCA against outliers. A least-trimmed squares estimator of a low-rank bilinear factor analysis model is shown to be closely related to the estimator obtained from an ℓ0-(pseudo)norm-regularized criterion encouraging sparsity in a matrix that explicitly models the outliers. This connection suggests robust PCA schemes based on convex relaxation, which lead naturally to a family of robust estimators encompassing Huber's optimal M-class as a special case. Outliers are identified by tuning a regularization parameter, which amounts to controlling the sparsity of the outlier matrix along the whole robustification path of (group) least-absolute shrinkage and selection operator (Lasso) solutions. Beyond its neat ties to robust statistics, the developed outlier-aware PCA framework is versatile enough to accommodate novel and scalable algorithms that: i) track the low-rank signal subspace robustly as new data are acquired in real time; and ii) determine principal components robustly in (possibly) infinite-dimensional feature spaces. Synthetic and real data tests corroborate the effectiveness of the proposed robust PCA schemes when used to identify aberrant responses in personality assessment surveys, unveil communities in social networks, and detect intruders in video surveillance data.
    Comment: 30 pages, submitted to IEEE Transactions on Signal Processing.
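
    To make the outlier-sparsity idea concrete, here is a simplified sketch that alternates a truncated SVD for the low-rank part with row-wise (group-lasso) soft-thresholding of an explicit outlier matrix; the rank, penalty level, and toy data are illustrative, and the paper's estimators and online/kernel extensions are not covered.

```python
import numpy as np

def outlier_aware_pca(X, rank, lam, n_iter=100):
    """Sketch: fit X ~ L + O with L low rank and O row-sparse, by alternating a
    truncated SVD for L with row-wise soft-thresholding (the group-lasso prox)
    for the outlier matrix O."""
    n, p = X.shape
    O = np.zeros((n, p))
    for _ in range(n_iter):
        # Low-rank update: best rank-r approximation of the outlier-corrected data.
        U, s, Vt = np.linalg.svd(X - O, full_matrices=False)
        L = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        # Outlier update: shrink each row of the residual toward zero as a group.
        R = X - L
        norms = np.linalg.norm(R, axis=1, keepdims=True)
        O = np.maximum(1.0 - lam / np.maximum(norms, 1e-12), 0.0) * R
    return L, O

# Toy usage: a rank-3 data matrix in which every 20th row is grossly corrupted.
rng = np.random.default_rng(1)
A = rng.standard_normal((100, 3)) @ rng.standard_normal((3, 20))
A[::20] += 10.0 * rng.standard_normal((5, 20))       # inject outlying rows
L, O = outlier_aware_pca(A, rank=3, lam=8.0)
print(np.nonzero(np.linalg.norm(O, axis=1))[0])      # rows flagged as outliers (ideally 0, 20, 40, 60, 80)
```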

    Maximum Entropy Vector Kernels for MIMO system identification

    Recent contributions have framed linear system identification as a nonparametric regularized inverse problem. Relying on ℓ2-type regularization, which accounts for the stability and smoothness of the impulse response to be estimated, these approaches have been shown to be competitive with classical parametric methods. In this paper, adopting Maximum Entropy arguments, we derive a new ℓ2 penalty based on a vector-valued kernel; to do so we exploit the structure of the Hankel matrix, thus controlling at the same time the complexity (measured by the McMillan degree), stability, and smoothness of the identified models. As a special case we recover the nuclear norm penalty on the squared block Hankel matrix. In contrast with previous literature on reweighted nuclear norm penalties, our kernel is described by a small number of hyper-parameters, which are iteratively updated through marginal likelihood maximization; constraining the structure of the kernel acts as a (hyper)regularizer that helps control the effective degrees of freedom of our estimator. To optimize the marginal likelihood, we adapt a Scaled Gradient Projection (SGP) algorithm, which is proved to be significantly cheaper computationally than other first- and second-order off-the-shelf optimization methods. The paper also contains an extensive comparison with many state-of-the-art methods on several Monte Carlo studies, which confirms the effectiveness of our procedure.
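
    For context, the sketch below implements the basic kernel-based (ℓ2-regularized) impulse response estimator that this line of work builds on, using the classical first-order stable-spline/TC kernel for a single-input single-output FIR model; the paper's maximum-entropy vector-valued kernel, Hankel-based complexity control, and SGP optimizer are not reproduced here, and all hyper-parameters are illustrative.

```python
import numpy as np

def identify_fir_kernel(u, y, n_taps=50, alpha=0.9, gamma=1.0, sigma2=0.01):
    """Sketch: kernel-based (regularized l2) FIR identification.  The impulse
    response g is modeled as zero-mean Gaussian with covariance gamma*K, where
    K[i, j] = alpha^max(i, j) (first-order stable-spline / TC kernel) encodes
    exponential decay (stability) and smoothness; the estimate is the posterior mean."""
    N = len(y)
    # Regression matrix of delayed inputs: y[t] ~ sum_k g[k] * u[t - k].
    Phi = np.zeros((N, n_taps))
    for k in range(n_taps):
        Phi[k:, k] = u[: N - k]
    idx = np.arange(1, n_taps + 1)
    K = gamma * alpha ** np.maximum.outer(idx, idx)
    # Regularized least squares / Gaussian posterior mean.
    return K @ Phi.T @ np.linalg.solve(Phi @ K @ Phi.T + sigma2 * np.eye(N), y)

# Toy usage: identify a first-order impulse response from noisy input-output data.
rng = np.random.default_rng(2)
u = rng.standard_normal(300)
g_true = 0.8 ** np.arange(1, 51)
y = np.convolve(u, g_true)[:300] + 0.1 * rng.standard_normal(300)
print(np.round(identify_fir_kernel(u, y)[:5], 2), np.round(g_true[:5], 2))
```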

    Generalized Sparse Covariance-based Estimation

    In this work, we extend the sparse iterative covariance-based estimator (SPICE) by generalizing the formulation to allow for different norm constraints on the signal and noise parameters in the covariance model. For a given norm, the resulting extended SPICE method enjoys the same benefits as the regular SPICE method, including being hyper-parameter free, although the choice of norms is shown to govern the sparsity of the resulting solution. Furthermore, we show that solving the extended SPICE method is equivalent to solving a penalized regression problem, which provides an alternative interpretation of the proposed method and deeper insight into the differences in sparsity between the extended and the original SPICE formulations. We examine the performance of the method for different choices of norms and compare the results to the original SPICE method, showing the benefits of using the extended formulation. We also provide two ways of solving the extended SPICE method: a grid-based method, for which an efficient implementation is given, and a gridless method for the sinusoidal case, which results in a semi-definite programming problem.
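
    As an illustration of the penalized-regression interpretation, the sketch below solves a square-root-LASSO-type problem (the form the original single-snapshot SPICE is known to be equivalent to) by alternating a closed-form noise-scale update with lasso coordinate-descent sweeps; the dictionary, penalty level, and solver are illustrative and do not reproduce the extended formulation.

```python
import numpy as np

def sqrt_lasso(A, y, lam, n_outer=30, n_inner=10):
    """Sketch: solve  min_x ||y - A x||_2 + lam * ||x||_1  by alternating a
    closed-form noise-scale update (sigma = ||y - A x||_2) with coordinate-descent
    sweeps on the equivalent lasso whose threshold is lam * sigma."""
    n, p = A.shape
    x = np.zeros(p)
    col_sq = np.sum(A ** 2, axis=0)
    r = y.copy()                                   # residual y - A x
    for _ in range(n_outer):
        thr = lam * max(np.linalg.norm(r), 1e-10)  # effective lasso threshold
        for _ in range(n_inner):
            for k in range(p):
                r += A[:, k] * x[k]                # remove coordinate k from the fit
                rho = A[:, k] @ r
                x[k] = np.sign(rho) * max(abs(rho) - thr, 0.0) / col_sq[k]
                r -= A[:, k] * x[k]
    return x

# Toy usage: sparse sinusoid-in-noise estimation on a coarse frequency grid.
rng = np.random.default_rng(3)
t = np.arange(64)
grid = np.linspace(0.0, 0.5, 200, endpoint=False)
A = np.cos(2 * np.pi * np.outer(t, grid))          # real-valued dictionary of sinusoids
y = 2.0 * A[:, 40] + 1.0 * A[:, 120] + 0.1 * rng.standard_normal(64)
x_hat = sqrt_lasso(A, y, lam=0.3)
print(np.nonzero(np.abs(x_hat) > 0.2)[0])          # estimates should cluster near indices 40 and 120
```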

    Robust regularized singular value decomposition with application to mortality data

    We develop a robust regularized singular value decomposition (RobRSVD) method for analyzing two-way functional data. The research is motivated by the application of modeling human mortality as a smooth two-way function of age group and year. RobRSVD is formulated as a penalized loss minimization problem in which a robust loss function measures the reconstruction error of a low-rank matrix approximation of the data, and an appropriately defined two-way roughness penalty ensures smoothness along each of the two functional domains. By viewing the minimization problem as two conditional regularized robust regressions, we develop a fast iterative reweighted least squares algorithm to implement the method. Our implementation naturally incorporates missing values. Furthermore, our formulation allows rigorous derivation of leave-one-row/column-out cross-validation and generalized cross-validation criteria, which enable computationally efficient data-driven penalty parameter selection. The advantages of the new robust method over non-robust ones are shown via extensive simulation studies and the mortality rate application.
    Comment: Published at http://dx.doi.org/10.1214/13-AOAS649 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org).
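
    A simplified sketch of the two-conditional-regressions view for a rank-one fit with complete data: Huber-type IRLS weights are recomputed from the residuals, and u and v are updated by weighted ridge regressions with second-difference roughness penalties. Penalty levels are fixed by hand here (the paper derives cross-validation rules), and the loss, grids, and toy data are illustrative.

```python
import numpy as np

def second_diff_penalty(n):
    """Roughness penalty matrix D'D built from second differences."""
    D = np.zeros((n - 2, n))
    for i in range(n - 2):
        D[i, i:i + 3] = [1.0, -2.0, 1.0]
    return D.T @ D

def huber_weights(r, c=1.345):
    """IRLS weights for the Huber loss, with a MAD-based robust residual scale."""
    s = np.median(np.abs(r - np.median(r))) / 0.6745 + 1e-12
    a = np.abs(r) / s
    return np.where(a <= c, 1.0, c / a)

def rob_rsvd_rank1(Y, lam_u=0.5, lam_v=0.5, n_iter=50):
    """Sketch: robust regularized rank-one decomposition Y ~ u v', alternating two
    conditional weighted ridge regressions (for u given v, and v given u) with
    Huber IRLS weights and second-difference roughness penalties."""
    n, p = Y.shape
    Pu, Pv = second_diff_penalty(n), second_diff_penalty(p)
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)   # non-robust initialization
    u, v = U[:, 0] * np.sqrt(s[0]), Vt[0] * np.sqrt(s[0])
    for _ in range(n_iter):
        W = huber_weights((Y - np.outer(u, v)).ravel()).reshape(n, p)
        u = np.linalg.solve(np.diag(W @ v ** 2) + lam_u * Pu, (W * Y) @ v)
        v = np.linalg.solve(np.diag(W.T @ u ** 2) + lam_v * Pv, (W * Y).T @ u)
    return u, v

# Toy usage: a smooth rank-one surface (mortality-like) with a few gross outliers.
rng = np.random.default_rng(4)
age, year = np.linspace(0, 1, 40), np.linspace(0, 1, 60)
clean = np.outer(np.exp(age), np.sin(2 * np.pi * year))
Y = clean + 0.05 * rng.standard_normal((40, 60))
Y[rng.integers(0, 40, 20), rng.integers(0, 60, 20)] += 5.0   # contaminate a few cells
u, v = rob_rsvd_rank1(Y)
print(np.linalg.norm(np.outer(u, v) - clean) / np.linalg.norm(clean))  # relative error
```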

    Successive Concave Sparsity Approximation for Compressed Sensing

    In this paper, based on a successively accuracy-increasing approximation of the ℓ0 norm, we propose a new algorithm for the recovery of sparse vectors from underdetermined measurements. The approximations are realized with a certain class of concave functions that aggressively induce sparsity and whose closeness to the ℓ0 norm can be controlled. We prove that the series of approximations asymptotically coincides with the ℓ1 and ℓ0 norms as the approximation accuracy moves from the worst fit to the best fit. When measurements are noise-free, we propose an optimization scheme that leads to a number of weighted ℓ1 minimization programs, whereas, in the presence of noise, we propose two iterative thresholding methods that are computationally appealing. A convergence guarantee for the iterative thresholding method is provided, and, for a particular function in the class of approximating functions, we derive the closed-form thresholding operator. We further present some theoretical analyses via the restricted isometry, null space, and spherical section properties. Our extensive numerical simulations indicate that the proposed algorithm closely follows the performance of the oracle estimator for a range of sparsity levels wider than those of the state-of-the-art algorithms.
    Comment: Submitted to IEEE Transactions on Signal Processing.
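
    The sketch below illustrates the continuation idea with one common concave surrogate, f_sigma(x) = 1 - exp(-|x|/sigma): for each successively smaller sigma, proximal-gradient steps are taken with the concave penalty majorized by a weighted ℓ1 term, i.e., reweighted soft-thresholding. The surrogate, step sizes, and schedule are illustrative; the paper derives its own thresholding operators and guarantees.

```python
import numpy as np

def soft(z, t):
    """Elementwise soft-thresholding (prox of a weighted l1 term)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def concave_sparse_recovery(A, y, lam=0.1, sigmas=(1.0, 0.5, 0.2, 0.1, 0.05), n_iter=200):
    """Sketch: sparse recovery with a successively tightened concave surrogate of
    the l0 norm, f_sigma(x) = 1 - exp(-|x|/sigma).  For each sigma, proximal-gradient
    steps use the surrogate majorized by a weighted l1 term, giving reweighted
    soft-thresholding; shrinking sigma moves the surrogate from l1-like toward l0-like."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2            # 1 / Lipschitz constant of A'A
    x = A.T @ y                                       # simple initialization
    for sigma in sigmas:                              # continuation over sigma
        for _ in range(n_iter):
            z = x - step * (A.T @ (A @ x - y))        # gradient step on the data fit
            w = np.exp(-np.abs(x) / sigma) / sigma    # derivative of the concave surrogate
            x = soft(z, lam * step * w)               # weighted soft-thresholding
    return x

# Toy usage: recover a 5-sparse vector from 40 random measurements in dimension 100.
rng = np.random.default_rng(5)
A = rng.standard_normal((40, 100)) / np.sqrt(40)
x_true = np.zeros(100)
x_true[rng.choice(100, 5, replace=False)] = rng.standard_normal(5)
y = A @ x_true + 0.01 * rng.standard_normal(40)
print(np.linalg.norm(concave_sparse_recovery(A, y) - x_true))   # reconstruction error
```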