
    Semismooth Newton Coordinate Descent Algorithm for Elastic-Net Penalized Huber Loss Regression and Quantile Regression

    We propose an algorithm, semismooth Newton coordinate descent (SNCD), for elastic-net penalized Huber loss regression and quantile regression in high-dimensional settings. Unlike existing coordinate descent-type algorithms, SNCD updates each regression coefficient and its corresponding subgradient simultaneously in each iteration. It combines the strengths of coordinate descent and the semismooth Newton algorithm, and effectively addresses the computational challenges posed by dimensionality and nonsmoothness. We establish the convergence properties of the algorithm. In addition, we present an adaptive version of the "strong rule" for screening predictors to gain extra efficiency. Through numerical experiments, we demonstrate that the proposed algorithm is very efficient and scalable to ultra-high dimensions. We illustrate the application via a real data example.
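
    The abstract does not spell out the objective, but the problem SNCD targets combines the Huber loss with an elastic-net penalty. A minimal sketch of that objective is given below; the function names, default constants, and the use of NumPy are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def huber_loss(r, delta=1.345):
    """Huber loss applied elementwise to residuals r."""
    abs_r = np.abs(r)
    quad = 0.5 * r**2
    lin = delta * (abs_r - 0.5 * delta)
    return np.where(abs_r <= delta, quad, lin)

def elastic_net_huber_objective(beta, X, y, lam=0.1, alpha=0.5, delta=1.345):
    """(1/n) * sum huber(y - X beta) + lam * (alpha*||beta||_1 + (1-alpha)/2*||beta||_2^2)."""
    r = y - X @ beta
    loss = huber_loss(r, delta).mean()
    penalty = lam * (alpha * np.abs(beta).sum() + 0.5 * (1 - alpha) * beta @ beta)
    return loss + penalty

# tiny example: evaluate the objective at beta = 0
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 10))
beta_true = np.zeros(10); beta_true[:3] = [1.0, -2.0, 0.5]
y = X @ beta_true + rng.standard_normal(50)
print(elastic_net_huber_objective(np.zeros(10), X, y))
```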

    The Influence Function of Penalized Regression Estimators

    To perform regression analysis in high dimensions, lasso or ridge estimation are a common choice. However, it has been shown that these methods are not robust to outliers. Therefore, alternatives such as penalized M-estimation or the sparse least trimmed squares (LTS) estimator have been proposed. The robustness of these regression methods can be measured with the influence function, which quantifies the effect of infinitesimal perturbations in the data. Furthermore, it can be used to compute the asymptotic variance and the mean squared error. In this paper we compute the influence function, the asymptotic variance and the mean squared error for penalized M-estimators and the sparse LTS estimator. The asymptotic biasedness of the estimators makes the calculations nonstandard. We show that only M-estimators with a loss function with a bounded derivative are robust against regression outliers. In particular, the lasso has an unbounded influence function. (Comment: appears in Statistics: A Journal of Theoretical and Applied Statistics, 201)
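
    As a concrete illustration of the unboundedness claim, one can compute an empirical sensitivity curve, a finite-sample analogue of the influence function, for the lasso. The sketch below uses scikit-learn's Lasso and a single response outlier pushed progressively further out; the choice of tooling and tuning values is an assumption for illustration, not the paper's own computation.

```python
import numpy as np
from sklearn.linear_model import Lasso

def sensitivity_curve(X, y, x_out, y_out, alpha=0.1):
    """Finite-sample analogue of the influence function:
       n * (T(data + one added point) - T(data)) for the lasso estimator T."""
    n = X.shape[0]
    base = Lasso(alpha=alpha).fit(X, y).coef_
    Xc = np.vstack([X, x_out])
    yc = np.append(y, y_out)
    contam = Lasso(alpha=alpha).fit(Xc, yc).coef_
    return n * (contam - base)

rng = np.random.default_rng(1)
X = rng.standard_normal((100, 5))
y = X @ np.array([1.0, 0.5, 0.0, 0.0, 0.0]) + 0.1 * rng.standard_normal(100)

# push a single response outlier further and further out: the coefficient
# shifts keep growing, consistent with an unbounded influence function
for y_out in (10.0, 100.0, 1000.0):
    print(y_out, np.round(sensitivity_curve(X, y, np.ones((1, 5)), y_out), 2))
```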

    Distributed Quantile Regression Analysis and a Group Variable Selection Method

    This dissertation develops novel methodologies for distributed quantile regression analysis for big data by utilizing a distributed optimization algorithm called the alternating direction method of multipliers (ADMM). Specifically, we first write the penalized quantile regression in a specific form that can be solved by the ADMM and propose numerical algorithms for solving the ADMM subproblems. This results in the distributed QR-ADMM algorithm. Then, to further reduce the computational time, we formulate the penalized quantile regression into another equivalent ADMM form in which all the subproblems have exact closed-form solutions and hence avoid iterative numerical methods. This results in the single-loop QPADM algorithm, which further improves on the computational efficiency of QR-ADMM. Both QR-ADMM and QPADM enjoy flexible parallelization by enabling data splitting across both the sample space and the feature space, which makes them especially appealing when both the sample size n and the feature dimension p are large. Besides the QR-ADMM and QPADM algorithms for penalized quantile regression, we also develop a group variable selection method by approximating the Bayesian information criterion. Unlike existing penalization methods for feature selection, our proposed gMIC algorithm is free of parameter tuning and hence enjoys greater computational efficiency. Although the current version of gMIC focuses on the generalized linear model, it can be naturally extended to quantile regression for feature selection. We provide theoretical analysis for our proposed methods. Specifically, we conduct numerical convergence analysis for the QR-ADMM and QPADM algorithms, and provide asymptotic theory and an oracle property of feature selection for the gMIC method. All our methods are evaluated with simulation studies and real data analysis.
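
    The key computational idea described above, rewriting penalized quantile regression so that every ADMM subproblem has a closed form, can be illustrated on a single machine for the lasso-penalized case. The sketch below introduces the splitting r = y - Xb and z = b, so the three updates reduce to a linear solve, the proximal operator of the check loss, and soft-thresholding; the splitting, function names, parameter choices and non-distributed setting are assumptions for illustration, not the QR-ADMM/QPADM code.

```python
import numpy as np

def prox_check(v, tau, eta):
    """Elementwise proximal operator of eta * rho_tau (check / pinball loss)."""
    return np.where(v > eta * tau, v - eta * tau,
                    np.where(v < -eta * (1 - tau), v + eta * (1 - tau), 0.0))

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def qr_lasso_admm(X, y, tau=0.5, lam=0.1, rho=1.0, iters=500):
    """ADMM for lasso-penalized quantile regression,
       min_b (1/n) sum rho_tau(y - X b) + lam * ||b||_1,
       with splitting r = y - X b and z = b so every subproblem is closed form."""
    n, p = X.shape
    b = np.zeros(p); z = np.zeros(p); r = y.copy()
    u = np.zeros(n); w = np.zeros(p)
    A = X.T @ X + np.eye(p)          # b-subproblem matrix, constant across iterations
    for _ in range(iters):
        b = np.linalg.solve(A, X.T @ (y - r + u) + (z - w))
        r = prox_check(y - X @ b + u, tau, 1.0 / (n * rho))
        z = soft_threshold(b + w, lam / rho)
        u += y - X @ b - r           # dual updates (scaled form)
        w += b - z
    return z

rng = np.random.default_rng(2)
X = rng.standard_normal((200, 8))
y = X @ np.array([2.0, -1.0, 0, 0, 0, 0, 0, 0]) + rng.standard_normal(200)
print(np.round(qr_lasso_admm(X, y, tau=0.7), 2))
```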

    Smoothing ADMM for Sparse-Penalized Quantile Regression with Non-Convex Penalties

    This paper investigates quantile regression in the presence of non-convex and non-smooth sparse penalties, such as the minimax concave penalty (MCP) and the smoothly clipped absolute deviation (SCAD) penalty. The non-smooth and non-convex nature of these problems often leads to convergence difficulties for many algorithms. While iterative techniques like coordinate descent and local linear approximation can facilitate convergence, the process is often slow. This sluggish pace is primarily due to the need to run these approximation techniques until full convergence at each step, a requirement we term a "secondary convergence iteration". To accelerate the convergence speed, we employ the alternating direction method of multipliers (ADMM) and introduce a novel single-loop smoothing ADMM algorithm with an increasing penalty parameter, named SIAD, specifically tailored for sparse-penalized quantile regression. We first delve into the convergence properties of the proposed SIAD algorithm and establish the necessary conditions for convergence. Theoretically, we confirm a convergence rate of o(k^{-1/4}) for the subgradient bound of the augmented Lagrangian. Subsequently, we provide numerical results to showcase the effectiveness of the SIAD algorithm. Our findings highlight that the SIAD method outperforms existing approaches, providing a faster and more stable solution for sparse-penalized quantile regression.
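
    For reference, the two non-convex penalties named above have simple closed forms. The following sketch evaluates the MCP and SCAD penalty functions elementwise using their standard definitions and customary default concavity parameters; it is a helper for intuition only, not part of the SIAD algorithm.

```python
import numpy as np

def mcp(beta, lam=1.0, gamma=3.0):
    """Minimax concave penalty (MCP), applied elementwise."""
    b = np.abs(beta)
    return np.where(b <= gamma * lam,
                    lam * b - b**2 / (2 * gamma),
                    0.5 * gamma * lam**2)

def scad(beta, lam=1.0, a=3.7):
    """Smoothly clipped absolute deviation (SCAD) penalty, applied elementwise."""
    b = np.abs(beta)
    mid = (2 * a * lam * b - b**2 - lam**2) / (2 * (a - 1))
    return np.where(b <= lam, lam * b,
                    np.where(b <= a * lam, mid, 0.5 * (a + 1) * lam**2))

# both penalties grow like lam*|b| near zero and flatten out for large |b|
grid = np.linspace(-4, 4, 9)
print(np.round(mcp(grid), 3))
print(np.round(scad(grid), 3))
```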

    A General Family of Penalties for Combining Differing Types of Penalties in Generalized Structured Models

    Penalized estimation has become an established tool for regularization and model selection in regression models. A variety of penalties with specific features are available, and effective algorithms for specific penalties have been proposed, but little is available to fit models that call for a combination of different penalties. When modeling rent data, which will be considered as an example, various types of predictors call for a combination of a Ridge, a grouped Lasso and a Lasso-type penalty within one model. Algorithms that can deal with such problems are in demand. We propose to approximate penalties that are (semi-)norms of scalar linear transformations of the coefficient vector in generalized structured models. The penalty is very general, such that the Lasso, the fused Lasso, the Ridge, the smoothly clipped absolute deviation penalty (SCAD), the elastic net and many more penalties are embedded. The approximation allows all these penalties to be combined within one model. The computation is based on conventional penalized iteratively re-weighted least squares (PIRLS) algorithms and is hence easy to implement. Moreover, new penalties can be incorporated quickly. The approach is also extended to penalties with vector-based arguments, that is, to penalties defined by norms of linear transformations of the coefficient vector. Some illustrative examples and the model for the Munich rent data show promising results.
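
    The core of the PIRLS idea sketched above is that an absolute-value penalty on a linear transform of the coefficients can be replaced, at the current iterate, by a quadratic approximation, so that each update is a weighted ridge-type solve and different penalties simply add up in one penalty matrix. Below is a minimal sketch for a Gaussian linear model combining a Lasso-type and a fused-Lasso-type penalty; the function name, the approximation |a'b| ≈ (a'b)^2 / sqrt((a'b_old)^2 + eps) and the tuning values are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def pirls_combined_penalty(X, y, A_list, lam_list, iters=50, eps=1e-6):
    """Iteratively re-weighted least squares for a Gaussian linear model where
    each penalty is lam * sum_k |a_k' beta| for the rows a_k of a matrix A.
    Each absolute value is approximated quadratically at the current beta,
    so every update is a ridge-type solve; several (A, lam) pairs add up
    into a single penalty matrix P."""
    p = X.shape[1]
    beta = np.linalg.lstsq(X, y, rcond=None)[0]          # unpenalized start
    for _ in range(iters):
        P = np.zeros((p, p))
        for A, lam in zip(A_list, lam_list):
            w = 1.0 / np.sqrt((A @ beta) ** 2 + eps)     # reweighting factors
            P += lam * A.T @ np.diag(w) @ A
        beta = np.linalg.solve(X.T @ X + P, X.T @ y)
    return beta

rng = np.random.default_rng(3)
X = rng.standard_normal((100, 6))
beta_true = np.array([1.0, 1.0, 0.0, 0.0, -2.0, 0.0])
y = X @ beta_true + 0.3 * rng.standard_normal(100)

lasso_A = np.eye(6)                       # Lasso-type penalty on each coefficient
fused_A = np.diff(np.eye(6), axis=0)      # fused-Lasso differences of neighbours
print(np.round(pirls_combined_penalty(X, y, [lasso_A, fused_A], [5.0, 5.0]), 2))
```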