Semismooth Newton Coordinate Descent Algorithm for Elastic-Net Penalized Huber Loss Regression and Quantile Regression
We propose an algorithm, semismooth Newton coordinate descent (SNCD), for the
elastic-net penalized Huber loss regression and quantile regression in
high-dimensional settings. Unlike existing coordinate-descent-type algorithms, the
SNCD updates each regression coefficient and its corresponding subgradient
simultaneously in each iteration. It combines the strengths of coordinate
descent and the semismooth Newton algorithm, and effectively addresses the
computational challenges posed by dimensionality and nonsmoothness. We
establish the convergence properties of the algorithm. In addition, we present
an adaptive version of the "strong rule" for screening predictors to gain extra
efficiency. Through numerical experiments, we demonstrate that the proposed
algorithm is very efficient and scalable to ultra-high dimensions. We
illustrate the application via a real data example.
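For intuition, a minimal Python sketch of a solver for the same elastic-net penalized Huber objective follows. It is not the paper's SNCD but a plain proximal-gradient baseline (all function names and defaults here are illustrative assumptions), relying only on the fact that the Huber loss is smooth with a 1-Lipschitz derivative; SNCD's distinguishing feature, the joint coefficient/subgradient updates, is what additionally handles the nonsmooth quantile loss.

    import numpy as np

    def huber_grad(r, delta):
        # Derivative of the Huber loss with respect to the residuals r.
        return np.where(np.abs(r) <= delta, r, delta * np.sign(r))

    def prox_elastic_net(z, step, lam, alpha):
        # Prox of step * lam * (alpha * ||.||_1 + (1 - alpha) / 2 * ||.||_2^2):
        # soft-thresholding followed by ridge-style shrinkage.
        soft = np.sign(z) * np.maximum(np.abs(z) - step * lam * alpha, 0.0)
        return soft / (1.0 + step * lam * (1.0 - alpha))

    def huber_enet(X, y, lam=0.1, alpha=0.5, delta=1.0, n_iter=500):
        # Proximal gradient on (1/n) * sum Huber(y - X @ beta) + elastic net.
        n, p = X.shape
        beta = np.zeros(p)
        step = n / np.linalg.norm(X, 2) ** 2  # 1/L for the smooth part
        for _ in range(n_iter):
            grad = -X.T @ huber_grad(y - X @ beta, delta) / n
            beta = prox_elastic_net(beta - step * grad, step, lam, alpha)
        return beta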
The Influence Function of Penalized Regression Estimators
To perform regression analysis in high dimensions, lasso and ridge estimation
are common choices. However, it has been shown that these methods are not
robust to outliers. Therefore, alternatives such as penalized M-estimation and the
sparse least trimmed squares (LTS) estimator have been proposed. The robustness
of these regression methods can be measured with the influence function. It
quantifies the effect of infinitesimal perturbations in the data. Furthermore,
it can be used to compute the asymptotic variance and the mean squared error.
In this paper we compute the influence function, the asymptotic variance and
the mean squared error for penalized M-estimators and the sparse LTS estimator.
The asymptotic bias of the estimators makes the calculations nonstandard.
We show that only M-estimators with a loss function with a bounded derivative
are robust against regression outliers. In particular, the lasso has an
unbounded influence function.
Comment: appears in Statistics: A Journal of Theoretical and Applied Statistics, 201
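For reference, the quantities discussed above have standard definitions: for a statistical functional T at a distribution F,

\[
\mathrm{IF}(z; T, F) = \lim_{\varepsilon \downarrow 0}
\frac{T\big((1-\varepsilon)F + \varepsilon\,\delta_z\big) - T(F)}{\varepsilon},
\qquad
\mathrm{ASV}(T, F) = \int \mathrm{IF}(z; T, F)\,\mathrm{IF}(z; T, F)^{\top}\,\mathrm{d}F(z),
\]

where \delta_z denotes the point mass at z. Boundedness of the influence function in z is precisely the robustness property at stake: a loss with bounded derivative keeps it bounded, while for the lasso it grows without bound in the residual.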
Distributed Quantile Regression Analysis and a Group Variable Selection Method
This dissertation develops novel methodologies for distributed quantile regression analysis
for big data by utilizing a distributed optimization algorithm called the alternating direction
method of multipliers (ADMM). Specifically, we first write the penalized quantile regression
into a specific form that can be solved by the ADMM and propose numerical algorithms
for solving the ADMM subproblems. This results in the distributed QR-ADMM
algorithm. Then, to further reduce the computational time, we formulate the penalized
quantile regression into another equivalent ADMM form in which all the subproblems have
exact closed-form solutions and hence avoid iterative numerical methods. This results in the
single-loop QPADM algorithm, which further improves on the computational efficiency of
QR-ADMM. Both QR-ADMM and QPADM enjoy flexible parallelization by enabling data
splitting across both the sample space and the feature space, which makes them especially appealing
for the case when both sample size n and feature dimension p are large.
Besides the QR-ADMM and QPADM algorithms for penalized quantile regression, we
also develop a group variable selection method by approximating the Bayesian information
criterion. Unlike existing penalization methods for feature selection, our proposed gMIC
algorithm is free of parameter tuning and hence enjoys greater computational efficiency.
Although the current version of gMIC focuses on the generalized linear model, it can be
naturally extended to the quantile regression for feature selection.
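As a rough illustration of the idea (an assumed form, not necessarily the exact approximation used by gMIC): one can replace the l0 count in a group-level BIC with a smooth surrogate such as

\[
\min_{\beta}\; -2\,\ell(\beta) \;+\; \log(n) \sum_{g} \tanh\!\big(a\,\|\beta_g\|_2^2\big),
\]

where \ell is the log-likelihood, the sum runs over predefined groups g of coefficients, \tanh(a\,\|\beta_g\|_2^2) smoothly approximates the inclusion indicator I(\beta_g \neq 0), and a is a fixed scale constant rather than a data-driven tuning parameter, consistent with the tuning-free claim above.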
We provide theoretical analysis for our proposed methods. Specifically, we conduct numerical
convergence analysis for the QR-ADMM and QPADM algorithms, and establish
asymptotic theory and the oracle property of feature selection for the gMIC method. All
our methods are evaluated with simulation studies and real data analysis.
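To make the splitting concrete, here is a minimal single-machine Python sketch of ADMM for lasso-penalized quantile regression in which every subproblem is closed form, in the spirit of (but not identical to) the formulations above; the particular splitting, names, and parameters are illustrative assumptions.

    import numpy as np

    def prox_check(v, tau, t):
        # Closed-form prox of t * rho_tau, with rho_tau(u) = u * (tau - 1{u < 0}).
        return np.maximum(v - tau * t, 0.0) + np.minimum(v + (1.0 - tau) * t, 0.0)

    def soft_threshold(v, t):
        return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

    def qr_lasso_admm(X, y, tau=0.5, lam=0.1, sigma=1.0, n_iter=500):
        # ADMM on: min (1/n) * sum rho_tau(z_i) + lam * ||w||_1
        #          s.t. X @ beta + z = y  and  beta = w.
        n, p = X.shape
        beta, w, z = np.zeros(p), np.zeros(p), y.copy()
        u1, u2 = np.zeros(n), np.zeros(p)            # scaled dual variables
        L = np.linalg.cholesky(X.T @ X + np.eye(p))  # factor once, reuse
        for _ in range(n_iter):
            rhs = X.T @ (y - z - u1) + (w - u2)
            beta = np.linalg.solve(L.T, np.linalg.solve(L, rhs))
            z = prox_check(y - X @ beta - u1, tau, 1.0 / (n * sigma))
            w = soft_threshold(beta + u2, lam / sigma)
            u1 += X @ beta + z - y                   # dual ascent steps
            u2 += beta - w
        return w

Because the beta-step is a fixed linear solve and the z- and w-steps are elementwise, updates of this kind parallelize naturally over row (sample) or column (feature) blocks of X, which is the property the distributed algorithms above exploit.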
Smoothing ADMM for Sparse-Penalized Quantile Regression with Non-Convex Penalties
This paper investigates quantile regression in the presence of non-convex and
non-smooth sparse penalties, such as the minimax concave penalty (MCP) and
smoothly clipped absolute deviation (SCAD). The non-smooth and non-convex
nature of these problems often leads to convergence difficulties for many
algorithms. While iterative techniques like coordinate descent and local linear
approximation can facilitate convergence, the process is often slow. This
sluggish pace is primarily due to the need to run these approximation
techniques until full convergence at each step, a requirement we term a
"secondary convergence iteration". To accelerate convergence, we
employ the alternating direction method of multipliers (ADMM) and introduce a
novel single-loop smoothing ADMM algorithm with an increasing penalty
parameter, named SIAD, specifically tailored for sparse-penalized quantile
regression. We first delve into the convergence properties of the proposed SIAD
algorithm and establish the necessary conditions for convergence.
Theoretically, we establish a convergence rate for the sub-gradient bound of the augmented Lagrangian. Subsequently, we provide
numerical results to showcase the effectiveness of the SIAD algorithm. Our
findings highlight that the SIAD method outperforms existing approaches,
providing a faster and more stable solution for sparse-penalized quantile
regression.
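The smoothing ingredient can be illustrated with a Moreau envelope of the check loss. This is one standard smoothing, shown here as an assumption rather than the exact construction behind SIAD.

    import numpy as np

    def smoothed_check(u, tau, mu):
        # Moreau envelope of rho_tau(u) = u * (tau - 1{u < 0}) with parameter mu:
        # env(u) = min_z rho_tau(z) + (z - u)^2 / (2 * mu); z below is the minimizer.
        z = np.maximum(u - tau * mu, 0.0) + np.minimum(u + (1.0 - tau) * mu, 0.0)
        val = z * (tau - (z < 0)) + (z - u) ** 2 / (2.0 * mu)
        grad = (u - z) / mu  # gradient of the envelope; (1/mu)-Lipschitz
        return val, grad

As mu shrinks, the envelope approaches the check loss itself, which is why smoothing schemes of this kind are paired with an increasing penalty parameter to recover a solution of the original nonsmooth problem.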
A General Family of Penalties for Combining Differing Types of Penalties in Generalized Structured Models
Penalized estimation has become an established tool for regularization and model selection in regression models.
A variety of penalties with specific features are available
and effective algorithms for specific penalties have been proposed.
But little is available for fitting models that call for a combination of different penalties.
When modeling rent data, which will be considered as an example, various types of predictors call for a combination of a Ridge, a grouped Lasso and a Lasso-type penalty within one model.
Algorithms that can deal with such problems are in demand.
We propose to approximate penalties that are (semi-)norms of scalar linear transformations of the coefficient vector in generalized structured models.
The penalty is very general such that the Lasso, the fused Lasso, the Ridge, the smoothly clipped absolute deviation penalty (SCAD), the elastic net and many more penalties are embedded.
The approximation makes it possible to combine all these penalties within one model.
The computation is based on conventional penalized iteratively re-weighted least squares (PIRLS) algorithms and is hence easy to implement.
Moreover, new penalties can be incorporated quickly.
The approach also extends to penalties with vector-valued arguments, that is, to penalties defined by norms of linear transformations of the coefficient vector.
Some illustrative examples and the model for the Munich rent data show promising results.
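As a sketch of the approximation idea (a standard local quadratic approximation, written in our own notation rather than the paper's): with penalty functions p_j applied to scalar linear transformations a_j^T beta of the coefficient vector,

\[
J(\beta) = \sum_j \lambda_j\, p_j\!\big(|a_j^{\top}\beta|\big),
\qquad
p_j\!\big(|a_j^{\top}\beta|\big) \approx p_j\!\big(|a_j^{\top}\beta^{(k)}|\big)
+ \frac{p_j'\big(|a_j^{\top}\beta^{(k)}|\big)}{2\,|a_j^{\top}\beta^{(k)}|}
\Big[\big(a_j^{\top}\beta\big)^2 - \big(a_j^{\top}\beta^{(k)}\big)^2\Big],
\]

so each PIRLS step solves a ridge-type problem with quadratic penalty \beta^{\top} A^{\top} W^{(k)} A\, \beta, where the rows of A are the a_j^{\top} and W^{(k)} collects the weights above. Choosing p_j(t) = t with a_j = e_j gives the Lasso, a_j = e_j - e_{j-1} the fused Lasso, and p_j(t) = t^2 the Ridge, which is how a single model can mix penalty types.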