74 research outputs found
The Bernstein Function: A Unifying Framework of Nonconvex Penalization in Sparse Estimation
In this paper we study nonconvex penalization using Bernstein functions.
Since the Bernstein function is concave and nonsmooth at the origin, it can
induce a class of nonconvex functions for high-dimensional sparse estimation
problems. We derive a threshold function based on the Bernstein penalty and
establish its mathematical properties for sparsity modeling. We show that a
coordinate descent algorithm is especially appropriate for penalized regression
problems with the Bernstein penalty. Additionally, we prove that the Bernstein
function can be defined as the concave conjugate of a $\varphi$-divergence and
develop a conjugate maximization algorithm for finding the sparse solution.
Finally, we present a family of Bernstein nonconvex penalties based on a
generalized Gamma measure and conduct an empirical analysis of this family.
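To make the coordinate-descent structure concrete, below is a minimal sketch of cyclic coordinate descent for penalized least squares, where each univariate update reduces to applying the penalty's threshold operator. The soft-threshold placeholder is an assumption standing in for the paper's Bernstein-penalty threshold, which is not reproduced here.

```python
import numpy as np

def soft_threshold(z, lam):
    # Placeholder rule; the paper derives a Bernstein-penalty threshold
    # with less bias on large coefficients than this soft rule.
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def coordinate_descent(X, y, lam, threshold=soft_threshold, n_iter=100):
    """Cyclic coordinate descent for (1/2n)||y - X b||^2 + penalty.

    Each coordinate update is a univariate problem solved in closed form
    by the penalty's threshold operator -- the structure that makes
    coordinate descent attractive for threshold-based penalties.
    """
    n, p = X.shape
    beta = np.zeros(p)
    col_norm2 = (X ** 2).sum(axis=0) / n
    r = y - X @ beta  # residual
    for _ in range(n_iter):
        for j in range(p):
            r += X[:, j] * beta[j]             # remove j's contribution
            zj = X[:, j] @ r / n               # partial-residual correlation
            beta[j] = threshold(zj / col_norm2[j], lam / col_norm2[j])
            r -= X[:, j] * beta[j]             # restore updated contribution
    return beta
```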
Likelihood Adaptively Modified Penalties
A new family of penalty functions, adaptive to likelihood, is introduced for
model selection in general regression models. It arises naturally from
assuming certain types of prior distributions on the regression parameters. To
study stability properties of the penalized maximum likelihood estimator, two
types of asymptotic stability are defined. Theoretical properties, including
the parameter estimation consistency, model selection consistency, and
asymptotic stability, are established under suitable regularity conditions. An
efficient coordinate-descent algorithm is proposed. Simulation results and real
data analysis show that the proposed method has competitive performance in
comparison with existing ones.Comment: 42 pages, 4 figure
One-step estimator paths for concave regularization
The statistics literature of the past 15 years has established many favorable
properties for sparse diminishing-bias regularization: techniques which can
roughly be understood as providing estimation under penalty functions spanning
the range of concavity between $\ell_0$ and $\ell_1$ norms. However, lasso
$\ell_1$-regularized estimation remains the standard tool for industrial `Big
Data' applications because of its minimal computational cost and the presence
of easy-to-apply rules for penalty selection. In response, this article
proposes a simple new algorithm framework that requires no more computation
than a lasso path: the path of one-step estimators (POSE) does penalized
regression estimation on a grid of decreasing penalties, but adapts
coefficient-specific weights to decrease as a function of the coefficient
estimated in the previous path step. This provides sparse diminishing-bias
regularization at no extra cost over the fastest lasso algorithms. Moreover,
our `gamma lasso' implementation of POSE is accompanied by a reliable heuristic
for the fit degrees of freedom, so that standard information criteria can be
applied in penalty selection. We also provide novel results on the distance
between weighted-$\ell_1$ and $\ell_0$ penalized predictors; this allows us to build
intuition about POSE and other diminishing-bias regularization schemes. The
methods and results are illustrated in extensive simulations and in application
of logistic regression to evaluating the performance of hockey players.
Comment: Data and code are in the gamlr package for R. Supplemental appendix
is at https://github.com/TaddyLab/pose/raw/master/paper/supplemental.pd
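A rough sketch of the POSE idea follows: lasso fits over a decreasing penalty grid, with each coefficient's penalty weight shrinking as its previous-step estimate grows. The weight rule w_j = 1/(1 + gamma*|beta_j|) and the use of scikit-learn's Lasso are assumptions for illustration; the authors' implementation is the gamlr package for R.

```python
import numpy as np
from sklearn.linear_model import Lasso

def pose_path(X, y, lambdas, gamma=1.0):
    """Sketch of a path of one-step estimators (POSE).

    Runs lasso fits over a decreasing penalty grid, reweighting each
    coefficient's penalty to shrink as its previous-step estimate grows,
    which yields diminishing-bias regularization at lasso-path cost.
    """
    p = X.shape[1]
    beta = np.zeros(p)
    path = []
    for lam in sorted(lambdas, reverse=True):     # decreasing penalties
        w = 1.0 / (1.0 + gamma * np.abs(beta))    # coefficient-specific weights
        # Weighted-L1 fit via rescaling: penalizing lam * w_j|beta_j| equals
        # a plain lasso on columns X_j / w_j, then undoing the scaling.
        Xs = X / w
        fit = Lasso(alpha=lam, fit_intercept=False, max_iter=10000).fit(Xs, y)
        beta = fit.coef_ / w
        path.append((lam, beta.copy()))
    return path
```

The rescaling trick works because substituting u_j = w_j * beta_j leaves the fit term unchanged while turning the weighted penalty into an ordinary $\ell_1$ norm on u.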
Efficient Graph Laplacian Estimation by Proximal Newton
The Laplacian-constrained Gaussian Markov Random Field (LGMRF) is a common
multivariate statistical model for learning a weighted sparse dependency graph
from given data. This graph learning problem can be formulated as a maximum
likelihood estimation (MLE) of the precision matrix, subject to Laplacian
structural constraints, with a sparsity-inducing penalty term. This paper aims
to solve this learning problem accurately and efficiently. First, since the
commonly used $\ell_1$-norm penalty is inappropriate in this setting and may
lead to a complete graph, we employ the nonconvex minimax concave penalty
(MCP), which promotes sparse solutions with lower estimation bias. Second, as
opposed to existing first-order methods for this problem, we develop a
second-order proximal Newton approach to obtain an efficient solver, utilizing
several algorithmic features, such as using Conjugate Gradients,
preconditioning, and splitting to active/free sets. Numerical experiments
demonstrate the advantages of the proposed method in terms of both
computational complexity and graph learning accuracy compared to existing
methods.
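Since the MCP is named explicitly, a short sketch of the penalty and its scalar proximal (firm-thresholding) operator may help; the paper's proximal Newton machinery (conjugate gradients, preconditioning, active/free-set splitting) is not reproduced, and the default gamma value is an assumption.

```python
import numpy as np

def mcp_penalty(t, lam, gamma=3.0):
    """Minimax concave penalty (MCP): lam*|t| - t^2/(2*gamma) up to
    |t| = gamma*lam, then constant at gamma*lam^2/2. Unlike the l1 norm,
    it stops penalizing large weights, reducing estimation bias."""
    a = np.abs(t)
    return np.where(a <= gamma * lam,
                    lam * a - a ** 2 / (2.0 * gamma),
                    0.5 * gamma * lam ** 2)

def mcp_prox(z, lam, gamma=3.0):
    """Scalar proximal operator of MCP ('firm thresholding', valid for
    gamma > 1 with unit step): the elementwise building block a proximal
    method applies between steps on the smooth likelihood term."""
    a = np.abs(z)
    shrunk = np.sign(z) * np.maximum(a - lam, 0.0) * gamma / (gamma - 1.0)
    return np.where(a <= gamma * lam, shrunk, z)
```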
- …