Quasi-Likelihood and/or Robust Estimation in High Dimensions
We consider the theory for the high-dimensional generalized linear model with
the Lasso. After a short review of theoretical results in the literature, we
present an extension of the oracle results to the case of quasi-likelihood
loss. We prove bounds for the prediction error and the $\ell_1$-error. The
results are derived under fourth moment conditions on the error distribution.
The case of robust loss is also treated. We moreover show that under an
irrepresentable condition, the $\ell_1$-penalized quasi-likelihood estimator
has no false positives.
Comment: Published at http://dx.doi.org/10.1214/12-STS397 in Statistical
Science (http://www.imstat.org/sts/) by the Institute of Mathematical
Statistics (http://www.imstat.org)
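
As a rough illustration of the kind of estimator discussed above (a sketch
under an assumed logistic-type mean function, not the authors' implementation),
an $\ell_1$-penalized quasi-likelihood fit can be computed by proximal gradient
descent; the step size and penalty level `lam` below are hypothetical tuning
choices:

```python
import numpy as np

def soft_threshold(z, t):
    # Proximal operator of the l1 penalty
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def l1_quasi_likelihood(X, y, lam, step=0.01, n_iter=2000):
    # l1-penalized quasi-likelihood via proximal gradient, using a
    # logistic-type mean function as an illustrative example.
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-X @ beta))  # assumed mean function
        grad = X.T @ (mu - y) / n             # quasi-score
        beta = soft_threshold(beta - step * grad, step * lam)
    return beta
```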
Generative Quantile Regression with Variability Penalty
Quantile regression and conditional density estimation can reveal structure
that is missed by mean regression, such as multimodality and skewness. In this
paper, we introduce a deep learning generative model for joint quantile
estimation called Penalized Generative Quantile Regression (PGQR). Our approach
simultaneously generates samples from many random quantile levels, allowing us
to infer the conditional distribution of a response variable given a set of
covariates. Our method employs a novel variability penalty to avoid the problem
of vanishing variability, or memorization, in deep generative models. Further,
we introduce a new family of partial monotonic neural networks (PMNN) to
circumvent the problem of crossing quantile curves. A major benefit of PGQR is
that it can be fit using a single optimization, thus bypassing the need to
repeatedly train the model at multiple quantile levels or use computationally
expensive cross-validation to tune the penalty parameter. We illustrate the
efficacy of PGQR through extensive simulation studies and analysis of real
datasets. Code to implement our method is available at
https://github.com/shijiew97/PGQR.
Comment: 41 pages, 17 figures, 4 tables. New version includes more simulation
studies, comparisons to competing methods, illustrations, real data
applications, and discussion of the vanishing variability phenomenon and
overparameterization in deep learning. The figures are higher-resolution, and
the presentation and writing have improved.
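
The core ingredient PGQR shares with classical quantile regression is the
check (pinball) loss, evaluated at randomly drawn quantile levels. The sketch
below shows only that loss; the deep generator, the PMNN architecture, and the
variability penalty from the paper are not reproduced, and all variable names
are illustrative:

```python
import numpy as np

def check_loss(y, y_hat, tau):
    # Pinball (check) loss at quantile level(s) tau in (0, 1)
    u = y - y_hat
    return np.mean(np.maximum(tau * u, (tau - 1.0) * u))

rng = np.random.default_rng(0)
y = rng.standard_normal(128)
tau = rng.uniform(size=128)   # one random quantile level per sample
y_hat = np.zeros(128)         # placeholder for a generator's output
loss = check_loss(y, y_hat, tau)
```

Averaging this loss over random quantile levels is what lets a single
optimization cover many quantiles at once.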
Square-Root Lasso: Pivotal Recovery of Sparse Signals via Conic Programming
We propose a pivotal method for estimating high-dimensional sparse linear
regression models, where the overall number of regressors $p$ is large,
possibly much larger than the sample size $n$, but only $s$ regressors are
significant. The method is a modification of the lasso, called the square-root
lasso. The method is pivotal in that it neither relies on knowledge of the
standard deviation $\sigma$ nor does it need to pre-estimate $\sigma$.
Moreover, the method does not rely on normality or sub-Gaussianity of the
noise. It achieves near-oracle performance, attaining the convergence rate
$\sigma\sqrt{(s/n)\log p}$ in the prediction norm, and thus matching the
performance of the lasso with known $\sigma$. These performance results are
valid for both Gaussian and non-Gaussian errors, under mild moment
restrictions. We formulate the square-root lasso as the solution to a convex
conic programming problem, which allows us to implement the estimator using
efficient algorithmic methods, such as interior-point and first-order methods.
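
The square-root lasso objective,
$\min_\beta \|y - X\beta\|_2/\sqrt{n} + \lambda\|\beta\|_1$, is a second-order
cone program because the residual norm enters without being squared. A minimal
sketch using cvxpy as an off-the-shelf conic solver (the data, the penalty
level, and the solver choice are illustrative, not the authors'
implementation):

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 500
X = rng.standard_normal((n, p))
y = X[:, :5] @ np.ones(5) + rng.standard_normal(n)

lam = 1.1 * np.sqrt(2 * np.log(p) / n)  # illustrative pivotal penalty level
beta = cp.Variable(p)
# The unsquared residual norm makes this a second-order cone program.
obj = cp.Minimize(cp.norm(y - X @ beta, 2) / np.sqrt(n) + lam * cp.norm1(beta))
cp.Problem(obj).solve()
```

Note that `lam` depends only on $n$ and $p$, not on the noise level, which is
what makes the procedure pivotal.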
Feature selection guided by structural information
In generalized linear regression problems with an abundant number of
features, lasso-type regularization, which imposes an $\ell_1$-constraint on
the regression coefficients, has become a widely established technique.
Deficiencies of the lasso in certain scenarios, notably strongly correlated
design, were unmasked when Zou and Hastie [J. Roy. Statist. Soc. Ser. B 67
(2005) 301--320] introduced the elastic net. In this paper we propose to
extend the elastic net by admitting general nonnegative quadratic constraints
as a second form of regularization. The generalized ridge-type constraint will
typically make use of the known association structure of features, for
example, by using temporal or spatial closeness. We study properties of the
resulting "structured elastic net" regression estimation procedure, including
basic asymptotics and the issue of model selection consistency. In this vein,
we provide an analog to the so-called "irrepresentable condition" which holds
for the lasso. Moreover, we outline algorithmic solutions for the structured
elastic net within the generalized linear model family. The rationale and the
performance of our approach are illustrated by means of simulated and real
world data, with a focus on signal regression.
Comment: Published at http://dx.doi.org/10.1214/09-AOAS302 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
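
For the linear model, a structured-elastic-net-type objective
$\|y - X\beta\|_2^2 + \lambda_1\|\beta\|_1 + \lambda_2\|L\beta\|_2^2$, with $L$
encoding feature structure (for instance a difference operator over temporally
or spatially close features), can be reduced to a plain lasso by data
augmentation. A minimal sketch, assuming this penalized form and using
sklearn's Lasso as the inner solver (the function name and alpha rescaling are
assumptions, not the authors' algorithm):

```python
import numpy as np
from sklearn.linear_model import Lasso

def structured_elastic_net(X, y, L, lam1, lam2):
    # Solve ||y - X b||^2 + lam1 ||b||_1 + lam2 ||L b||^2 by augmenting
    # the design with sqrt(lam2) * L and the response with zeros, which
    # folds the quadratic structure penalty into the least-squares term.
    X_aug = np.vstack([X, np.sqrt(lam2) * L])
    y_aug = np.concatenate([y, np.zeros(L.shape[0])])
    # sklearn's Lasso minimizes (1/(2m)) ||y - X b||^2 + alpha ||b||_1,
    # so rescale lam1 by the augmented sample size m.
    m = X_aug.shape[0]
    fit = Lasso(alpha=lam1 / (2 * m), fit_intercept=False).fit(X_aug, y_aug)
    return fit.coef_
```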