Pivotal estimation via square-root Lasso in nonparametric regression
We propose a self-tuning $\sqrt{\text{Lasso}}$ method that simultaneously
resolves three important practical problems in high-dimensional regression
analysis: it handles the unknown scale, heteroscedasticity, and (drastic)
non-Gaussianity of the noise. In addition, our analysis allows for badly
behaved designs, for example, perfectly collinear regressors, and generates
sharp bounds even in extreme cases, such as the infinite variance case and the
noiseless case, in contrast to Lasso. We establish various nonasymptotic bounds
for $\sqrt{\text{Lasso}}$, including the prediction norm rate and sparsity. Our
analysis is based on new impact factors that are tailored for bounding
prediction norm. In order to cover heteroscedastic non-Gaussian noise, we rely
on moderate deviation theory for self-normalized sums to achieve Gaussian-like
results under weak conditions. Moreover, we derive bounds on the performance of
ordinary least squares (ols) applied to the model selected by
$\sqrt{\text{Lasso}}$, accounting for possible misspecification of the selected
model. Under mild conditions, the rate of convergence of ols post
$\sqrt{\text{Lasso}}$ is as good as $\sqrt{\text{Lasso}}$'s rate. As an
application, we consider the use of $\sqrt{\text{Lasso}}$ and ols post
$\sqrt{\text{Lasso}}$ as estimators of nuisance parameters in a generic
semiparametric problem (nonlinear moment condition or $Z$-problem), resulting
in a construction of $\sqrt{n}$-consistent and asymptotically normal estimators
of the main
parameters.
Comment: Published at http://dx.doi.org/10.1214/14-AOS1204 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org)
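
For reference, a minimal statement of the estimator may help. The
$\sqrt{\text{Lasso}}$ replaces the Lasso's squared-error loss by its square
root, which is what makes the penalty level pivotal: a choice of $\lambda$
delivering the rates above does not require knowing the noise scale $\sigma$.
With data $(y_i, x_i)_{i=1}^n$ and penalty level $\lambda$,
$$\hat{\beta} \in \arg\min_{\beta \in \mathbb{R}^p} \;
\sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - x_i^\top \beta)^2}
\; + \; \frac{\lambda}{n} \|\beta\|_1 .$$
Heuristically, the square-root loss self-normalizes the score by the residual
scale, so the same $\lambda$ works whatever $\sigma$ is; this is the
"self-tuning" property claimed above.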
Sharp thresholds for high-dimensional and noisy recovery of sparsity
The problem of consistently estimating the sparsity pattern of a vector
$\beta^* \in \mathbb{R}^p$ based on observations contaminated by noise arises
in various contexts, including subset selection in regression, structure
estimation in graphical models, sparse approximation, and signal denoising. We
analyze the behavior of $\ell_1$-constrained quadratic programming (QP), also
referred to as the Lasso, for recovering the sparsity pattern. Our main result
is to establish a sharp relation between the problem dimension $p$, the number
$k$ of non-zero elements in $\beta^*$, and the number of observations $n$ that
are required for reliable recovery. For a broad class of Gaussian ensembles
satisfying mutual incoherence conditions, we establish existence and compute
explicit values of thresholds $\theta_\ell$ and $\theta_u$ with the following
properties: for any $\epsilon > 0$, if
$n > 2(\theta_u + \epsilon)\, k \log(p - k) + k + 1$, then the Lasso succeeds
in recovering the sparsity pattern with probability converging to one for large
problems, whereas for $n < 2(\theta_\ell - \epsilon)\, k \log(p - k) + k + 1$,
the probability of successful recovery converges to zero. For the special case
of the uniform Gaussian ensemble, we show that $\theta_\ell = \theta_u = 1$, so
that the threshold is sharp and exactly determined.
Comment: Appeared as Technical Report 708, Department of Statistics, UC
Berkeley
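
A quick numeric illustration may help fix the scale of this threshold (the
particular $p$ and $k$ are arbitrary illustrative choices, not values from the
paper). For the uniform Gaussian ensemble, $\theta_\ell = \theta_u = 1$, so the
critical sample size is
$$n^*(p, k) = 2k \log(p - k) + k + 1 .$$
For $p = 1000$ and $k = 10$ this gives
$n^* = 20 \log(990) + 11 \approx 20 \cdot 6.90 + 11 \approx 149$: with
appreciably more than 149 observations the Lasso recovers the support with
probability tending to one, and with appreciably fewer it fails with
probability tending to one.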
Large-scale Nonlinear Variable Selection via Kernel Random Features
We propose a new method for input variable selection in nonlinear regression.
The method is embedded into a kernel regression machine that can model general
nonlinear functions, not being a priori limited to additive models. This is the
first kernel-based variable selection method applicable to large datasets. It
sidesteps the typical poor scaling properties of kernel methods by mapping the
inputs into a relatively low-dimensional space of random features. The
algorithm discovers the variables relevant to the regression task while it
learns the prediction model, by learning appropriate nonlinear random feature
maps. We demonstrate the outstanding performance of our method
on a set of large-scale synthetic and real datasets.
Comment: Final version for proceedings of ECML/PKDD 2018
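
The scalability claim rests on the classical random Fourier feature
construction: approximate a shift-invariant kernel by an explicit
low-dimensional random map, then fit a linear model in that space. The sketch
below shows only this backbone with a plain ridge solve; the per-variable
`relevance` weights, the synthetic data, and all dimensions are illustrative
assumptions, and the paper's actual contribution, learning those weights
jointly with the model, is not implemented here.

    import numpy as np

    rng = np.random.default_rng(0)

    def rff_map(X, W, b):
        """Random Fourier features approximating a Gaussian (RBF) kernel."""
        D = W.shape[1]
        return np.sqrt(2.0 / D) * np.cos(X @ W + b)

    # Synthetic data: only the first 2 of 10 inputs matter (an assumption).
    n, d, D = 2000, 10, 300
    X = rng.normal(size=(n, d))
    y = np.sin(X[:, 0]) + X[:, 1] ** 2 + 0.1 * rng.normal(size=n)

    # Per-variable relevance weights scale each input before the random
    # projection; driving a weight to zero removes that variable from the
    # model. The paper *learns* these weights; they are fixed here.
    relevance = np.ones(d)
    W = rng.normal(size=(d, D)) * relevance[:, None]
    b = rng.uniform(0, 2 * np.pi, size=D)

    # Ridge regression in the D-dimensional feature space (closed form),
    # instead of forming an n x n kernel matrix.
    Z = rff_map(X, W, b)
    lam = 1e-3
    alpha = np.linalg.solve(Z.T @ Z + lam * np.eye(D), Z.T @ y)
    pred = Z @ alpha
    print("train RMSE:", np.sqrt(np.mean((pred - y) ** 2)))

The design point is that all cost is linear in $n$ for fixed $D$, which is what
lets a kernel-style method run on large datasets.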
Selective inference after feature selection via multiscale bootstrap
It is common to show the confidence intervals or $p$-values of selected
features, or predictor variables in regression, but they often involve
selection bias. The selective inference approach solves this bias by
conditioning on the selection event. Most existing studies of selective
inference consider a specific algorithm, such as Lasso, for feature selection,
and thus they have difficulties in handling more complicated algorithms.
Moreover, existing studies often consider unnecessarily restrictive events,
leading to over-conditioning and lower statistical power. Our novel and
widely-applicable resampling method addresses these issues to compute an
approximately unbiased selective $p$-value for the selected features. We prove
that the $p$-value computed by our resampling method is more accurate and more
powerful than existing methods, while the computational cost is the same order
as the classical bootstrap method. Numerical experiments demonstrate that our
algorithm works well even for more complicated feature selection methods such
as non-convex regularization.
Comment: The title has changed (the previous title was "Selective inference
after variable selection via multiscale bootstrap"). 23 pages, 11 figures
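
The bias the abstract refers to, and the conditioning idea that fixes it, can
be seen in a toy Monte Carlo sketch. This is a generic conditional test for a
"pick the largest $|z|$" selection rule, not the paper's multiscale-bootstrap
algorithm; the Gaussian model, $m$, and the selection rule are illustrative
assumptions.

    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(1)
    m = 50                                  # number of candidate features
    z = rng.normal(size=m)                  # all means truly zero (global null)
    j = np.argmax(np.abs(z))                # selection: largest |z| wins
    z_sel = np.abs(z[j])

    # Naive p-value: ignores that z_sel is the *largest* of m statistics.
    p_naive = 2 * norm.sf(z_sel)

    # Selective p-value: condition on the selection event. Under the global
    # null with exchangeable coordinates, conditioning on "feature j was
    # selected" makes the reference distribution that of the maximum absolute
    # statistic, which we estimate by simulation.
    B = 100_000
    z_null = rng.normal(size=(B, m))
    p_selective = np.mean(np.max(np.abs(z_null), axis=1) >= z_sel)

    print(f"naive p = {p_naive:.4f}, selective p = {p_selective:.4f}")
    # Typically the naive p-value looks "significant" even though every mean
    # is zero, while the selective p-value is approximately uniform.

Conditioning on exactly the selection event, rather than a smaller
(over-restrictive) event, is what preserves power, which is the
over-conditioning issue the abstract raises.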