Conditional Transformation Models
The ultimate goal of regression analysis is to obtain information about the
conditional distribution of a response given a set of explanatory variables.
This goal is, however, seldom achieved because most established regression
models only estimate the conditional mean as a function of the explanatory
variables and assume that higher moments are not affected by the regressors.
The underlying reason for such a restriction is the assumption of additivity of
signal and noise. We propose to relax this common assumption in the framework
of transformation models. The novel class of semiparametric regression models
proposed herein allows transformation functions to depend on explanatory
variables. These transformation functions are estimated by regularised
optimisation of scoring rules for probabilistic forecasts, e.g. the continuous
ranked probability score. The corresponding estimated conditional distribution
functions are consistent. Conditional transformation models are potentially
useful for describing possible heteroscedasticity, comparing spatially varying
distributions, identifying extreme events, deriving prediction intervals and
selecting variables beyond mean regression effects. An empirical investigation
based on a heteroscedastic varying coefficient simulation model demonstrates
that semiparametric estimation of conditional distribution functions can be
more beneficial than kernel-based non-parametric approaches or parametric
generalised additive models for location, scale and shape (GAMLSS).
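The abstract names the continuous ranked probability score (CRPS) as an example of a scoring rule for probabilistic forecasts. As an illustration of what that score measures (not of the authors' estimation procedure), the sketch below evaluates the closed-form CRPS of a Gaussian predictive distribution at an observed value:

```python
import math

def crps_gaussian(y, mu, sigma):
    """Closed-form CRPS of a Gaussian forecast N(mu, sigma^2) at observation y.
    Smaller is better; the score rewards both calibration and sharpness."""
    z = (y - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)   # standard normal density at z
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))          # standard normal CDF at z
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))
```

A perfectly centred forecast still incurs a positive score (about 0.234 sigma), reflecting the spread of the predictive distribution; the score grows as the observation drifts into the forecast's tails.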
Confidence Corridors for Multivariate Generalized Quantile Regression
We focus on the construction of confidence corridors for multivariate
nonparametric generalized quantile regression functions. This construction is
based on asymptotic results for the maximal deviation between a suitable
nonparametric estimator and the true function of interest which follow after a
series of approximation steps including a Bahadur representation, a new strong
approximation theorem and exponential tail inequalities for Gaussian random
fields. As a byproduct we also obtain confidence corridors for the regression
function in the classical mean regression. In order to deal with the problem of
slowly decreasing error in coverage probability of the asymptotic confidence
corridors, which results in meager coverage for small sample sizes, a simple
bootstrap procedure is designed based on the leading term of the Bahadur
representation. The finite sample properties of both procedures are
investigated by means of a simulation study and it is demonstrated that the
bootstrap procedure considerably outperforms the asymptotic bands in terms of
coverage accuracy. Finally, the bootstrap confidence corridors are used to
study the efficacy of the National Supported Work Demonstration, which is a
randomized employment enhancement program launched in the 1970s. This article
has supplementary materials.
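The abstract's bootstrap procedure resamples the leading term of the Bahadur representation to build simultaneous corridors; the toy sketch below shows only the much simpler pointwise analogue, a percentile bootstrap for a single sample quantile (function names are illustrative, not from the paper):

```python
import random

def sample_quantile(xs, q):
    """Empirical q-quantile taken as a sorted-order statistic."""
    xs = sorted(xs)
    idx = min(int(q * len(xs)), len(xs) - 1)
    return xs[idx]

def bootstrap_quantile_ci(data, q=0.5, n_boot=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for the q-quantile.
    Pointwise only; the paper's corridors are simultaneous over covariates."""
    rng = random.Random(seed)
    n = len(data)
    stats = sorted(
        sample_quantile([data[rng.randrange(n)] for _ in range(n)], q)
        for _ in range(n_boot)
    )
    lo = stats[int(alpha / 2 * n_boot)]
    hi = stats[min(int((1 - alpha / 2) * n_boot), n_boot - 1)]
    return lo, hi
```

The bootstrap distribution of the resampled quantile directly supplies finite-sample critical values, which is the same motivation the abstract gives for preferring the bootstrap bands over the slowly converging asymptotic ones.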
Adaptive robust variable selection
Heavy-tailed high-dimensional data are commonly encountered in various
scientific fields and pose great challenges to modern statistical analysis. A
natural procedure to address this problem is to use penalized quantile
regression with the weighted L1-penalty, called the weighted robust Lasso
(WR-Lasso), in which weights are introduced to ameliorate the bias problem
induced by the L1-penalty. In the ultra-high dimensional setting, where the
dimensionality can grow exponentially with the sample size, we investigate the
model selection oracle property and establish the asymptotic normality of the
WR-Lasso. We show that only mild conditions on the model error distribution are
needed. Our theoretical results also reveal that adaptive choice of the weight
vector is essential for the WR-Lasso to enjoy these nice asymptotic properties.
To make the WR-Lasso practically feasible, we propose a two-step procedure,
called adaptive robust Lasso (AR-Lasso), in which the weight vector in the
second step is constructed based on the L1-penalized quantile regression
estimate from the first step. This two-step procedure is justified
theoretically to possess the oracle property and the asymptotic normality.
Numerical studies demonstrate the favorable finite-sample performance of the
AR-Lasso.
Comment: Published at http://dx.doi.org/10.1214/13-AOS1191 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org).
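The two-step adaptive idea (fit once with uniform weights, then reweight the penalty inversely to the first-step coefficient magnitudes) can be sketched in pure Python. This is only an illustration: the paper penalizes the quantile check loss, whereas the sketch uses squared loss so that plain coordinate descent with soft-thresholding applies, and `weighted_lasso`/`ar_lasso` are made-up names:

```python
def weighted_lasso(X, y, weights, lam, n_iter=200):
    """Coordinate-descent Lasso with per-coefficient penalty weights.
    Squared loss stand-in for the paper's quantile-loss objective."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # partial residual with feature j removed from the fit
            r = [y[i] - sum(X[i][k] * beta[k] for k in range(p) if k != j)
                 for i in range(n)]
            rho = sum(X[i][j] * r[i] for i in range(n))
            z = sum(X[i][j] ** 2 for i in range(n))
            t = lam * weights[j]
            # soft-thresholding update for coordinate j
            if rho > t:
                beta[j] = (rho - t) / z
            elif rho < -t:
                beta[j] = (rho + t) / z
            else:
                beta[j] = 0.0
    return beta

def ar_lasso(X, y, lam, eps=1e-3):
    """Two-step adaptive scheme: uniform weights first, then weights
    inversely proportional to first-step coefficient magnitudes."""
    p = len(X[0])
    beta0 = weighted_lasso(X, y, [1.0] * p, lam)
    w = [1.0 / (abs(b) + eps) for b in beta0]
    return weighted_lasso(X, y, w, lam)
```

Coefficients that are large after step one receive small weights and hence little shrinkage in step two, while coefficients near zero are penalized heavily, which is the bias-reduction mechanism the abstract attributes to the adaptive weight vector.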