Robust Estimation and Wavelet Thresholding in Partial Linear Models
This paper is concerned with a semiparametric partially linear regression
model with unknown regression coefficients, an unknown nonparametric function
for the non-linear component, and unobservable Gaussian distributed random
errors. We present a wavelet thresholding based estimation procedure to
estimate the components of the partial linear model by establishing a
connection between an ℓ1-penalty based wavelet estimator of the
nonparametric component and Huber's M-estimation of a standard linear model
with outliers. Some general results on the large sample properties of the
estimates of both the parametric and the nonparametric part of the model are
established. Simulations and a real example are used to illustrate the general
results and to compare the proposed methodology with other methods available in
the recent literature.
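The ℓ1/Huber connection the abstract alludes to can be stated concretely: Huber's loss equals the optimal value of the ℓ1-penalized least-squares problem, whose minimizer is the soft-thresholding rule. A minimal numerical check (the function names and the NumPy implementation are ours, for illustration only, not the paper's estimator):

```python
import numpy as np

def soft_threshold(y, lam):
    """Soft-thresholding: the minimizer of 0.5*(y - t)**2 + lam*|t| over t."""
    return np.sign(y) * np.maximum(np.abs(y) - lam, 0.0)

def huber_loss(y, lam):
    """Huber's loss: quadratic for |y| <= lam, linear beyond."""
    return np.where(np.abs(y) <= lam, 0.5 * y**2, lam * np.abs(y) - 0.5 * lam**2)

lam = 1.0
y = np.linspace(-3, 3, 61)
t = soft_threshold(y, lam)
penalized_value = 0.5 * (y - t) ** 2 + lam * np.abs(t)
print(np.allclose(penalized_value, huber_loss(y, lam)))  # True: the objectives coincide
```

This identity is what lets residuals of an ℓ1-penalized fit behave like Huber M-estimation residuals, which is the bridge between wavelet thresholding and robustness to outliers.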
Properties of principal component methods for functional and longitudinal data analysis
The use of principal component methods to analyze functional data is
appropriate in a wide range of different settings. In studies of "functional
data analysis," it has often been assumed that a sample of random functions is
observed precisely, in the continuum and without noise. While this has been the
traditional setting for functional data analysis, in the context of
longitudinal data analysis a random function typically represents a patient, or
subject, who is observed at only a small number of randomly distributed points,
with nonnegligible measurement error. Nevertheless, essentially the same
methods can be used in both these cases, as well as in the vast number of
settings that lie between them. How is performance affected by the sampling
plan? In this paper we answer that question. We show that if there is a sample
of n functions, or subjects, then estimation of eigenvalues is a
semiparametric problem, with root-n consistent estimators, even if only a few
observations are made of each function, and if each observation is encumbered
by noise. However, estimation of eigenfunctions becomes a nonparametric problem
when observations are sparse. The optimal convergence rates in this case are
those which pertain to more familiar function-estimation settings. We also
describe the effects of sampling at regularly spaced points, as opposed to
random points. In particular, it is shown that there are often advantages in
sampling randomly. However, even in the case of noisy data there is a threshold
sampling rate (depending on the number of functions treated) above which the
rate of sampling (either randomly or regularly) has negligible impact on
estimator performance, no matter whether eigenfunctions or eigenvalues are
being estimated.

Published at http://dx.doi.org/10.1214/009053606000000272 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org).
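The sparse-sampling regime discussed above can be sketched numerically: simulate subjects observed at only a few noisy points each, pool cross-products across subjects to estimate the covariance surface on a grid, and read eigenvalues and eigenfunctions off its eigendecomposition. All modeling choices below (a two-component Gaussian process, a known noise variance that is subtracted from the diagonal) are assumptions of this toy illustration, not the paper's methodology:

```python
import numpy as np

rng = np.random.default_rng(0)

# --- simulate sparse, noisy functional data (illustrative assumptions) ---
G = 21                                        # evaluation grid on [0, 1]
grid = np.linspace(0.0, 1.0, G)
phi1 = np.sqrt(2) * np.sin(np.pi * grid)      # true first eigenfunction
phi2 = np.sqrt(2) * np.sin(2 * np.pi * grid)  # true second eigenfunction
lam_true = (4.0, 1.0)                         # true eigenvalues
n, m, sigma = 3000, 5, 0.5                    # subjects, points per subject, noise sd

C_sum = np.zeros((G, G))
C_cnt = np.zeros((G, G))
for _ in range(n):
    idx = rng.choice(G, size=m, replace=False)       # few random design points
    scores = rng.normal(0, np.sqrt(lam_true))        # principal component scores
    y = scores[0] * phi1[idx] + scores[1] * phi2[idx] + rng.normal(0, sigma, m)
    C_sum[np.ix_(idx, idx)] += np.outer(y, y)        # pool pairwise products
    C_cnt[np.ix_(idx, idx)] += 1.0

C = C_sum / np.maximum(C_cnt, 1)       # raw covariance (mean-zero process assumed)
C[np.diag_indices(G)] -= sigma**2      # noise inflates only the diagonal; the
                                       # noise variance is known here, but must
                                       # be estimated in practice

evals, evecs = np.linalg.eigh(C)
dt = grid[1] - grid[0]
lam_hat = evals[::-1] * dt             # Riemann-sum scaling to operator eigenvalues
psi1 = evecs[:, -1] / np.sqrt(dt)      # estimated first eigenfunction (sign arbitrary)
print(lam_hat[:2])
```

Even with only m = 5 observations per subject, pooling across n subjects recovers the leading eigenvalue near 4 and an eigenfunction estimate highly correlated with the truth, consistent with the root-n eigenvalue result quoted above.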
Automatic Debiased Machine Learning of Causal and Structural Effects
Many causal and structural effects depend on regressions. Examples include
average treatment effects, policy effects, average derivatives, regression
decompositions, economic average equivalent variation, and parameters of
economic structural models. The regressions may be high dimensional. Plugging
machine learners into identifying equations can lead to poor inference due to
bias and/or model selection. This paper gives automatic debiasing for
estimating equations and valid asymptotic inference for the estimators of
effects of interest. The debiasing is automatic in that its construction uses
the identifying equations without the full form of the bias correction and is
performed by machine learning. Novel results include convergence rates for
Lasso and Dantzig learners of the bias correction, primitive conditions for
asymptotic inference for important examples, and general conditions for GMM. A
variety of regression learners and identifying equations are covered. Automatic
debiased machine learning (Auto-DML) is applied to estimating the average
treatment effect on the treated for the NSW job training data and to estimating
demand elasticities from Nielsen scanner data while allowing preferences to be
correlated with prices and income.
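The "plug-in regression plus learned bias correction" recipe can be illustrated with its best-known special case: the doubly robust (AIPW) score for the average treatment effect, where the correction term is the inverse-propensity weight that an automatic procedure would learn from the identifying equation. Everything below (the simulated data, the hand-rolled OLS and logistic learners, two-fold cross-fitting) is an assumption of this sketch, not the paper's Lasso/Dantzig construction or its empirical applications:

```python
import numpy as np

rng = np.random.default_rng(1)

# --- simulated observational data (assumed model, for illustration only) ---
n = 5000
X = rng.normal(size=n)
p = 1 / (1 + np.exp(-X))                 # true propensity score
D = rng.binomial(1, p)                   # treatment indicator
Y = 2.0 * D + X + rng.normal(size=n)     # outcome; true ATE = 2

def fit_ols(Z, y):
    return np.linalg.lstsq(Z, y, rcond=None)[0]

def fit_logit(Z, d, steps=500, lr=0.1):
    """Plain gradient ascent on the logistic log-likelihood."""
    w = np.zeros(Z.shape[1])
    for _ in range(steps):
        w += lr * Z.T @ (d - 1 / (1 + np.exp(-Z @ w))) / len(d)
    return w

# --- cross-fitted AIPW: plug-in difference plus bias-correction term ---
folds = np.array_split(rng.permutation(n), 2)
psi = np.empty(n)
for k in (0, 1):
    te, tr = folds[k], folds[1 - k]
    beta = fit_ols(np.column_stack([np.ones(len(tr)), D[tr], X[tr]]), Y[tr])
    w = fit_logit(np.column_stack([np.ones(len(tr)), X[tr]]), D[tr])
    mu1 = np.column_stack([np.ones(len(te)), np.ones(len(te)), X[te]]) @ beta
    mu0 = np.column_stack([np.ones(len(te)), np.zeros(len(te)), X[te]]) @ beta
    e = 1 / (1 + np.exp(-(np.column_stack([np.ones(len(te)), X[te]]) @ w)))
    # plug-in difference + correction weighted by the (learned) Riesz representer
    psi[te] = (mu1 - mu0
               + D[te] * (Y[te] - mu1) / e
               - (1 - D[te]) * (Y[te] - mu0) / (1 - e))

ate, se = psi.mean(), psi.std(ddof=1) / np.sqrt(n)
print(round(ate, 2), round(se, 3))
```

The point of the automatic construction in the paper is that the correction term (here written down by hand as an inverse-propensity weight) is instead learned from the identifying equation, so the same code pattern extends to policy effects, average derivatives, and the other functionals listed above.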
New Developments in Functional and Highly Multivariate Statistical Methodology
The central focus of the workshop was on recent developments in statistical techniques for highly multivariate data and functional data. The programme delivered talks on state-of-the-art research in the area, with a number of talks on high-dimensional multivariate settings as well as talks dealing with functional data. The talks were followed by lively discussions on how to tackle difficult issues in the statistical methodology for such complex data.
A Selective Review of Group Selection in High-Dimensional Models
Grouping structures arise naturally in many statistical modeling problems.
Several methods have been proposed for variable selection that respect grouping
structure in variables. Examples include the group LASSO and several concave
group selection methods. In this article, we give a selective review of group
selection concerning methodological developments, theoretical properties and
computational algorithms. We pay particular attention to group selection
methods involving concave penalties. We address both group selection and
bi-level selection methods. We describe several applications of these methods
in nonparametric additive models, semiparametric regression, seemingly
unrelated regressions, genomic data analysis and genome-wide association
studies. We also highlight some issues that require further study.

Published at http://dx.doi.org/10.1214/12-STS392 in Statistical Science
(http://www.imstat.org/sts/) by the Institute of Mathematical Statistics
(http://www.imstat.org).
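The defining computational step behind the group LASSO mentioned above is its proximal operator, which shrinks each group's coefficient vector as a block, so that the variables in a group enter or leave the model together. A minimal sketch (the function name is ours, not from any library):

```python
import numpy as np

def group_soft_threshold(v, lam):
    """Proximal operator of lam * ||v||_2: shrink the whole group toward zero,
    zeroing it out entirely when its norm falls below lam."""
    norm = np.linalg.norm(v)
    if norm <= lam:
        return np.zeros_like(v)
    return (1.0 - lam / norm) * v

# A group either survives with its direction intact or is removed as a unit:
print(group_soft_threshold(np.array([3.0, 4.0]), 1.0))  # [2.4 3.2], scaled by 0.8
print(group_soft_threshold(np.array([0.3, 0.4]), 1.0))  # [0. 0.], group dropped
```

Concave group penalties such as group SCAD or group MCP replace the constant shrinkage factor with one that fades as the group norm grows, which is the source of their reduced bias relative to the group LASSO.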
Impossibility Results for Nondifferentiable Functionals
We examine challenges to estimation and inference when the objects of interest are nondifferentiable functionals of the underlying data distribution. This situation arises in a number of applications of bounds analysis and moment inequality models, and in recent work on estimating optimal dynamic treatment regimes. Drawing on earlier work relating differentiability to the existence of unbiased and regular estimators, we show that if the target object is not continuously differentiable in the parameters of the data distribution, there exist no locally asymptotically unbiased estimators and no regular estimators. This places strong limits on estimators, bias correction methods, and inference procedures.

Keywords: bounds analysis; moment inequality models; treatment effects; limits of experiments.
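The phenomenon is easy to see in the simplest kinked functional, max(mu1, mu2): at the kink mu1 = mu2 the plug-in estimator max(X̄1, X̄2) carries an upward bias of order 1/sqrt(n), i.e. non-negligible on the sqrt(n) scale used for inference. A Monte Carlo sketch under assumed Gaussian sampling (our illustration, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(2)

# Two populations with equal means: the functional max(mu1, mu2) has a kink here.
mu1 = mu2 = 0.0
n, reps = 100, 20000

xbar1 = rng.normal(mu1, 1.0 / np.sqrt(n), reps)   # sample means, variance 1/n
xbar2 = rng.normal(mu2, 1.0 / np.sqrt(n), reps)
plug_in = np.maximum(xbar1, xbar2)                # plug-in estimate of max(mu1, mu2)

bias = plug_in.mean() - max(mu1, mu2)
print(round(np.sqrt(n) * bias, 2))                # near 1/sqrt(pi), not 0
```

The scaled bias converges to E[max(Z1, Z2)] = 1/sqrt(pi) ≈ 0.56 for independent standard normals rather than to zero, so no amount of data removes it at the sqrt(n) scale, which is the local-asymptotic-unbiasedness failure the abstract describes.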