
    Robust Estimation and Wavelet Thresholding in Partial Linear Models

    This paper is concerned with a semiparametric partially linear regression model with unknown regression coefficients, an unknown nonparametric function for the non-linear component, and unobservable Gaussian distributed random errors. We present a wavelet thresholding based estimation procedure to estimate the components of the partial linear model by establishing a connection between an l_1-penalty based wavelet estimator of the nonparametric component and Huber's M-estimation of a standard linear model with outliers. Some general results on the large sample properties of the estimates of both the parametric and the nonparametric part of the model are established. Simulations and a real example are used to illustrate the general results and to compare the proposed methodology with other methods available in the recent literature.
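The connection the abstract mentions can be made concrete in a minimal numpy sketch (illustrative, not the authors' code): minimizing 0.5*(r - u)^2 + lam*|u| over an "outlier" variable u yields the soft-thresholding operator, and the attained minimum is exactly Huber's loss in the residual r.

```python
import numpy as np

def soft_threshold(x, lam):
    """Prox of lam*|.|: argmin over u of 0.5*(x - u)**2 + lam*|u|."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def huber(r, lam):
    """Huber's rho function with threshold lam."""
    a = np.abs(r)
    return np.where(a <= lam, 0.5 * a**2, lam * a - 0.5 * lam**2)

# Partially minimizing the l1-penalized objective over u recovers Huber's
# loss: the minimizer is u* = soft_threshold(r, lam), and the minimum value
# equals huber(r, lam) for every residual r.
r = np.linspace(-5, 5, 101)
u_star = soft_threshold(r, 1.0)
attained = 0.5 * (r - u_star) ** 2 + 1.0 * np.abs(u_star)
assert np.allclose(attained, huber(r, 1.0))
```

This identity is why an l_1-penalized wavelet fit of the nonparametric part behaves like an M-estimate that is robust to outliers: small residuals are treated quadratically, large ones linearly.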

    Properties of principal component methods for functional and longitudinal data analysis

    The use of principal component methods to analyze functional data is appropriate in a wide range of different settings. In studies of "functional data analysis," it has often been assumed that a sample of random functions is observed precisely, in the continuum and without noise. While this has been the traditional setting for functional data analysis, in the context of longitudinal data analysis a random function typically represents a patient, or subject, who is observed at only a small number of randomly distributed points, with nonnegligible measurement error. Nevertheless, essentially the same methods can be used in both these cases, as well as in the vast number of settings that lie between them. How is performance affected by the sampling plan? In this paper we answer that question. We show that if there is a sample of n functions, or subjects, then estimation of eigenvalues is a semiparametric problem, with root-n consistent estimators, even if only a few observations are made of each function, and if each observation is encumbered by noise. However, estimation of eigenfunctions becomes a nonparametric problem when observations are sparse. The optimal convergence rates in this case are those which pertain to more familiar function-estimation settings. We also describe the effects of sampling at regularly spaced points, as opposed to random points. In particular, it is shown that there are often advantages in sampling randomly. However, even in the case of noisy data there is a threshold sampling rate (depending on the number of functions treated) above which the rate of sampling (either randomly or regularly) has negligible impact on estimator performance, no matter whether eigenfunctions or eigenvalues are being estimated.

    Published at http://dx.doi.org/10.1214/009053606000000272 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).
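The densely observed end of the spectrum the abstract describes can be sketched in a few lines of numpy (a toy illustration, not the paper's method): simulate curves from a two-term Karhunen-Loeve expansion with noise, estimate the covariance surface on the grid, and eigendecompose it. All model choices below (eigenfunctions, eigenvalues 4 and 1, noise level) are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 200, 50                        # n curves observed on a grid of m points
t = np.linspace(0, 1, m)

# Toy Karhunen-Loeve model: two orthonormal eigenfunctions plus white noise.
phi1 = np.sqrt(2) * np.sin(2 * np.pi * t)
phi2 = np.sqrt(2) * np.cos(2 * np.pi * t)
scores = rng.normal(size=(n, 2)) * np.sqrt([4.0, 1.0])    # eigenvalues 4 and 1
X = scores[:, [0]] * phi1 + scores[:, [1]] * phi2 + 0.2 * rng.normal(size=(n, m))

# Estimate the covariance surface and eigendecompose it; the grid spacing dt
# converts matrix eigenvalues/eigenvectors into functional ones.
C = np.cov(X, rowvar=False)
evals, evecs = np.linalg.eigh(C)
dt = t[1] - t[0]
fpca_evals = evals[::-1] * dt                 # largest first, scale of integral
fpca_efuns = evecs[:, ::-1] / np.sqrt(dt)     # approximately unit L2 norm
```

In the sparse longitudinal setting studied in the paper, the covariance surface is instead smoothed from pooled pairwise products across subjects before the eigendecomposition; the dense sketch above is the limiting case where each curve is observed on a full grid.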

    Automatic Debiased Machine Learning of Causal and Structural Effects

    Many causal and structural effects depend on regressions. Examples include average treatment effects, policy effects, average derivatives, regression decompositions, economic average equivalent variation, and parameters of economic structural models. The regressions may be high dimensional. Plugging machine learners into identifying equations can lead to poor inference due to bias and/or model selection. This paper gives automatic debiasing for estimating equations and valid asymptotic inference for the estimators of effects of interest. The debiasing is automatic in that its construction uses the identifying equations without the full form of the bias correction and is performed by machine learning. Novel results include convergence rates for Lasso and Dantzig learners of the bias correction, primitive conditions for asymptotic inference for important examples, and general conditions for GMM. A variety of regression learners and identifying equations are covered. Automatic debiased machine learning (Auto-DML) is applied to estimating the average treatment effect on the treated for the NSW job training data and to estimating demand elasticities from Nielsen scanner data while allowing preferences to be correlated with prices and income.
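For the average treatment effect, the debiased score Auto-DML generalizes reduces to the familiar doubly robust (AIPW) estimator: the plug-in regression difference plus a correction built from the propensity score (the Riesz representer for the ATE). A minimal sketch on simulated data, assuming correctly specified linear and logistic nuisance models (this is the classical special case, not the paper's automatic procedure, which also uses cross-fitting):

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(1)
n = 4000
X = rng.normal(size=(n, 3))
propensity = 1.0 / (1.0 + np.exp(-X[:, 0]))            # true P(D=1 | X)
D = rng.binomial(1, propensity)
Y = 2.0 * D + X @ np.array([1.0, -1.0, 0.5]) + rng.normal(size=n)  # true ATE = 2

# Nuisance estimates (single full-sample fits here, for brevity).
mhat = LogisticRegression().fit(X, D).predict_proba(X)[:, 1]
g1 = LinearRegression().fit(X[D == 1], Y[D == 1]).predict(X)
g0 = LinearRegression().fit(X[D == 0], Y[D == 0]).predict(X)

# Debiased (AIPW) score: plug-in difference plus propensity-weighted
# residual corrections, which remove the first-order regression bias.
psi = g1 - g0 + D * (Y - g1) / mhat - (1 - D) * (Y - g0) / (1 - mhat)
ate_hat = psi.mean()
se = psi.std(ddof=1) / np.sqrt(n)
```

The point of the paper is that the correction term does not have to be derived by hand as above: it is learned automatically from the identifying moment equation, which matters when the representer has no closed form.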

    A Selective Review of Group Selection in High-Dimensional Models

    Grouping structures arise naturally in many statistical modeling problems. Several methods have been proposed for variable selection that respect grouping structure in variables. Examples include the group LASSO and several concave group selection methods. In this article, we give a selective review of group selection concerning methodological developments, theoretical properties and computational algorithms. We pay particular attention to group selection methods involving concave penalties. We address both group selection and bi-level selection methods. We describe several applications of these methods in nonparametric additive models, semiparametric regression, seemingly unrelated regressions, genomic data analysis and genome wide association studies. We also highlight some issues that require further study.

    Published in Statistical Science (http://www.imstat.org/sts/) at http://dx.doi.org/10.1214/12-STS392 by the Institute of Mathematical Statistics (http://www.imstat.org).
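The computational heart of the group LASSO is groupwise soft-thresholding: the proximal operator of the penalty lam * ||beta_g||_2 shrinks each group's coefficient block as a unit and zeroes it out entirely when its norm falls below the threshold. A minimal numpy sketch of that operator (illustrative; a full solver would iterate this step over groups):

```python
import numpy as np

def group_soft_threshold(beta, lam):
    """Prox of lam * ||.||_2 for one coefficient group: shrinks the whole
    block toward zero, and sets it exactly to zero when its Euclidean norm
    is at most lam."""
    norm = np.linalg.norm(beta)
    if norm <= lam:
        return np.zeros_like(beta)
    return (1.0 - lam / norm) * beta

# Groups enter or leave the model as whole blocks, not coefficient by
# coefficient as in the ordinary lasso.
kept = group_soft_threshold(np.array([3.0, 4.0]), 1.0)     # norm 5 > 1: shrunk
dropped = group_soft_threshold(np.array([0.3, 0.4]), 1.0)  # norm 0.5 <= 1: zeroed
```

Concave group penalties reviewed in the article (e.g. group SCAD, group MCP) replace the constant threshold lam with one that tapers off for large norms, reducing the shrinkage bias on strongly selected groups.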

    Impossibility Results for Nondifferentiable Functionals

    We examine challenges to estimation and inference when the objects of interest are nondifferentiable functionals of the underlying data distribution. This situation arises in a number of applications of bounds analysis and moment inequality models, and in recent work on estimating optimal dynamic treatment regimes. Drawing on earlier work relating differentiability to the existence of unbiased and regular estimators, we show that if the target object is not continuously differentiable in the parameters of the data distribution, there exist no locally asymptotically unbiased estimators and no regular estimators. This places strong limits on estimators, bias correction methods, and inference procedures.

    Keywords: bounds analysis; moment inequality models; treatment effects; limits of experiments
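A functional as simple as max(theta_1, theta_2), which is not differentiable at theta_1 = theta_2, already exhibits the phenomenon: at that kink the plug-in estimator max of the sample means is biased upward by an amount of order 1/sqrt(n), so the bias survives root-n scaling. A small Monte Carlo sketch of this standard example (illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps = 100, 20000
true_value = max(0.0, 0.0)     # theta1 = theta2 = 0, the nondifferentiable point

# Plug-in estimator: the maximum of the two sample means.
x1 = rng.normal(0.0, 1.0, size=(reps, n)).mean(axis=1)
x2 = rng.normal(0.0, 1.0, size=(reps, n)).mean(axis=1)
plug_in = np.maximum(x1, x2)

# For iid N(0, sigma^2) means, E[max] = sigma/sqrt(pi); with sigma = 1/sqrt(n)
# that is about 0.056 here, i.e. a bias of order 1/sqrt(n), not smaller.
bias = plug_in.mean() - true_value
```

The impossibility results in the paper show this is not a defect of the particular estimator: at points of nondifferentiability no locally asymptotically unbiased or regular estimator exists, so no bias-correction scheme can remove the problem uniformly.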