
    Population Intervention Models in Causal Inference

    Marginal structural models (MSM) provide a powerful tool for estimating the causal effect of a treatment variable or risk variable on the distribution of a disease in a population. These models, as originally introduced by Robins (e.g., Robins (2000a), Robins (2000b), van der Laan and Robins (2002)), model the marginal distributions of treatment-specific counterfactual outcomes, possibly conditional on a subset of the baseline covariates, and their dependence on treatment. Marginal structural models are particularly useful in the context of longitudinal data structures, in which each subject's treatment and covariate history are measured over time, and an outcome is recorded at a final time point. In addition to the simpler, weighted regression approaches (inverse probability of treatment weighted estimators), more general (and robust) estimators have been developed and studied in detail for standard MSM (Robins (2000b), Neugebauer and van der Laan (2004), Yu and van der Laan (2003), van der Laan and Robins (2002)). In this paper we argue that in many applications one is interested in modeling the difference between a treatment-specific counterfactual population distribution and the actual population distribution of the target population of interest. The relevant parameters describe the effect of a hypothetical intervention on such a population, and therefore we refer to these models as intervention models. We focus on intervention models estimating the effect of an intervention in terms of a difference in means, a ratio of means (e.g., relative risk if the outcome is binary), a so-called switch relative risk for binary outcomes, and a difference in entire distributions as measured by the quantile-quantile function. In addition, we provide a class of inverse probability of treatment weighted estimators, and double robust estimators, of the causal parameters in these models. We illustrate the finite sample performance of these new estimators in a simulation study.
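    As a concrete illustration of the simpler weighted-regression approach the abstract mentions, the sketch below implements a basic inverse probability of treatment weighted (IPTW) point estimator of the difference in counterfactual means for a binary point treatment. The function name and interface are illustrative assumptions, the propensity scores are assumed estimated elsewhere, and the paper's double robust estimators are not shown.

```python
import numpy as np

def iptw_mean_difference(y, a, propensity):
    """IPTW estimate of E[Y(1)] - E[Y(0)], the difference in
    counterfactual (treatment-specific) means.

    y          : observed outcomes
    a          : binary treatment indicator (1 = treated)
    propensity : estimated P(A = 1 | covariates) per subject
                 (illustrative input; estimation not shown)
    """
    y = np.asarray(y, dtype=float)
    a = np.asarray(a, dtype=float)
    p = np.asarray(propensity, dtype=float)
    # Weight each subject by the inverse probability of the
    # treatment arm actually received, then average.
    mu1 = np.mean(a * y / p)              # estimate of E[Y(1)]
    mu0 = np.mean((1 - a) * y / (1 - p))  # estimate of E[Y(0)]
    return mu1 - mu0
```

    With a randomized treatment (propensity 0.5 for everyone), the estimator reduces to the usual difference of group means.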

    Nonlinear shrinkage estimation of large-dimensional covariance matrices

    Many statistical applications require an estimate of a covariance matrix and/or its inverse. When the matrix dimension is large compared to the sample size, which happens frequently, the sample covariance matrix is known to perform poorly and may suffer from ill-conditioning. There already exists an extensive literature concerning improved estimators in such situations. In the absence of further knowledge about the structure of the true covariance matrix, the most successful approach so far, arguably, has been shrinkage estimation. Shrinking the sample covariance matrix to a multiple of the identity, by taking a weighted average of the two, turns out to be equivalent to linearly shrinking the sample eigenvalues to their grand mean, while retaining the sample eigenvectors. Our paper extends this approach by considering nonlinear transformations of the sample eigenvalues. We show how to construct an estimator that is asymptotically equivalent to an oracle estimator suggested in previous work. As demonstrated in extensive Monte Carlo simulations, the resulting bona fide estimator can result in sizeable improvements over the sample covariance matrix and also over linear shrinkage.

    Comment: Published in the Annals of Statistics (http://www.imstat.org/aos/), http://dx.doi.org/10.1214/12-AOS989, by the Institute of Mathematical Statistics (http://www.imstat.org)
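    The linear shrinkage baseline that this paper generalizes can be sketched in a few lines. The function name and the fixed weight `delta` are illustrative; choosing the weight optimally from the data is precisely what the shrinkage literature addresses.

```python
import numpy as np

def linear_shrinkage(S, delta):
    """Shrink the sample covariance S toward a multiple of the identity.

    S     : p x p sample covariance matrix
    delta : shrinkage weight in [0, 1] (0 = no shrinkage,
            1 = full shrinkage to the scaled identity);
            illustrative input, not a data-driven choice

    Returns (1 - delta) * S + delta * m * I, where m = trace(S) / p
    is the grand mean of the sample eigenvalues.  Equivalently, each
    sample eigenvalue l_i becomes (1 - delta) * l_i + delta * m,
    while the sample eigenvectors are retained.
    """
    p = S.shape[0]
    m = np.trace(S) / p  # grand mean of the eigenvalues of S
    return (1.0 - delta) * S + delta * m * np.eye(p)
```

    The abstract's nonlinear extension replaces this affine map of the eigenvalues with a general data-driven transformation.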

    Nonparametric Additive Model-assisted Estimation for Survey Data

    An additive model-assisted nonparametric method is investigated to estimate the finite population totals of massive survey data with the aid of auxiliary information. A class of estimators is proposed to improve the precision of the well-known Horvitz-Thompson estimators by combining spline and local polynomial smoothing methods. These estimators are calibrated, asymptotically design-unbiased, consistent, normal, and robust in the sense of asymptotically attaining the Godambe-Joshi lower bound to the anticipated variance. A consistent model selection procedure is further developed to select the significant auxiliary variables. The proposed method is sufficiently fast to analyze large survey data of high dimension within seconds. The performance of the proposed method is assessed empirically via simulation studies.
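    The Horvitz-Thompson estimator being improved upon, and the generic difference form of a model-assisted estimator, can be sketched as follows. The function names are illustrative, and the spline/local-polynomial fitting that produces the predictions in the paper is omitted; any fitted values can be plugged in.

```python
import numpy as np

def horvitz_thompson_total(y_sample, pi_sample):
    """Horvitz-Thompson estimator of a finite population total:
    the sample sum of y_i / pi_i, where pi_i is the first-order
    inclusion probability of unit i."""
    return float(np.sum(np.asarray(y_sample, dtype=float)
                        / np.asarray(pi_sample, dtype=float)))

def model_assisted_total(y_sample, yhat_sample, yhat_population_sum,
                         pi_sample):
    """Generic model-assisted (difference) estimator: the population
    sum of model predictions plus a Horvitz-Thompson correction for
    the sample residuals.  yhat_* are fitted values from a working
    model (here an assumed external fit, not the paper's additive
    spline/local-polynomial estimator)."""
    resid = (np.asarray(y_sample, dtype=float)
             - np.asarray(yhat_sample, dtype=float))
    return float(yhat_population_sum
                 + np.sum(resid / np.asarray(pi_sample, dtype=float)))
```

    When the working model predicts the sample perfectly, the residual correction vanishes and the estimate equals the population sum of the predictions.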

    Empirical likelihood confidence intervals for complex sampling designs

    We define an empirical likelihood approach which gives consistent design-based confidence intervals that can be calculated without the need for variance estimates, design effects, resampling, joint inclusion probabilities, or linearization, even when the point estimator is not linear. It can be used to construct confidence intervals for a large class of sampling designs and estimators which are solutions of estimating equations. It can be used for means, regression coefficients, quantiles, totals, or counts, even when the population size is unknown. It can be used with large sampling fractions and naturally includes calibration constraints. It can be viewed as an extension of the empirical likelihood approach to complex survey data. This approach is computationally simpler than the pseudoempirical likelihood and bootstrap approaches. Our simulation study shows that the proposed confidence intervals may give better coverage than confidence intervals based on linearization, the bootstrap, and the pseudoempirical likelihood. It also shows that, under complex sampling designs, standard confidence intervals based on normality may have poor coverage, because point estimators may not follow a normal sampling distribution and their variance estimators may be biased.
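    For intuition, here is a minimal sketch of the classical (non-survey) empirical likelihood ratio statistic for a scalar mean; the paper's design-based version for complex sampling designs and calibration constraints is substantially more general. The function name and the Newton solver are illustrative assumptions.

```python
import numpy as np

def el_log_ratio(x, mu, max_iter=50, tol=1e-10):
    """-2 log empirical likelihood ratio for the mean of x at mu.

    Solves sum_i (x_i - mu) / (1 + lam * (x_i - mu)) = 0 for the
    Lagrange multiplier lam by Newton's method, then returns
    2 * sum_i log(1 + lam * (x_i - mu)).  Under regularity
    conditions this statistic is asymptotically chi-squared with
    one degree of freedom, so interval endpoints can be read off
    without any variance estimate.
    """
    x = np.asarray(x, dtype=float)
    if not (x.min() < mu < x.max()):
        return np.inf  # mu outside the convex hull: likelihood is zero
    z = x - mu
    lam = 0.0
    for _ in range(max_iter):
        w = 1.0 + lam * z
        f = np.sum(z / w)
        if abs(f) < tol:
            break
        fp = -np.sum((z / w) ** 2)
        step = f / fp
        # halve the Newton step until all implied weights stay positive
        while np.any(1.0 + (lam - step) * z <= 0):
            step *= 0.5
        lam -= step
    return 2.0 * np.sum(np.log(1.0 + lam * z))
```

    The statistic is zero at the sample mean and grows as the hypothesized mean moves away from it, which is what makes the chi-squared calibration usable directly.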

    Spectrum Estimation: A Unified Framework for Covariance Matrix Estimation and PCA in Large Dimensions

    Covariance matrix estimation and principal component analysis (PCA) are two cornerstones of multivariate analysis. Classic textbook solutions perform poorly when the dimension of the data is of a magnitude similar to the sample size, or even larger. In such settings, there is a common remedy for both statistical problems: nonlinear shrinkage of the eigenvalues of the sample covariance matrix. The optimal nonlinear shrinkage formula depends on unknown population quantities and is thus not available. It is, however, possible to consistently estimate an oracle nonlinear shrinkage, which is motivated on asymptotic grounds. A key tool to this end is consistent estimation of the set of eigenvalues of the population covariance matrix (also known as the spectrum), an interesting and challenging problem in its own right. Extensive Monte Carlo simulations demonstrate that our methods have desirable finite-sample properties and outperform previous proposals.

    Comment: 40 pages, 8 figures, 5 tables. University of Zurich, Department of Economics, Working Paper No. 105. Revised version, July 201
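    The common remedy described above, keeping the sample eigenvectors while transforming the sample eigenvalues, can be sketched generically. The `transform` argument here is a caller-supplied placeholder: estimating the oracle nonlinear transformation from the data (via the spectrum) is the paper's actual contribution and is not implemented in this sketch.

```python
import numpy as np

def eigenvalue_shrinkage(S, transform):
    """Replace the eigenvalues of the symmetric sample covariance S
    by transform(eigenvalues), keeping the sample eigenvectors.

    Linear shrinkage is the special case
        transform(l) = (1 - d) * l + d * mean(l);
    nonlinear shrinkage plugs in a data-driven transformation
    estimated from the spectrum (not shown here).
    """
    vals, vecs = np.linalg.eigh(S)  # eigh: for symmetric matrices
    new_vals = np.asarray(transform(vals), dtype=float)
    # Reassemble with the original eigenvectors and new eigenvalues.
    return vecs @ np.diag(new_vals) @ vecs.T
```

    This separation makes the design choice explicit: every estimator in this family differs only in the scalar map applied to the spectrum, never in the eigenvectors.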