Estimating propensity scores with missing covariate data using general location mixture models

Author: Mitra Robin
Reiter Jerome P.
Publication venue: Southampton Statistical Sciences Research Institute
Publication date: 04/08/2009

In many observational studies, researchers estimate causal effects using propensity scores, e.g., by matching or sub-classifying on the scores. Estimation of propensity scores is complicated when some values of the covariates are missing. We propose to use multiple imputation to create completed datasets, from which propensity scores can be estimated, with a general location mixture model. The model assumes that the control units are a latent mixture of (i) units whose covariates are drawn from the same distributions as the treated units’ covariates and (ii) units whose covariates are drawn from different distributions. This formulation reduces the influence of control units outside the treated units’ region of the covariate space on the estimation of parameters in the imputation model, which can result in more plausible imputations and better balance in the true covariate distributions. We illustrate the benefits of the latent class modeling approach with simulations and with an observational study of the effect of breast feeding on children’s cognitive abilities.

Southampton (e-Prints Soton)

Estimating propensity scores with missing covariate data using general location mixture models

Author: Mitra Robin
Reiter Jerome P.
Publication venue: Southampton Statistical Sciences Research Institute
Publication date: 01/08/2009

Southampton (e-Prints Soton)

Lancaster E-Prints

Multiple imputation for sharing precise geographies in public use data

Author: Reiter Jerome P.
Wang Hao
Publication venue: Institute of Mathematical Statistics
Publication date: 19/03/2012

When releasing data to the public, data stewards are ethically and often legally obligated to protect the confidentiality of data subjects' identities and sensitive attributes. They also strive to release data that are informative for a wide range of secondary analyses. Achieving both objectives is particularly challenging when data stewards seek to release highly resolved geographical information. We present an approach for protecting the confidentiality of data with geographic identifiers based on multiple imputation. The basic idea is to convert geography to latitude and longitude, estimate a bivariate response model conditional on attributes, and simulate new latitude and longitude values from these models. We illustrate the proposed methods using data describing causes of death in Durham, North Carolina. In the context of the application, we present a straightforward tool for generating simulated geographies and attributes based on regression trees, and we present methods for assessing disclosure risks with such simulated data. Comment: Published at http://dx.doi.org/10.1214/11-AOAS506 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org

arXiv.org e-Print Archive

Crossref