
10,416 research outputs found

Multiple imputation for sharing precise geographies in public use data

Author: Reiter Jerome P.
Wang Hao
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 19/03/2012

When releasing data to the public, data stewards are ethically and often legally obligated to protect the confidentiality of data subjects' identities and sensitive attributes. They also strive to release data that are informative for a wide range of secondary analyses. Achieving both objectives is particularly challenging when data stewards seek to release highly resolved geographical information. We present an approach, based on multiple imputation, for protecting the confidentiality of data with geographic identifiers. The basic idea is to convert geography to latitude and longitude, estimate a bivariate response model conditional on attributes, and simulate new latitude and longitude values from this model. We illustrate the proposed methods using data describing causes of death in Durham, North Carolina. In the context of the application, we present a straightforward tool for generating simulated geographies and attributes based on regression trees, and we present methods for assessing disclosure risks with such simulated data. Comment: Published at http://dx.doi.org/10.1214/11-AOAS506 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org
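
The three steps named in the abstract above (geography as latitude and longitude, a bivariate model conditional on attributes, simulated replacement coordinates) can be illustrated with a minimal sketch. It is not the authors' method: the paper uses regression trees, whereas this sketch assumes a plain bivariate normal linear regression, holds the fitted parameters fixed instead of drawing them from a posterior, and runs on made-up data. The function name `synthesize_locations` and all demo values are hypothetical.

```python
import numpy as np


def synthesize_locations(attrs, latlon, m=5, seed=0):
    """Return m synthetic (lat, lon) arrays drawn from a fitted linear model.

    Assumption: (lat, lon) given attributes is bivariate normal with a mean
    that is linear in the attributes. Parameters are fixed at their point
    estimates, which is a simplification of proper multiple imputation.
    """
    rng = np.random.default_rng(seed)
    n = attrs.shape[0]
    X = np.column_stack([np.ones(n), attrs])         # design matrix with intercept
    B, *_ = np.linalg.lstsq(X, latlon, rcond=None)   # coefficients, shape (p + 1, 2)
    resid = latlon - X @ B
    dof = max(n - X.shape[1], 1)
    sigma = resid.T @ resid / dof                    # 2 x 2 residual covariance
    mean = X @ B
    # Each synthetic copy replaces every record's coordinates with a new draw.
    return [
        mean + rng.multivariate_normal(np.zeros(2), sigma, size=n)
        for _ in range(m)
    ]


# Made-up demo data: 200 records, 2 numeric attributes, coordinates scattered
# around an arbitrary centre point.
demo_rng = np.random.default_rng(1)
attrs = demo_rng.normal(size=(200, 2))
effect = np.array([[1.0, 0.5], [0.2, -1.0]])
latlon = (
    np.array([36.0, -78.9])
    + 0.01 * attrs @ effect
    + 0.005 * demo_rng.normal(size=(200, 2))
)

synthetic = synthesize_locations(attrs, latlon, m=3)
```

Each element of `synthetic` has the same shape as `latlon` and would be released in place of the true coordinates; disclosure risk would still need to be assessed separately, as the abstract notes.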

arXiv.org e-Print Archive

Crossref

Releasing multiply-imputed synthetic data generated in two stages to protect confidentiality

Author: Drechsler Jörg
Reiter Jerome P.

"To protect the confidentiality of survey respondents' identities and sensitive attributes, statistical agencies can release data in which confidential values are replaced with multiple imputations. These are called synthetic data. We propose a two-stage approach to generating synthetic data that enables agencies to release different numbers of imputations for different variables. Generation in two stages can reduce computational burdens, decrease disclosure risk, and increase inferential accuracy relative to generation in one stage. We present methods for obtaining inferences from such data. We describe the application of two-stage synthesis to creating a public use file for a German business database." (Author's abstract, IAB-Doku) Keywords: IAB-Betriebspanel (IAB Establishment Panel), data preparation, data anonymization, data protection, applied statistics, statistical method, labor market research, imputation methods

Research Papers in Economics

A Survey of Irradiated Pillars, Globules, and Jets in the Carina Nebula

Author: Bally J.
Hartigan P.
Reiter M.
Smith N.
Publication venue: 'IOP Publishing'
Publication date: 01/01/2015

We present wide-field, deep narrowband H$_2$, Br$\gamma$, H$\alpha$, [S II], [O III], and broadband I and K-band images of the Carina star formation region. The new images provide a large-scale overview of all the H$_2$ and Br$\gamma$ emission present in over a square degree centered on this signature star forming complex. By comparing these images with archival HST and Spitzer images we observe how intense UV radiation from O and B stars affects star formation in molecular clouds. We use the images to locate new candidate outflows and identify the principal shock waves and irradiated interfaces within dozens of distinct areas of star-forming activity. Shocked molecular gas in jets traces the parts of the flow that are most shielded from the intense UV radiation. Combining the H$_2$ and optical images gives a more complete view of the jets, which are sometimes only visible in H$_2$. The Carina region hosts several compact young clusters, and the gas within these clusters is affected by radiation from both the cluster stars and the massive stars nearby. The Carina Nebula is ideal for studying the physics of young H II regions and PDRs, as it contains multiple examples of walls and irradiated pillars at various stages of development. Some of the pillars have detached from their host molecular clouds to form proplyds. Fluorescent H$_2$ outlines the interfaces between the ionized and molecular gas, and after removing continuum, we detect spatial offsets between the Br$\gamma$ and H$_2$ emission along the irradiated interfaces. These spatial offsets can be used to test current models of PDRs once synthetic maps of these lines become available. Comment: Accepted in the Astronomical Journal

arXiv.org e-Print Archive

Crossref

Rice University Research Repository

Estimating propensity scores with missing covariate data using general location mixture models

Author: Mitra Robin
Reiter Jerome P.
Publication venue: Southampton Statistical Sciences Research Institute
Publication date: 04/08/2009

In many observational studies, researchers estimate causal effects using propensity scores, e.g., by matching or sub-classifying on the scores. Estimation of propensity scores is complicated when some values of the covariates are missing. We propose to use multiple imputation to create completed datasets, from which propensity scores can be estimated, with a general location mixture model. The model assumes that the control units are a latent mixture of (i) units whose covariates are drawn from the same distributions as the treated units' covariates and (ii) units whose covariates are drawn from different distributions. This formulation reduces the influence of control units outside the treated units' region of the covariate space on the estimation of parameters in the imputation model, which can result in more plausible imputations and better balance in the true covariate distributions. We illustrate the benefits of the latent class modeling approach with simulations and with an observational study of the effect of breast feeding on children's cognitive abilities.

Southampton (e-Prints Soton)

Estimating propensity scores with missing covariate data using general location mixture models

Author: Mitra Robin
Reiter Jerome P.
Publication venue: Southampton Statistical Sciences Research Institute
Publication date: 01/08/2009

Southampton (e-Prints Soton)

Lancaster E-Prints