Relaxation Penalties and Priors for Plausible Modeling of Nonidentified Bias Sources
In designed experiments and surveys, known laws or design features provide
checks on the most relevant aspects of a model and identify the target
parameters. In contrast, in most observational studies in the health and social
sciences, the primary study data do not identify and may not even bound target
parameters. Discrepancies between target and analogous identified parameters
(biases) are then of paramount concern, which forces a major shift in modeling
strategies. Conventional approaches are based on conditional testing of
equality constraints, which correspond to implausible point-mass priors. When
these constraints are not identified by available data, however, no such
testing is possible. In response, implausible constraints can be relaxed into
penalty functions derived from plausible prior distributions. The resulting
models can be fit within familiar full or partial likelihood frameworks. The
absence of identification renders all analyses part of a sensitivity analysis.
In this view, results from single models are merely examples of what might be
plausibly inferred. Nonetheless, just one plausible inference may suffice to
demonstrate inherent limitations of the data. Points are illustrated with
misclassified data from a study of sudden infant death syndrome. Extensions to
confounding, selection bias and more complex data structures are outlined.
Comment: Published at http://dx.doi.org/10.1214/09-STS291 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org).
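To make the relaxation step concrete, here is a minimal sketch of the kind of penalized log-likelihood the abstract describes; the notation (a target parameter beta and a nonidentified bias parameter eta with prior pi) is illustrative rather than taken from the paper:
    \ell_p(\beta, \eta) \;=\; \ell(\beta, \eta \mid \text{data}) \;+\; \log \pi(\eta)
Maximizing \ell_p replaces the conventional point-mass constraint \eta = 0 (e.g., "no misclassification") with the penalty -\log \pi(\eta), so the fit can be carried out in an ordinary full or partial likelihood framework while the conclusions remain conditional on the plausibility of \pi.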
Assessing the disclosure protection provided by misclassification for survey microdata
Government statistical agencies often apply statistical disclosure limitation techniques to survey microdata to protect confidentiality. There is a need for ways to assess the protection provided. This paper develops some simple methods for disclosure limitation techniques which perturb the values of categorical identifying variables. The methods are applied in numerical experiments based upon census data from the United Kingdom which are subject to two perturbation techniques: data swapping and the post randomisation method. Some simplifying approximations to the measure of risk are found to work well in capturing the impacts of these techniques. These approximations provide simple extensions of existing risk assessment methods based upon Poisson log-linear models. A numerical experiment is also undertaken to assess the impact of multivariate misclassification with an increasing number of identifying variables. The methods developed in this paper may also be used to obtain more realistic assessments of risk which take account of the kinds of measurement and other non-sampling errors commonly arising in surveys.
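For readers unfamiliar with the second perturbation technique, here is a minimal R sketch of the post randomisation method (PRAM); the function name pram and the example transition matrix are hypothetical and not taken from the paper:
    # PRAM sketch: each record's category c is replaced by a draw from row c of a
    # transition matrix P, so the original value is retained with probability P[c, c].
    pram <- function(x, P) {
      levs <- sort(unique(x))
      stopifnot(nrow(P) == length(levs), all(abs(rowSums(P) - 1) < 1e-8))
      idx <- match(x, levs)
      sapply(idx, function(i) sample(levs, 1, prob = P[i, ]))
    }

    # Example: a 3-category identifying variable retained with probability 0.9
    set.seed(1)
    x <- sample(1:3, size = 100, replace = TRUE)
    P <- matrix(0.05, nrow = 3, ncol = 3); diag(P) <- 0.9
    table(original = x, perturbed = pram(x, P))
The risk-assessment question in the paper is then how much such a misclassification matrix reduces the probability that an apparently unique record can be correctly matched to a population unit.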
A New Method for Protecting Interrelated Time Series with Bayesian Prior Distributions and Synthetic Data
Organizations disseminate statistical summaries of administrative data via the Web for unrestricted public use. They balance the trade-off between confidentiality protection and inference quality. Recent developments in disclosure avoidance techniques include the incorporation of synthetic data, which capture the essential features of underlying data by releasing altered data generated from a posterior predictive distribution. The United States Census Bureau collects millions of interrelated time series micro-data that are hierarchical and contain many zeros and suppressions. Rule-based disclosure avoidance techniques often require the suppression of count data for small magnitudes and the modification of data based on a small number of entities. Motivated by this problem, we use zero-inflated extensions of Bayesian Generalized Linear Mixed Models (BGLMM) with privacy-preserving prior distributions to develop methods for protecting and releasing synthetic data from time series about thousands of small groups of entities without suppression based on the magnitudes or number of entities. We find that as the prior distributions of the variance components in the BGLMM become more precise toward zero, confidentiality protection increases and inference quality deteriorates. We evaluate our methodology using a strict privacy measure, empirical differential privacy, and a newly defined risk measure, Probability of Range Identification (PoRI), which directly measures attribute disclosure risk. We illustrate our results with the U.S. Census Bureau's Quarterly Workforce Indicators.
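A hedged sketch of the zero-inflated count structure such a model might take; the exact linear predictor, link and priors used in the paper may differ:
    y_{it} \sim \pi_{it}\,\delta_{\{0\}} + (1 - \pi_{it})\,\mathrm{Poisson}(\mu_{it}),
    \qquad \log \mu_{it} = x_{it}^{\top}\beta + u_i, \qquad u_i \sim N(0, \sigma_u^2)
Synthetic values are draws from the posterior predictive distribution of y_{it}; placing a prior on the variance component \sigma_u^2 that is concentrated near zero shrinks the group-level effects u_i, which is the mechanism behind the trade-off reported in the abstract (more shrinkage, more protection, weaker inference).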
Using prior information to identify boundaries in disease risk maps
Disease maps display the spatial pattern in disease risk, so that high-risk
clusters can be identified. The spatial structure in the risk map is typically
represented by a set of random effects, which are modelled with a conditional
autoregressive (CAR) prior. Such priors include a global spatial smoothing
parameter, whereas real risk surfaces are likely to include areas of smooth
evolution as well as discontinuities, the latter of which are known as risk
boundaries. Therefore, this paper proposes an extension to the class of CAR
priors, which can identify both areas of localised spatial smoothness and risk
boundaries. However, allowing for this localised smoothing requires large
numbers of correlation parameters to be estimated, which are unlikely to be
well identified from the data. To address this problem we propose eliciting an
informative prior about the locations of such boundaries, which can be combined
with the information from the data to provide more precise posterior inference.
We test our approach by simulation, before applying it to a study of the risk
of emergency admission to hospital in Greater Glasgow, Scotland.
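For orientation, the intrinsic CAR prior that the proposed class extends is usually written as follows (generic notation, not the paper's):
    \phi_i \mid \phi_{-i} \sim N\!\left( \frac{\sum_j w_{ij}\,\phi_j}{\sum_j w_{ij}},\; \frac{\tau^2}{\sum_j w_{ij}} \right)
with w_{ij} = 1 if areas i and j share a border and 0 otherwise. One way to obtain localised smoothing, consistent with the abstract's description, is to treat the w_{ij} for neighbouring pairs as unknown correlation parameters, so that an estimated w_{ij} = 0 marks a risk boundary; the elicited informative prior then concerns which of these adjacency weights are likely to be zero.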
A comparative study of parametric mortality projection models
The relative merits of different parametric models for making life expectancy and annuity value predictions at both pensioner and adult ages are investigated. This study builds on current published research and considers recent model enhancements and the extent to which these enhancements address the deficiencies that have been identified in some of the models. The England & Wales male mortality experience is used to conduct detailed comparisons at pensioner ages, having first established a common basis for comparison across all models. The model comparison is then extended to include the England & Wales female experience and both the male and female USA mortality experiences over a wider age range, encompassing also the working ages.
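The two quantities being predicted follow directly from projected one-year survival probabilities; a standard actuarial formulation (not specific to any one of the models compared), with v the discount factor, is:
    {}_{t}p_x = \prod_{s=0}^{t-1} p_{x+s}, \qquad
    \mathring{e}_x \approx \tfrac{1}{2} + \sum_{t \ge 1} {}_{t}p_x, \qquad
    a_x = \sum_{t \ge 1} v^{t}\, {}_{t}p_x
Here each p_{x+s} is taken along the relevant projected cohort (e.g., p_{x,t} \approx \exp(-m_{x,t}) from projected central mortality rates); the models under comparison differ in how the projection of m_{x,t} feeding these formulas is produced.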
Sharp sensitivity bounds for mediation under unmeasured mediator-outcome confounding
It is often of interest to decompose a total effect of an exposure into the
component that acts on the outcome through some mediator and the component that
acts independently through other pathways. Said another way, we are interested
in the direct and indirect effects of the exposure on the outcome. Even if the
exposure is randomly assigned, it is often infeasible to randomize the
mediator, leaving the mediator-outcome confounding not fully controlled. We
develop a sensitivity analysis technique that can bound the direct and indirect
effects without parametric assumptions about the unmeasured mediator-outcome
confounding.
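In potential-outcome notation the decomposition in question is the usual one (the paper's exact definitions and the form of its bounds are not reproduced here):
    E[Y(1) - Y(0)]
      = \underbrace{E[\,Y(1, M(1)) - Y(1, M(0))\,]}_{\text{indirect effect}}
      + \underbrace{E[\,Y(1, M(0)) - Y(0, M(0))\,]}_{\text{direct effect}}
Identifying the two components separately requires, beyond randomisation of the exposure, that there be no unmeasured mediator-outcome confounding; the sensitivity bounds in the abstract quantify how far the components can move when that assumption is relaxed.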
Structural Nested Models and G-estimation: The Partially Realized Promise
Structural nested models (SNMs) and the associated method of G-estimation
were first proposed by James Robins over two decades ago as approaches to
modeling and estimating the joint effects of a sequence of treatments or
exposures. The models and estimation methods have since been extended to
deal with a broader range of problems, and have considerable advantages
over the other methods developed for estimating such joint effects. Despite
these advantages, the application of these methods in applied research has been
relatively infrequent; we view this as unfortunate. To remedy this, we provide
an overview of the models and estimation methods as developed, primarily by
Robins, over the years. We provide insight into their advantages over other
methods, and consider some possible reasons for failure of the methods to be
more broadly adopted, as well as possible remedies. Finally, we consider
several extensions of the standard models and estimation methods.
Comment: Published at http://dx.doi.org/10.1214/14-STS493 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org).
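To fix ideas, a single-time-point structural nested mean model and its g-estimation step can be sketched as follows (generic notation with a linear blip and parameter psi; not the paper's own development):
    E[\,Y(a) - Y(0) \mid L, A = a\,] = \gamma(L, a; \psi), \qquad
    \text{e.g. } \gamma(L, a; \psi) = \psi\, a
G-estimation forms H(\psi) = Y - \gamma(L, A; \psi), the outcome with the modelled treatment effect removed, and solves for the \psi at which H(\psi) is unassociated with treatment given covariates, for instance \sum_i \{A_i - E[A_i \mid L_i]\}\, H_i(\psi) = 0; the sequential-treatment versions discussed in the overview apply the same idea at each time point.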
llc: a collection of R functions for fitting a class of Lee-Carter mortality models using iterative fitting algorithms
We implement a specialised iterative regression methodology in R for the analysis of age-period mortality data based on a class of generalised Lee-Carter (LC) type modelling structures. The LC-based modelling framework is viewed in the current literature as among the most efficient and transparent methods of modelling and projecting mortality improvements. Thus, we make use of the modelling approach discussed in Renshaw and Haberman (2006), which extends the basic LC model and uses a tailored iterative process to generate parameter estimates based on a Poisson likelihood. Furthermore, building on this methodology we develop and implement a stratified LC model for the measurement of the additive effect on the log scale of an explanatory factor (other than age and time). This modelling methodology is implemented in a publicly available collection of programming functions that facilitate both the preparation of mortality data and the fitting and analysis of the given log-linear modelling structures. Also, the package incorporates methods to produce forecasts of future mortality rates and to compute the corresponding future life expectancy.
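As a simplified illustration of the kind of iterative Poisson fitting the package implements, here is a sketch in R of univariate Newton updates of each parameter set in turn, in the spirit of Renshaw and Haberman (2006); this is not code from the llc package, and the function name is hypothetical:
    # D, E: age-by-year matrices of observed deaths and central exposures.
    # Model: D[x, t] ~ Poisson(E[x, t] * exp(a[x] + b[x] * k[t])).
    fit_lc_poisson <- function(D, E, tol = 1e-8, maxit = 10000) {
      nx <- nrow(D); nt <- ncol(D)
      a <- log(rowSums(D) / rowSums(E))                      # average log rate by age
      k <- log(colSums(D) / colSums(E)); k <- k - mean(k)    # centred period trend
      b <- rep(1 / nx, nx)
      dev_old <- Inf
      for (it in seq_len(maxit)) {
        Dhat <- E * exp(a + outer(b, k))
        a <- a + rowSums(D - Dhat) / rowSums(Dhat)           # Newton step for a_x
        Dhat <- E * exp(a + outer(b, k))
        k <- k + as.vector(t(D - Dhat) %*% b) / as.vector(t(Dhat) %*% (b^2))  # k_t
        Dhat <- E * exp(a + outer(b, k))
        b <- b + as.vector((D - Dhat) %*% k) / as.vector(Dhat %*% (k^2))      # b_x
        Dhat <- E * exp(a + outer(b, k))
        dev <- 2 * sum(ifelse(D > 0, D * log(D / Dhat), 0) - (D - Dhat))      # deviance
        if (abs(dev_old - dev) < tol) break
        dev_old <- dev
      }
      a <- a + b * mean(k); k <- k - mean(k)                 # constraint: sum(k) = 0
      k <- k * sum(b);      b <- b / sum(b)                  # constraint: sum(b) = 1
      list(a = a, b = b, k = k, deviance = dev, iterations = it)
    }
Forecasting then proceeds, as in the abstract, by projecting the estimated period index k_t forward and recomputing mortality rates and life expectancies from the projected values.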
- …