Bayesian Estimation Under Informative Sampling
Bayesian analysis is increasingly popular for use in social science and other
application areas where the data are observations from an informative sample.
An informative sampling design leads to inclusion probabilities that are
correlated with the response variable of interest. Model inference performed on
the observed sample taken from the population will be biased for the population
generative model under informative sampling since the balance of information in
the sample data is different from that for the population. Typical approaches
to account for an informative sampling design under Bayesian estimation are
often difficult to implement because they require re-parameterization of the
hypothesized generating model, or focus on design, rather than model-based,
inference. We propose to construct a pseudo-posterior distribution that
utilizes sampling weights based on the marginal inclusion probabilities to
exponentiate the likelihood contribution of each sampled unit, which weights
the information in the sample back to the population. Our approach provides a
nearly automated estimation procedure applicable to any model specified by the
data analyst for the population and retains the population model
parameterization and posterior sampling geometry. We construct conditions on
known marginal and pairwise inclusion probabilities that define a class of
sampling designs where consistency of the pseudo-posterior is
guaranteed. We demonstrate our method on an application concerning the Bureau
of Labor Statistics Job Openings and Labor Turnover Survey.
Comment: 24 pages, 3 figures
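As a minimal sketch of the weighting idea described in this abstract (not the authors' implementation), the Python below raises each sampled unit's likelihood contribution to the power of its sampling weight, which on the log scale multiplies that contribution by the weight. The Normal model with known standard deviation, the flat prior on the mean, the rescaling of weights to sum to the sample size, and all function names and toy numbers are assumptions made for illustration.

```python
import math


def normalized_weights(inclusion_probs):
    """Inverse inclusion-probability weights, rescaled to sum to the sample size (assumed convention)."""
    raw = [1.0 / p for p in inclusion_probs]
    n = len(raw)
    total = sum(raw)
    return [n * w / total for w in raw]


def pseudo_log_likelihood(y, weights, mu, sigma=1.0):
    """Weighted sum of Normal(mu, sigma) log-likelihood contributions.

    Exponentiating a unit's likelihood by w is the same as multiplying
    its log-likelihood by w.
    """
    const = -0.5 * math.log(2.0 * math.pi * sigma ** 2)
    return sum(
        w * (const - 0.5 * ((yi - mu) / sigma) ** 2)
        for yi, w in zip(y, weights)
    )


def pseudo_posterior_mode(y, weights):
    """With a flat prior on mu, the pseudo-posterior mode is the weighted mean."""
    return sum(w * yi for yi, w in zip(y, weights)) / sum(weights)


# Toy informative sample: the large response was far more likely to be selected.
y = [1.0, 2.0, 6.0]
pi = [0.1, 0.2, 0.8]  # hypothetical marginal inclusion probabilities
w = normalized_weights(pi)
mode = pseudo_posterior_mode(y, w)
unweighted_mean = sum(y) / len(y)
```

In this toy sample the weighted estimate (about 1.69) sits below the unweighted mean of 3.0, because the over-sampled large response is down-weighted.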
Missing.... presumed at random: cost-analysis of incomplete data
When collecting patient-level resource use data for statistical analysis, for some patients and in some categories of resource use, the required count will not be observed. Although this problem must arise in most reported economic evaluations containing patient-level data, it is rare for authors to detail how the problem was overcome. Statistical packages may default to handling missing data through a so-called complete case analysis, while some recent cost-analyses have appeared to favour an available case approach. Both of these methods are problematic: complete case analysis is inefficient and is likely to be biased; available case analysis, by employing different numbers of observations for each resource use item, generates severe problems for standard statistical inference. Instead, we explore imputation methods for generating replacement values for missing data that will permit complete case analysis using the whole data set, and we illustrate these methods using two data sets that had incomplete resource use information.
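The difference between the two default approaches named in this abstract can be illustrated with a small Python sketch. The records, the resource-use item names, and both helper functions are hypothetical and chosen only to show how the number of observations differs between the methods; the sketch does not reproduce the paper's data or its imputation methods.

```python
# Hypothetical patient-level resource use counts; None marks a missing count.
records = [
    {"gp_visits": 2, "inpatient_days": 0, "prescriptions": 5},
    {"gp_visits": None, "inpatient_days": 3, "prescriptions": 1},
    {"gp_visits": 4, "inpatient_days": None, "prescriptions": 2},
    {"gp_visits": 1, "inpatient_days": 1, "prescriptions": None},
    {"gp_visits": 3, "inpatient_days": 2, "prescriptions": 4},
]
items = ["gp_visits", "inpatient_days", "prescriptions"]


def complete_case_means(records, items):
    """Use only patients with every item observed; returns (means, n)."""
    complete = [r for r in records if all(r[k] is not None for k in items)]
    means = {k: sum(r[k] for r in complete) / len(complete) for k in items}
    return means, len(complete)


def available_case_means(records, items):
    """Use every observed value per item; returns {item: (mean, n_item)}."""
    result = {}
    for k in items:
        observed = [r[k] for r in records if r[k] is not None]
        result[k] = (sum(observed) / len(observed), len(observed))
    return result


cc_means, cc_n = complete_case_means(records, items)
ac_means = available_case_means(records, items)
```

Here the complete case analysis keeps only two of the five patients, while the available case analysis uses four observations per item, each drawn from a different subset of patients.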
Semiparametric Regression Analysis of Panel Count Data: A Practical Review
Peer Reviewed
https://deepblue.lib.umich.edu/bitstream/2027.42/149207/1/insr12271_am.pdf
https://deepblue.lib.umich.edu/bitstream/2027.42/149207/2/insr12271.pd
Fixed Effect Estimation of Large T Panel Data Models
This article reviews recent advances in fixed effect estimation of panel data
models for long panels, where the number of time periods is relatively large.
We focus on semiparametric models with unobserved individual and time effects,
where the distribution of the outcome variable conditional on covariates and
unobserved effects is specified parametrically, while the distribution of the
unobserved effects is left unrestricted. Compared to existing reviews on long
panels (Arellano and Hahn 2007; a section in Arellano and Bonhomme 2011) we
discuss models with both individual and time effects, split-panel Jackknife
bias corrections, unbalanced panels, distribution and quantile effects, and
other extensions. Understanding and correcting the incidental parameter bias
caused by the estimation of many fixed effects is our main focus, and the
unifying theme is that the order of this bias is given by the simple formula
p/n for all models discussed, with p the number of estimated parameters and n
the total sample size.
Comment: 40 pages, 1 table
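To make the incidental parameter bias and the split-panel jackknife mentioned in this abstract concrete, here is a minimal Python sketch under stated assumptions. It uses the textbook example of estimating an error variance with one fixed effect per unit, where p/n is roughly 1/T, together with the half-panel form of the jackknife (twice the full-panel estimate minus the average of the two half-panel estimates). The balanced panel with even T, the simulated data, and the function names are all illustrative assumptions, not code from the article.

```python
import random


def variance_mle(panel):
    """MLE of the error variance when each unit has its own fixed effect (mean).

    With T periods per unit its expectation is sigma^2 * (T - 1) / T,
    so the bias is of order 1/T.
    """
    total, count = 0.0, 0
    for row in panel:
        mean_i = sum(row) / len(row)
        total += sum((v - mean_i) ** 2 for v in row)
        count += len(row)
    return total / count


def split_panel_jackknife(panel, estimator):
    """Half-panel jackknife: 2 * full estimate - mean of the two half-panel estimates.

    Assumes a balanced panel with an even number of periods.
    """
    half = len(panel[0]) // 2
    first = [row[:half] for row in panel]
    second = [row[half:] for row in panel]
    return 2.0 * estimator(panel) - 0.5 * (estimator(first) + estimator(second))


# Simulated panel: N units, T periods, true error variance 1.0.
random.seed(0)
N, T = 2000, 4
panel = []
for _ in range(N):
    alpha_i = random.gauss(0.0, 1.0)  # unit-specific fixed effect
    panel.append([alpha_i + random.gauss(0.0, 1.0) for _ in range(T)])

naive = variance_mle(panel)
corrected = split_panel_jackknife(panel, variance_mle)
```

In expectation the uncorrected estimate equals 0.75 when T = 4, and the half-panel combination removes the 1/T term, so `corrected` should lie much closer to the true value of 1.0 than `naive`.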
Multivariate space-time modelling of multiple air pollutants and their health effects accounting for exposure uncertainty
The long-term health effects of air pollution are often estimated using a spatio-temporal ecological areal unit study, but this design leads to the following statistical challenges: (1) how to estimate spatially representative pollution concentrations for each areal unit; (2) how to allow for the uncertainty in these estimated concentrations when estimating their health effects; and (3) how to simultaneously estimate the joint effects of multiple correlated pollutants. This article proposes a novel 2-stage Bayesian hierarchical model for addressing these 3 challenges, with inference based on Markov chain Monte Carlo simulation. The first stage is a multivariate spatio-temporal fusion model for predicting areal level average concentrations of multiple pollutants from both monitored and modelled pollution data. The second stage is a spatio-temporal model for estimating the health impact of multiple correlated pollutants simultaneously, which accounts for the uncertainty in the estimated pollution concentrations. The novel methodology is motivated by a new study of the impact of both particulate matter and nitrogen dioxide concentrations on respiratory hospital admissions in Scotland between 2007 and 2011, and the results suggest that both pollutants exhibit substantial and independent health effects.
Time Series of Count Data: Modelling and Estimation
This paper compares various models for time series of counts which can account for discreteness, overdispersion and serial correlation. Besides observation- and parameter-driven models based upon corresponding conditional Poisson distributions, we also consider a dynamic ordered probit model as a flexible specification to capture the salient features of time series of counts. For all models, we present appropriate efficient estimation procedures. For parameter-driven specifications this requires Monte Carlo procedures like simulated maximum likelihood or Markov chain Monte Carlo. The methods, including corresponding diagnostic tests, are illustrated with data on daily admissions for asthma to a single hospital.
Keywords: Efficient Importance Sampling, GLARMA, Markov Chain Monte-Carlo, Observation-driven model, Parameter-driven model, Ordered Probit