17,787 research outputs found

    Bayesian Estimation Under Informative Sampling

    Full text link
    Bayesian analysis is increasingly popular for use in social science and other application areas where the data are observations from an informative sample. An informative sampling design leads to inclusion probabilities that are correlated with the response variable of interest. Model inference performed on the observed sample taken from the population will be biased for the population generative model under informative sampling since the balance of information in the sample data is different from that for the population. Typical approaches to account for an informative sampling design under Bayesian estimation are often difficult to implement because they require re-parameterization of the hypothesized generating model, or focus on design, rather than model-based, inference. We propose to construct a pseudo-posterior distribution that utilizes sampling weights based on the marginal inclusion probabilities to exponentiate the likelihood contribution of each sampled unit, which weights the information in the sample back to the population. Our approach provides a nearly automated estimation procedure applicable to any model specified by the data analyst for the population and retains the population model parameterization and posterior sampling geometry. We construct conditions on known marginal and pairwise inclusion probabilities that define a class of sampling designs where L1L_{1} consistency of the pseudo posterior is guaranteed. We demonstrate our method on an application concerning the Bureau of Labor Statistics Job Openings and Labor Turnover Survey.Comment: 24 pages, 3 figure

    Missing.... presumed at random: cost-analysis of incomplete data

    Get PDF
    When collecting patient-level resource use data for statistical analysis, for some patients and in some categories of resource use, the required count will not be observed. Although this problem must arise in most reported economic evaluations containing patient-level data, it is rare for authors to detail how the problem was overcome. Statistical packages may default to handling missing data through a so-called complete case analysis, while some recent cost-analyses have appeared to favour an available case approach. Both of these methods are problematic: complete case analysis is inefficient and is likely to be biased; available case analysis, by employing different numbers of observations for each resource use item, generates severe problems for standard statistical inference. Instead we explore imputation methods for generating replacement values for missing data that will permit complete case analysis using the whole data set and we illustrate these methods using two data sets that had incomplete resource use information

    Semiparametric Regression Analysis of Panel Count Data: A Practical Review

    Full text link
    Peer Reviewedhttps://deepblue.lib.umich.edu/bitstream/2027.42/149207/1/insr12271_am.pdfhttps://deepblue.lib.umich.edu/bitstream/2027.42/149207/2/insr12271.pd

    Fixed Effect Estimation of Large T Panel Data Models

    Get PDF
    This article reviews recent advances in fixed effect estimation of panel data models for long panels, where the number of time periods is relatively large. We focus on semiparametric models with unobserved individual and time effects, where the distribution of the outcome variable conditional on covariates and unobserved effects is specified parametrically, while the distribution of the unobserved effects is left unrestricted. Compared to existing reviews on long panels (Arellano and Hahn 2007; a section in Arellano and Bonhomme 2011) we discuss models with both individual and time effects, split-panel Jackknife bias corrections, unbalanced panels, distribution and quantile effects, and other extensions. Understanding and correcting the incidental parameter bias caused by the estimation of many fixed effects is our main focus, and the unifying theme is that the order of this bias is given by the simple formula p/n for all models discussed, with p the number of estimated parameters and n the total sample size.Comment: 40 pages, 1 tabl

    Multivariate space-time modelling of multiple air pollutants and their health effects accounting for exposure uncertainty

    Get PDF
    The long-term health effects of air pollution are often estimated using a spatio-temporal ecological areal unit study, but this design leads to the following statistical challenges: (1) how to estimate spatially representative pollution concentrations for each areal unit; (2) how to allow for the uncertainty in these estimated concentrations when estimating their health effects; and (3) how to simultaneously estimate the joint effects of multiple correlated pollutants. This article proposes a novel 2-stage Bayesian hierarchical model for addressing these 3 challenges, with inference based on Markov chain Monte Carlo simulation. The first stage is a multivariate spatio-temporal fusion model for predicting areal level average concentrations of multiple pollutants from both monitored and modelled pollution data. The second stage is a spatio-temporal model for estimating the health impact of multiple correlated pollutants simultaneously, which accounts for the uncertainty in the estimated pollution concentrations. The novel methodology is motivated by a new study of the impact of both particulate matter and nitrogen dioxide concentrations on respiratory hospital admissions in Scotland between 2007 and 2011, and the results suggest that both pollutants exhibit substantial and independent health effects

    Time Series of Count Data : Modelling and Estimation

    Get PDF
    This paper compares various models for time series of counts which can account for discreetness, overdispersion and serial correlation. Besides observation- and parameter-driven models based upon corresponding conditional Poisson distributions, we also consider a dynamic ordered probit model as a flexible specification to capture the salient features of time series of counts. For all models, we present appropriate efficient estimation procedures. For parameter-driven specifications this requires Monte Carlo procedures like simulated Maximum likelihood or Markov Chain Monte-Carlo. The methods including corresponding diagnostic tests are illustrated with data on daily admissions for asthma to a single hospital. --Efficient Importance Sampling,GLARMA,Markov Chain Monte-Carlo,Observation-driven model,Parameter-driven model,Ordered Probit
    corecore