
201 research outputs found

Partially-Latent Class Models (pLCM) for Case-Control Studies of Childhood Pneumonia Etiology

Authors: Deloria-Knoll, Maria; Hammitt, Laura L.; Wu, Zhenke; Zeger, Scott L.
Publication date: 31/05/2014

In population studies on the etiology of disease, one goal is the estimation of the fraction of cases attributable to each of several causes. For example, pneumonia is a clinical diagnosis of lung infection that may be caused by viral, bacterial, fungal, or other pathogens. The study of pneumonia etiology is challenging because directly sampling from the lung to identify the etiologic pathogen is not standard clinical practice in most settings. Instead, measurements from multiple peripheral specimens are made. This paper introduces the statistical methodology designed for estimating the population etiology distribution and the individual etiology probabilities in the Pneumonia Etiology Research for Child Health (PERCH) study of 9,500 children from 7 sites around the world. We formulate the scientific problem in statistical terms as estimating the mixing weights and latent class indicators under a partially-latent class model (pLCM) that combines heterogeneous measurements with different error rates obtained from a case-control study. We introduce the pLCM as an extension of the latent class model. We also introduce graphical displays of the population data and inferred latent-class frequencies. The methods are tested with simulated data, and then applied to PERCH data. The paper closes with a brief description of extensions of the pLCM to the regression setting and to the case where conditional independence among the measures is relaxed. Comment: 25 pages, 4 figures, 1 supplementary material
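As a rough illustration of the "individual etiology probabilities" mentioned in the abstract, the sketch below applies Bayes' rule to one case under a simplified latent class setup. It is not the authors' implementation: the function name `etiology_posterior`, the single-test-per-pathogen layout, the assumption that sensitivities and false positive rates are known, and all numeric values are hypothetical choices made for this example, and conditional independence among the measurements is assumed.

```python
def etiology_posterior(pi, tpr, fpr, m):
    """Posterior probability that each pathogen is the cause for one case.

    pi:  etiology fractions (mixing weights), summing to 1
    tpr: tpr[j] = P(test j positive | pathogen j is the cause)
    fpr: fpr[j] = P(test j positive | pathogen j is not the cause)
    m:   observed 0/1 test results, one per pathogen
    """
    unnormalized = []
    for k in range(len(pi)):
        likelihood = 1.0
        for j, result in enumerate(m):
            # Test j behaves as a true-positive test only if pathogen j is the cause.
            p_positive = tpr[j] if j == k else fpr[j]
            likelihood *= p_positive if result else 1.0 - p_positive
        unnormalized.append(pi[k] * likelihood)
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Hypothetical values for three candidate pathogens (illustration only).
pi = [0.5, 0.3, 0.2]
tpr = [0.9, 0.8, 0.7]
fpr = [0.1, 0.2, 0.3]
posterior = etiology_posterior(pi, tpr, fpr, m=[1, 0, 0])
```

In the model described in the abstract the mixing weights and error rates are estimated from the case-control data rather than supplied by hand; the sketch only shows how such quantities would combine for a single child once they are given.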

arXiv.org e-Print Archive

Collection Of Biostatistics Research Archive

ON THE EQUIVALENCE OF CASE-CROSSOVER AND TIME SERIES METHODS IN ENVIRONMENTAL EPIDEMIOLOGY

Authors: Lu, Yun; Zeger, Scott L.
Publication venue: Collection of Biostatistics Research Archive
Publication date: 13/03/2006

Time series and case-crossover methods are often viewed as competing alternatives in environmental epidemiologic studies. Several recent studies have compared the time series and case-crossover methods. In this paper, we show that case-crossover using conditional logistic regression is a special case of time series analysis when there is a common exposure such as in air pollution studies. This equivalence provides computational convenience for case-crossover analyses and a better understanding of time series models. Time series log-linear regression accounts for over-dispersion of the Poisson variance, while case-crossover analyses typically do not. This equivalence also permits model checking for case-crossover data using standard log-linear model diagnostics


DECOMPOSITION OF REGRESSION ESTIMATORS TO EXPLORE THE INFLUENCE OF UNMEASURED TIME-VARYING CONFOUNDERS

Authors: Lu, Yun; Zeger, Scott L.
Publication venue: Collection of Biostatistics Research Archive
Publication date: 21/11/2007

In environmental epidemiology, exposure X and health outcome Y vary in space and time. We present a method to diagnose the possible influence of unmeasured confounders U on the estimated effect of X on Y, and we propose several approaches to robust estimation. The idea is to use space and time as proxy measures for the unmeasured factors U. We start with the time series case where X and Y are continuous variables at equally spaced times and assume a linear model. We define matching estimators b(u) that correspond to pairs of observations with specific lag u. Controlling for a smooth function of time, S_t, using a kernel estimator is roughly equivalent to estimating the association with a linear combination of the b(u)s, with weights that involve two components: the assumptions about the smoothness of S_t and the normalized variogram of the X process. When an unmeasured confounder U exists, but the model otherwise correctly controls for measured confounders, the excess variation in the b(u)s is evidence of confounding by U. We use the plot of b(u) versus lag u, the lagged-estimator plot (LEP), to diagnose the influence of U on the effect of X on Y. We use an appropriate linear combination of the b(u)s, or extrapolate to b(0), to obtain novel estimators that are more robust to the influence of smooth U. The methods are extended to time series log-linear models and to spatial analyses. The LEP gives a direct view of the magnitude of the estimators at each lag u and provides evidence when models do not adequately describe the data.
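The abstract does not spell out the formula for b(u), so the following is only one plausible reading, offered as a sketch: for each lag u, regress the lag-u differences of Y on the lag-u differences of X (no intercept), then inspect the estimates across lags. The function name `lagged_estimator`, the simulated series, and the chosen lags are all assumptions made for illustration.

```python
import math
import random


def lagged_estimator(x, y, lag):
    """No-intercept slope of lag-`lag` differences of y on those of x."""
    dx = [x[t] - x[t - lag] for t in range(lag, len(x))]
    dy = [y[t] - y[t - lag] for t in range(lag, len(y))]
    denom = sum(d * d for d in dx)
    if denom == 0:
        raise ValueError("x has no variation at this lag")
    return sum(a * b for a, b in zip(dx, dy)) / denom


# Simulated example (hypothetical): a smooth confounder u drives both x and y.
rng = random.Random(0)
n = 400
u = [math.sin(t / 20.0) for t in range(n)]
x = [ut + rng.gauss(0.0, 1.0) for ut in u]
y = [2.0 * xt + 3.0 * ut + rng.gauss(0.0, 0.1) for xt, ut in zip(x, u)]

# Values one would plot against lag in a lagged-estimator plot.
lep = {lag: lagged_estimator(x, y, lag) for lag in (1, 2, 5, 10, 20, 40, 60)}
```

Because differencing at short lags removes most of a smooth confounder, the short-lag estimates in a setup like this one would be expected to sit nearer the true slope (2 here) than the long-lag ones, which is the pattern the plot is meant to expose.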


ON MARGINALIZED MULTILEVEL MODELS AND THEIR COMPUTATION

Authors: Griswold, Michael E.; Zeger, Scott L.
Publication venue: Collection of Biostatistics Research Archive
Publication date: 15/11/2004

Clustered data analysis is characterized by the need to describe both systematic variation in a mean model and cluster-dependent random variation in an association model. Marginalized multilevel models embrace the robustness and interpretations of a marginal mean model, while retaining the likelihood inference capabilities and flexible dependence structures of a conditional association model. Although there has been increasing recognition of the attractiveness of marginalized multilevel models, there has been a gap in their practical application arising from a lack of readily available estimation procedures. We extend the marginalized multilevel model to allow for nonlinear functions in both the mean and association aspects. We then formulate marginal models through conditional specifications to facilitate estimation with mixed model computational solutions already in place. We illustrate this approach on a cerebrovascular deficiency crossover trial


A Spatio-Temporal Approach for Estimating Chronic Effects of Air Pollution

Authors: Dominici, Francesca; Greven, Sonja; Zeger, Scott L.
Publication venue: Collection of Biostatistics Research Archive
Publication date: 24/06/2009

Estimating the health risks associated with air pollution exposure is of great importance in public health. In air pollution epidemiology, two main study designs have been used. Time series studies estimate acute risk associated with short-term exposure. They compare day-to-day variation of pollution concentrations and mortality rates, and have been criticized for potential confounding by time-varying covariates. Cohort studies estimate chronic effects associated with long-term exposure. They compare long-term average pollution concentrations and time-to-death across cities, and have been criticized for potential confounding by individual risk factors or city-level characteristics. We propose a new study design and a statistical model, which use spatio-temporal information to estimate the long-term effects of air pollution exposure on life expectancy. Our approach avoids confounding by time-varying covariates and individual or city-level risk factors. By estimating the increase in life expectancy due to decreases in long-term air pollution concentrations, it provides easily interpretable values for public policy purposes. We develop a suitable backfitting algorithm that permits efficient fitting of our model to large spatio-temporal data sets. We evaluate spatio-temporal correlation in the data and obtain appropriate standard errors. We apply our methods to the Medicare Cohort Air Pollution Study, including data on fine particulate matter (PM2.5) and mortality for 18.2 million Medicare enrollees from 814 locations in the U.S. during an average of 65 months in 2000-2006. Supplemental material including R code implementing our methods is provided in a web appendix.


Underestimation of Standard Errors in Multi-Site Time Series Studies

Authors: Daniels, Michael; Dominici, Francesca; Zeger, Scott L.
Publication venue: Collection of Biostatistics Research Archive
Publication date: 21/11/2003

Multi-site time series studies of air pollution and mortality and morbidity have figured prominently in the literature as comprehensive approaches for estimating acute effects of air pollution on health. Hierarchical models are generally used to combine site-specific information and estimate pooled air pollution effects, taking into account both within-site statistical uncertainty and across-site heterogeneity. Within a site, characteristics of time series data of air pollution and health (small pollution effects, missing data, highly correlated predictors, nonlinear confounding, etc.) make modelling all sources of uncertainty challenging. One potential consequence is underestimation of the statistical variance of the site-specific effects to be combined. In this paper we investigate the impact of variance underestimation on the pooled relative rate estimate. We focus on two-stage normal-normal hierarchical models and on underestimation of the statistical variance at the first stage. By mathematical considerations and simulation studies, we found that variance underestimation does not affect the pooled estimate substantially. However, some sensitivity of the pooled estimate to variance underestimation is observed when the number of sites is small and underestimation is severe. These simulation results are applicable to any two-stage normal-normal hierarchical model for combining information of site-specific results, and they can be easily extended to more general hierarchical formulations. We also examined the impact of variance underestimation on the national average relative rate estimate from the National Morbidity Mortality Air Pollution Study, and we found that variance underestimation of as much as 40% has little effect on the national average.
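To make the pooling step concrete, here is a minimal sketch of the inverse-variance weighted mean that arises in a two-stage normal-normal model when the between-site variance is treated as known. That simplification is an assumption of this example (in practice the heterogeneity variance is estimated too), and the function name `pooled_estimate` and all numbers are hypothetical rather than taken from the paper.

```python
def pooled_estimate(betas, variances, tau2):
    """Pooled mean and its variance under a two-stage normal-normal model.

    betas:     site-specific effect estimates from the first stage
    variances: their within-site statistical variances
    tau2:      between-site heterogeneity variance (assumed known here)
    """
    weights = [1.0 / (v + tau2) for v in variances]
    total = sum(weights)
    mean = sum(w * b for w, b in zip(weights, betas)) / total
    return mean, 1.0 / total


# Hypothetical site-level results (illustration only).
betas = [0.2, 0.5, 0.1, 0.4]
variances = [0.04, 0.09, 0.01, 0.16]
tau2 = 0.02

full_mean, full_var = pooled_estimate(betas, variances, tau2)

# Understate every first-stage variance by 40% and pool again.
understated = [0.6 * v for v in variances]
under_mean, under_var = pooled_estimate(betas, understated, tau2)
```

Comparing `full_mean` with `under_mean` shows how far the pooled estimate moves under this simplified setup, while `under_var` is necessarily smaller than `full_var`, since each weight grows when the within-site variances are understated.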


On Time Series Analysis of Public Health and Biomedical Data

Authors: Irizarry, Rafael A.; Peng, Roger D.; Zeger, Scott L.
Publication venue: Collection of Biostatistics Research Archive
Publication date: 01/09/2004

A time series is a sequence of observations made over time. Examples in public health include daily ozone concentrations, weekly admissions to an emergency department or annual expenditures on health care in the United States. Time series models are used to describe the dependence of the response at each time on predictor variables including covariates and possibly previous values in the series. Time series methods are necessary to account for the correlation among repeated responses over time. This paper gives an overview of time series ideas and methods used in public health research


Studying Effects of Primary Care Physicians and Patients on the Trade-Off Between Charges for Primary Care and Specialty Care Using a Hierarchical Multivariate Two-Part Model

Authors: Forrest, Christopher B.; Robinson, John W.; Zeger, Scott L.
Publication venue: Collection of Biostatistics Research Archive
Publication date: 26/08/2004

Objective. To examine effects of primary care physicians (PCPs) and patients on the association between charges for primary care and specialty care in a point-of-service (POS) health plan. Data Source. Claims from 1996 for 3,308 adult male POS plan members, each of whom was assigned to one of the 50 family practitioner-PCPs with the largest POS plan member-loads. Study Design. A hierarchical multivariate two-part model was fitted using a Gibbs sampler to estimate PCPs' effects on patients' annual charges for two types of services, primary care and specialty care, the associations among PCPs' effects, and within-patient associations between charges for the two services. Adjusted Clinical Groups (ACGs) were used to adjust for case-mix. Principal Findings. PCPs with higher case-mix adjusted rates of specialist use were less likely to see their patients at least once during the year (estimated correlation: –.40; 95% CI: –.71, –.008) and provided fewer services to patients that they saw (estimated correlation: –.53; 95% CI: –.77, –.21). Ten of 11 PCPs whose case-mix adjusted effects on primary care charges were significantly less than or greater than zero (p < .05) had estimated, case-mix adjusted effects on specialty care charges that were of opposite sign (but not significantly different than zero). After adjustment for ACG and PCP effects, the within-patient, estimated odds ratio for any use of primary care given any use of specialty care was .57 (95% CI: .45, .73). Conclusions. PCPs and patients contributed independently to a trade-off between utilization of primary care and specialty care. The trade-off appeared to partially offset significant differences in the amount of care provided by PCPs. These findings were possible because we employed a hierarchical multivariate model rather than separate univariate models.
