
    A toolkit for measurement error correction, with a focus on nutritional epidemiology.

    Exposure measurement error is a problem in many epidemiological studies, including those using biomarkers and measures of dietary intake. Measurement error typically results in biased estimates of exposure-disease associations, with the severity and nature of the bias depending on the form of the error. To correct for the effects of measurement error, information additional to the main study data is required. Ideally, this is a validation sample in which the true exposure is observed. However, in many situations it is not feasible to observe the true exposure, but one or more repeated exposure measurements may be available, for example blood pressure or dietary intake recorded at two time points. The aim of this paper is to provide a toolkit for measurement error correction using repeated measurements. We bring together methods covering classical measurement error and several departures from classical error: systematic, heteroscedastic and differential error. The correction methods considered are regression calibration, which is already widely used in the classical error setting, and moment reconstruction and multiple imputation, which are newer approaches with the ability to handle differential error. We emphasize practical application of the methods in nutritional epidemiology and other fields. We primarily consider continuous exposures in the exposure-outcome model, but we also outline methods for use when continuous exposures are categorized. The methods are illustrated using data from a study of the association between fibre intake and colorectal cancer, in which fibre intake is measured using a diet diary and repeated measures are available for a subset of participants.
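A minimal sketch of the regression calibration idea in the simplest linear setting, using two replicates with classical error; all variable names and parameter values here are illustrative, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Illustrative setup: true exposure X, two replicates W1, W2 with
# classical (mean-zero, independent) error, and a continuous outcome Y.
x = rng.normal(0.0, 1.0, n)
w1 = x + rng.normal(0.0, 1.0, n)
w2 = x + rng.normal(0.0, 1.0, n)
y = 0.5 * x + rng.normal(0.0, 1.0, n)   # true slope 0.5

# Naive analysis: regressing Y on a single replicate attenuates the slope.
beta_naive = np.cov(w1, y)[0, 1] / np.var(w1, ddof=1)

# Regression calibration: the attenuation factor lambda = var(X)/var(W)
# can be estimated from the replicates as cov(W1, W2)/var(W1), and the
# naive slope rescaled.
lam = np.cov(w1, w2)[0, 1] / np.var(w1, ddof=1)
beta_rc = beta_naive / lam

print(beta_naive, beta_rc)   # roughly 0.25 and 0.5
```

In this simple case, dividing the naive slope by the estimated attenuation factor is equivalent to regressing the outcome on the calibrated exposure E[X | W].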

    Using full-cohort data in nested case-control and case-cohort studies by multiple imputation.

    In many large prospective cohorts, expensive exposure measurements cannot be obtained for all individuals. Exposure-disease association studies are therefore often based on nested case-control or case-cohort studies, in which complete information is obtained only for sampled individuals. However, in the full cohort there may be a large amount of information on cheaply available covariates, and possibly on a surrogate of the main exposure(s), which typically goes unused. We view the nested case-control or case-cohort study plus the remainder of the cohort as a full-cohort study with missing data. Hence, we propose using multiple imputation (MI) to utilise information in the full cohort when data from the sub-studies are analysed. We use the fully observed data to fit the imputation models. We consider using approximate imputation models and also using rejection sampling to draw imputed values from the true distribution of the missing values given the observed data. Simulation studies show that using MI to utilise full-cohort information in the analysis of nested case-control and case-cohort studies can result in important gains in efficiency, particularly when a surrogate of the main exposure is available in the full cohort. In simulations, this method outperforms counter-matching in nested case-control studies and a weighted analysis for case-cohort studies, both of which use some full-cohort information. Approximate imputation models perform well except when there are interactions or non-linear terms in the outcome model, where imputation using rejection sampling works well.
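The rejection-sampling step can be sketched in a toy setting. Assume (illustratively, not from the paper) a binary outcome following a logistic model and a normal model for the missing exposure given a surrogate; because the Bernoulli likelihood is bounded by 1, candidates drawn from f(X | W) can be accepted with probability equal to that likelihood:

```python
import numpy as np

rng = np.random.default_rng(1)

def impute_rejection(w, y, beta0, beta1, mu_fun, sigma, rng):
    """Draw X from f(X | W=w, Y=y) when Y | X is logistic and X | W is
    normal with mean mu_fun(w) and sd sigma, by rejection sampling.
    The Bernoulli likelihood is bounded by 1, so candidates from
    f(X | W) are accepted with probability f(y | x)."""
    while True:
        x = rng.normal(mu_fun(w), sigma)        # candidate from f(X | W)
        p = 1.0 / (1.0 + np.exp(-(beta0 + beta1 * x)))
        accept = p if y == 1 else 1.0 - p       # Bernoulli likelihood f(y | x)
        if rng.uniform() < accept:
            return x

# Illustrative (assumed) parameters: X | W ~ N(0.8*w, 0.6^2),
# logit P(Y=1 | X) = -0.5 + 1.0*X.
draws = [impute_rejection(1.0, 1, -0.5, 1.0, lambda w: 0.8 * w, 0.6, rng)
         for _ in range(2000)]
print(np.mean(draws))   # cases are tilted above the prior mean of 0.8
```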

    Handling missing data in matched case-control studies using multiple imputation.

    Analysis of matched case-control studies is often complicated by missing data on covariates. Analysis can be restricted to individuals with complete data, but this is inefficient and may be biased. Multiple imputation (MI) is an efficient and flexible alternative. We describe two MI approaches. The first uses a model for the data on an individual and includes matching variables; the second uses a model for the data on a whole matched set and avoids the need to model the matching variables. Within each approach, we consider three methods: full-conditional specification (FCS), joint model MI using a normal model, and joint model MI using a latent normal model. We show that FCS MI is asymptotically equivalent to joint model MI using a restricted general location model that is compatible with the conditional logistic regression analysis model. The normal and latent normal imputation models are not compatible with this analysis model. All methods allow for multiple partially-observed covariates, non-monotone missingness, and multiple controls per case. They can be easily applied in standard statistical software, and valid variance estimates can be obtained using Rubin's Rules. We compare the methods in a simulation study. The approach of including the matching variables is most efficient. Within each approach, the FCS MI method generally yields the least-biased odds ratio estimates, but normal or latent normal joint model MI is sometimes more efficient. All methods have good confidence interval coverage. Data on colorectal cancer and fibre intake from the EPIC-Norfolk study are used to illustrate the methods, in particular showing how efficiency is gained relative to just using individuals with complete data.
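Pooling results across imputed datasets with Rubin's Rules is mechanical; a minimal sketch with hypothetical log-odds-ratio estimates and variances from M = 5 imputations:

```python
import numpy as np

def rubins_rules(estimates, variances):
    """Pool point estimates and their variances from M imputed datasets
    using Rubin's Rules."""
    estimates = np.asarray(estimates, dtype=float)
    variances = np.asarray(variances, dtype=float)
    m = len(estimates)
    qbar = estimates.mean()                # pooled point estimate
    ubar = variances.mean()                # within-imputation variance
    b = estimates.var(ddof=1)              # between-imputation variance
    total = ubar + (1 + 1 / m) * b         # total variance
    return qbar, total

# Hypothetical estimates from M = 5 imputed datasets.
qbar, total = rubins_rules([0.52, 0.48, 0.55, 0.50, 0.45],
                           [0.010, 0.011, 0.009, 0.010, 0.012])
print(qbar, total)   # 0.50 and about 0.0121
```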

    Simulating data from marginal structural models for a survival time outcome

    Marginal structural models (MSMs) are often used to estimate causal effects of treatments on survival time outcomes from observational data when time-dependent confounding may be present. They can be fitted using, for example, inverse probability of treatment weighting (IPTW). It is important to evaluate the performance of statistical methods in different scenarios, and simulation studies are a key tool for such evaluations. In simulation studies, it is common to generate data in such a way that the model of interest is correctly specified, but this is not always straightforward when the model of interest is for potential outcomes, as an MSM is. Methods have been proposed for simulating from MSMs for a survival outcome, but these methods impose restrictions on the data-generating mechanism. Here we propose a method that overcomes these restrictions. The MSM can be a marginal structural logistic model for a discrete survival time or a Cox or additive hazards MSM for a continuous survival time. The hazard of the potential survival time can be conditional on baseline covariates, and the treatment variable can be discrete or continuous. We illustrate the use of the proposed simulation algorithm by carrying out a brief simulation study. This study compares the coverage of confidence intervals calculated in two different ways for causal effect estimates obtained by fitting an MSM via IPTW.
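For intuition, stabilized IPTW can be sketched in a simplified point-treatment setting (not the paper's time-dependent setting; the propensity score is taken as known here, and all parameter values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20000

# Illustrative point-treatment setting: confounder L, binary treatment A,
# continuous outcome Y, with a true causal effect of A equal to 1.0.
L = rng.normal(size=n)
p_a = 1.0 / (1.0 + np.exp(-L))               # true propensity P(A=1 | L)
A = rng.binomial(1, p_a)
Y = 1.0 * A + 0.8 * L + rng.normal(size=n)

# Stabilized inverse probability of treatment weights; in practice the
# propensity score would be estimated rather than known.
p_marg = A.mean()
sw = np.where(A == 1, p_marg / p_a, (1 - p_marg) / (1 - p_a))

# IPTW estimate of the marginal causal effect: weighted mean difference.
effect = (np.average(Y[A == 1], weights=sw[A == 1])
          - np.average(Y[A == 0], weights=sw[A == 0]))
print(effect)   # close to the true marginal effect of 1.0
```

The unweighted difference in means would be biased by confounding through L; the weights recover the marginal contrast that a point-treatment MSM targets.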

    Effects of Classical Exposure Measurement Error on the Shape of Exposure-Disease Associations

    In epidemiology many exposures of interest are measured with error. Random, or 'classical', error in exposure measurements attenuates linear exposure-disease associations. However, its precise effects on different nonlinear associations are not well known. We use simulation studies to assess how classical measurement error affects observed association shapes and power to detect nonlinearity. We focus on a proportional hazards model for the exposure-disease association and consider six true association shapes of relevance in epidemiology: linear, threshold, U-shaped, J-shaped, increasing quadratic, and asymptotic. The association shapes are modeled using three popular methods: grouped exposure analyses, fractional polynomials, and P-splines. Under each true association shape and each method, we illustrate the effects of classical exposure measurement error, considering varying degrees of random error. We also assess what we refer to as MacMahon's method for correcting for classical exposure measurement error under grouped exposure analyses, which uses replicate measurements to estimate usual exposure within observed exposure groups. The validity of this method for nonlinear associations has not previously been investigated. Under nonlinear exposure-disease associations, classical measurement error results in increasingly linear shapes and not always an attenuated association at a given exposure level. Fractional polynomials and P-splines give similar results and offer advantages over grouped exposure analyses by providing realistic models. P-splines offer the greatest power to detect nonlinearity; however, random exposure measurement error results in a potentially considerable loss of power to detect nonlinearity under all methods. MacMahon's method performs well for quadratic associations, but does not in general recover nonlinear shapes.
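The flattening effect is easy to reproduce. A small sketch under an assumed quadratic exposure-outcome association (a linear model rather than the paper's proportional hazards model, with illustrative parameters), showing the quadratic coefficient shrinking under classical error:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100000

# True association is quadratic in the exposure; the observed exposure W
# carries classical (random) error with variance 1.
x = rng.normal(0.0, 1.0, n)
w = x + rng.normal(0.0, 1.0, n)
y = 0.3 * x**2 + rng.normal(0.0, 1.0, n)

def quad_fit(z, y):
    """Least-squares fit of y = b0 + b1*z + b2*z**2."""
    Z = np.column_stack([np.ones_like(z), z, z**2])
    return np.linalg.lstsq(Z, y, rcond=None)[0]

b_true = quad_fit(x, y)   # quadratic coefficient near the true 0.3
b_obs = quad_fit(w, y)    # quadratic coefficient shrunk toward 0
print(b_true[2], b_obs[2])
```

With equal true-exposure and error variances, the fitted curve in the observed exposure is markedly flatter, i.e. closer to linear, than the true shape.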

    Evaluation of a five-year predicted survival model for cystic fibrosis in later time periods.

    We evaluated a multivariable logistic regression model predicting 5-year survival, derived from a 1993-1997 cohort from the United States Cystic Fibrosis (CF) Foundation Patient Registry, to assess whether therapies introduced since 1993 have altered its applicability in non-overlapping cohorts from 1993-1998, 1999-2004, 2005-2010 and 2011-2016. We applied Kaplan-Meier statistics to assess unadjusted survival. We tested the logistic regression model's discrimination using the C-index and its calibration using Hosmer-Lemeshow tests, to examine the original model's performance and guide updating as needed. The Kaplan-Meier age-adjusted 5-year probability of death in the CF population decreased substantially during 1993-2016. Patients in successive cohorts were generally healthier at entry, with higher average age, weight and lung function and fewer pulmonary exacerbations annually. CF-related diabetes prevalence, however, steadily increased. Newly derived multivariable logistic regression models for 5-year survival in the new cohorts had estimated coefficients similar to the originals. The original model exhibited excellent calibration and discrimination when applied to later cohorts despite improved survival, and it remains useful for predicting 5-year survival. All models may be used to stratify patients for new studies, and the original coefficients may be useful as a baseline against which to search for additional but rare events that affect survival in CF.
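For a binary 5-year survival outcome, the C-index reduces to the proportion of case/non-case pairs in which the case receives the higher predicted risk; a minimal sketch with hypothetical predictions:

```python
import numpy as np

def c_index(y, p):
    """Concordance (C) statistic for a binary outcome y and predicted
    probabilities p: the proportion of case/non-case pairs in which the
    case received the higher prediction (ties count one half)."""
    y = np.asarray(y)
    p = np.asarray(p)
    cases, controls = p[y == 1], p[y == 0]
    diff = cases[:, None] - controls[None, :]   # all case/control pairs
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size

# Hypothetical predictions for eight patients (1 = died within 5 years).
y = [1, 1, 1, 0, 0, 0, 0, 1]
p = [0.9, 0.7, 0.4, 0.3, 0.2, 0.5, 0.1, 0.8]
print(c_index(y, p))   # 15 of 16 pairs concordant: 0.9375
```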

    Multiple imputation of missing data in nested case-control and case-cohort studies.

    The nested case-control and case-cohort designs are two main approaches for carrying out a substudy within a prospective cohort. This article adapts multiple imputation (MI) methods for handling missing covariates in full-cohort studies to nested case-control and case-cohort studies. We consider data missing by design and data missing by chance. MI analyses that make use of full-cohort data and MI analyses based on substudy data only are described, alongside an intermediate approach in which the imputation uses full-cohort data but the analysis uses only the substudy. We describe adaptations of two imputation methods: the approximate method (MI-approx) of White and Royston (2009) and the "substantive model compatible" (MI-SMC) method of Bartlett et al. (2015). We also apply the "MI matched set" approach of Seaman and Keogh (2015), which does not require any full-cohort information, to nested case-control studies. The methods are investigated using simulation studies, and all perform well when their assumptions hold. Substantial gains in efficiency can be made by imputing data missing by design using the full-cohort approach or by imputing data missing by chance in analyses using the substudy only. The intermediate approach brings greater gains in efficiency than the substudy approach and is more robust to imputation model misspecification than the full-cohort approach. The methods are illustrated using the ARIC Study cohort. The Supplementary Materials provide R and Stata code.
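A bare-bones sketch of imputing a covariate that is missing by design, using a surrogate available in the full cohort. This assumes a normal-linear imputation model with illustrative values; the posterior draws of the imputation-model parameters needed for fully proper MI are omitted for brevity:

```python
import numpy as np

rng = np.random.default_rng(4)
n, m_imp = 5000, 10

# Full cohort: a cheap surrogate S is available everywhere, while the
# expensive covariate X is measured only in a ~20% substudy (missing by
# design elsewhere). All names and values are illustrative.
x = rng.normal(size=n)
s = x + rng.normal(scale=0.5, size=n)
in_sub = rng.uniform(size=n) < 0.2
x_obs = np.where(in_sub, x, np.nan)

# Fit the imputation model X | S on the substudy (complete) records.
S1 = np.column_stack([np.ones(in_sub.sum()), s[in_sub]])
beta, res, *_ = np.linalg.lstsq(S1, x_obs[in_sub], rcond=None)
sigma = np.sqrt(res[0] / (in_sub.sum() - 2))   # residual sd

# Impute: predicted mean plus a residual noise draw, repeated to give
# m_imp completed datasets.
miss = ~in_sub
imputed = [x_obs.copy() for _ in range(m_imp)]
for d in imputed:
    d[miss] = beta[0] + beta[1] * s[miss] + rng.normal(0, sigma, miss.sum())
print(np.mean([d.mean() for d in imputed]))
```

Each completed dataset would then be analysed with the substantive model and the results pooled, e.g. with Rubin's Rules.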

    The Impact of Alzheimer's Disease on the Chinese Economy.

    BACKGROUND: Recent increases in life expectancy may greatly expand future Alzheimer's Disease (AD) burdens. China's demographic profile, aging workforce and predicted increasing burden of AD-related care make its economy vulnerable to AD impacts. Previous economic estimates of AD predominantly focus on health system burdens and omit wider whole-economy effects, potentially underestimating the full economic benefit of effective treatment. METHODS: AD-related prevalence, morbidity and mortality for 2011-2050 were simulated and, together with associated caregiver time and costs, imposed on a dynamic Computable General Equilibrium model of the Chinese economy. Both economic and non-economic outcomes were analyzed. FINDINGS: Simulated Chinese AD prevalence quadrupled during 2011-2050, from 6 million to 28 million. The cumulative discounted value of eliminating AD equates to China's 2012 GDP (US$8 trillion), and the annual predicted real value approaches US AD cost-of-illness (COI) estimates, exceeding US$1 trillion by 2050 (2011 prices). Lost labor contributes 62% of macroeconomic impacts. Only 10% derives from informal care, challenging previous COI estimates of 56%. INTERPRETATION: Health and macroeconomic models predict an unfolding 2011-2050 Chinese AD epidemic with serious macroeconomic consequences. Significant investment in research and development (medical and non-medical) is warranted, and international researchers and national authorities should therefore target the development of effective AD treatment and prevention strategies.

    Documentation of a fully integrated epidemiological-demographic-macroeconomic model of Malaria: The case of Ghana

    We develop a novel and fully integrated epidemiological-demographic-macroeconomic (EDM) malaria simulation model framework for modelling P. falciparum malaria transmission in Ghana. Our model framework represents a milestone as the first fully integrated EDM model framework for any type of infectious disease. The complex specification and integration of regional epidemiological-demographic models within a malaria-focussed macroeconomic Computable General Equilibrium model is fully described and documented. Ideas are outlined for future applications: investigating the interplay between macroeconomic and health disease burdens; measuring the health and economic impacts of economic growth and malaria interventions; and studying the importance (or lack thereof) of the general omission, in the existing literature on macroeconomic assessment of infectious disease, of proper epidemiological underpinnings and of the integration of economic incentive feedback effects.