Abstract

Background: We evaluated methods for the analysis of multi-level survival data using a pooled dataset of 14 cohorts participating in the ELAPSE project, which investigates associations between residential exposure to low levels of air pollution (PM2.5 and NO2) and health (natural-cause mortality and cerebrovascular, coronary and lung cancer incidence). Methods: We applied five approaches in a multivariable Cox model to handle the first level of clustering, the cohort: (1) no adjustment for cohort, (2) indicator variables for cohort, (3) cohort as strata, (4) a cohort frailty term in a frailty Cox model, and (5) a cohort random intercept in a mixed Cox model. We accounted for the second level of clustering, due to common characteristics in the residential area, by (1) a random intercept per small area or (2) applying variance correction. We assessed the stratified, frailty and mixed Cox approaches through simulations under different scenarios for heterogeneity in the underlying hazards and the air pollution effects. Results: Effect estimates were stable across the approaches that adjusted for cohort but differed substantially when no adjustment was applied. Further adjustment for the small area grouping increased the standard errors of the effect estimates. Simulations confirmed identical results between the stratified and frailty models. In ELAPSE we selected a stratified multivariable Cox model to account for between-cohort heterogeneity, without adjustment at the small area level because of the small number of subjects and events per small area. Conclusions: Our study supports the need to account for between-cohort heterogeneity in multi-center collaborations using pooled individual level data. © 2021 The Author
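The contrast between approach (1), ignoring cohort, and approach (3), stratifying by cohort, can be illustrated with a small simulation. The sketch below is not the ELAPSE analysis: it is a minimal pure-Python illustration on made-up data, and every name in it (`cox_beta`, `simulate`, the baseline hazards, the exposure means, the true coefficient of 0.5) is a hypothetical choice. It assumes a single continuous exposure, no censoring and no tied event times, and fits the Cox partial likelihood by Newton-Raphson, forming risk sets either over the pooled data or within each cohort.

```python
import math
import random


def simulate(n_per_cohort=500, beta=0.5, seed=1):
    """Two hypothetical cohorts that differ in baseline hazard AND in mean exposure."""
    rng = random.Random(seed)
    times, events, x, cohort = [], [], [], []
    # (baseline hazard, mean exposure) per cohort: illustrative values only
    for c, (h0, mu) in enumerate([(0.1, 0.0), (1.0, 2.0)]):
        for _ in range(n_per_cohort):
            xi = rng.gauss(mu, 1.0)
            # Exponential survival time with hazard h0 * exp(beta * x)
            t = rng.expovariate(h0 * math.exp(beta * xi))
            times.append(t)
            events.append(1)  # every subject has an event (no censoring)
            x.append(xi)
            cohort.append(c)
    return times, events, x, cohort


def cox_beta(times, events, x, strata, iters=25):
    """Single-covariate Cox estimate; risk sets are formed within each stratum.

    Assumes no tied event times. Passing one constant stratum gives the
    unstratified (pooled) model.
    """
    groups = {}
    for t, d, xi, s in zip(times, events, x, strata):
        groups.setdefault(s, []).append((t, d, xi))
    # Sort each stratum once, latest time first, so the risk set grows as we scan
    ordered = [sorted(rows, key=lambda r: -r[0]) for rows in groups.values()]

    beta = 0.0
    for _ in range(iters):
        score, info = 0.0, 0.0
        for rows in ordered:
            s0 = s1 = s2 = 0.0  # running sums of w, w*x, w*x^2 over the risk set
            for t, d, xi in rows:
                w = math.exp(beta * xi)
                s0 += w
                s1 += w * xi
                s2 += w * xi * xi
                if d:
                    mean = s1 / s0
                    score += xi - mean            # gradient of log partial likelihood
                    info += s2 / s0 - mean * mean  # negative second derivative
        step = max(-1.0, min(1.0, score / info))  # clamp the Newton step for stability
        beta += step
        if abs(step) < 1e-8:
            break
    return beta


times, events, x, cohort = simulate()
beta_naive = cox_beta(times, events, x, [0] * len(x))  # approach (1): cohort ignored
beta_strat = cox_beta(times, events, x, cohort)        # approach (3): cohort as strata
print(f"pooled, no cohort adjustment: {beta_naive:.3f}")
print(f"stratified by cohort:         {beta_strat:.3f}")
```

Because the simulated cohort with the higher baseline hazard also has the higher mean exposure, the unadjusted estimate absorbs the between-cohort difference, while the stratified estimate compares subjects only within their own cohort. The frailty and mixed-model approaches, and the small-area level, are not covered by this sketch.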
