Generalized estimating equations for genome-wide association studies using longitudinal phenotype data

Avery, Christy L.; Cupples, L. Adrienne; Lumley, Thomas; McKnight, Barbara; Noordam, Raymond; Psaty, Bruce M.; Rice, Kenneth M.; Sitlani, Colleen M.; Stricker, Bruno H. C.; Whitsel, Eric A.

Authors: Christy L. Avery
L. Adrienne Cupples
Thomas Lumley
Barbara McKnight
Raymond Noordam
Bruce M. Psaty
Kenneth M. Rice
Colleen M. Sitlani
Bruno H. C. Stricker
Eric A. Whitsel
Publication date: 1 January 2015

Abstract

Many longitudinal cohort studies have both genome-wide measures of genetic variation and repeated measures of phenotypes and environmental exposures. Genome-wide association study analyses have typically used only cross-sectional data to evaluate quantitative phenotypes and binary traits. Incorporating repeated measures may increase power to detect associations, but it also requires specialized analysis methods. Here we discuss one such method, generalized estimating equations (GEE), in two contexts: analysis of main effects of rare genetic variants and analysis of gene-environment interactions. We illustrate the potential for increased power when GEE analyses are used instead of cross-sectional analyses. We also address challenges that arise, such as the need for small-sample corrections when the minor allele frequency of a genetic variant and/or the prevalence of an environmental exposure is low. To illustrate methods for detecting gene-drug interactions on a genome-wide scale using repeated measures data, we conduct single-study analyses and meta-analyses across three large cohort studies participating in the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) consortium: the Atherosclerosis Risk in Communities (ARIC) study, the Cardiovascular Health Study (CHS), and the Rotterdam Study (RS).
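
As a rough illustration of the kind of analysis the abstract describes, the sketch below fits a linear GEE with an independence working correlation to simulated repeated measures and computes a cluster-robust (sandwich) standard error for a single variant. It is a minimal sketch under stated assumptions, not the authors' pipeline: the function names `simulate_cohort` and `gee_independence`, the simulated effect size, the allele frequency, and the random-intercept data model are all hypothetical, and the small-sample corrections mentioned in the abstract are not implemented.

```python
import random


def simulate_cohort(n_people, n_visits, effect, maf, seed):
    """Simulate repeated phenotype measures per participant (hypothetical data model).

    Each participant has an allele dosage (0, 1, or 2), a person-level random
    intercept, and independent visit-level noise.
    """
    rng = random.Random(seed)
    clusters = []
    for _ in range(n_people):
        dosage = sum(rng.random() < maf for _ in range(2))
        person_effect = rng.gauss(0, 1)  # induces within-person correlation
        ys = [effect * dosage + person_effect + rng.gauss(0, 1)
              for _ in range(n_visits)]
        clusters.append(([dosage] * n_visits, ys))
    return clusters


def gee_independence(clusters):
    """Fit y = b0 + b1 * x with an independence working correlation.

    With an identity link this gives the ordinary least-squares point
    estimate; the standard errors use the sandwich form, which treats each
    participant (cluster) as the independent unit.
    Returns ((b0, b1), (se_b0, se_b1)).
    """
    # "Bread": A = sum_i X_i^T X_i, plus the right-hand side sum_i X_i^T y_i.
    n = sx = sxx = sy = sxy = 0.0
    for xs, ys in clusters:
        for x, y in zip(xs, ys):
            n += 1
            sx += x
            sxx += x * x
            sy += y
            sxy += x * y
    det = n * sxx - sx * sx
    inv = [[sxx / det, -sx / det], [-sx / det, n / det]]  # A^{-1}
    b0 = inv[0][0] * sy + inv[0][1] * sxy
    b1 = inv[1][0] * sy + inv[1][1] * sxy

    # "Meat": M = sum_i u_i u_i^T, where u_i = X_i^T (y_i - X_i beta).
    m00 = m01 = m11 = 0.0
    for xs, ys in clusters:
        u0 = u1 = 0.0
        for x, y in zip(xs, ys):
            resid = y - b0 - b1 * x
            u0 += resid
            u1 += x * resid
        m00 += u0 * u0
        m01 += u0 * u1
        m11 += u1 * u1
    meat = [[m00, m01], [m01, m11]]

    # Sandwich covariance: A^{-1} M A^{-1}.
    half = [[sum(inv[i][k] * meat[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]
    cov = [[sum(half[i][k] * inv[k][j] for k in range(2)) for j in range(2)]
           for i in range(2)]
    return (b0, b1), (cov[0][0] ** 0.5, cov[1][1] ** 0.5)


if __name__ == "__main__":
    data = simulate_cohort(n_people=2000, n_visits=3, effect=0.5, maf=0.3, seed=1)
    (b0, b1), (se0, se1) = gee_independence(data)
    print(f"variant effect estimate: {b1:.3f} (robust SE {se1:.3f})")
```

Summing the score contributions within each participant before forming the outer product is what lets the variance estimate account for correlation among a person's repeated measures. With few carriers of the minor allele, this uncorrected sandwich estimate can be unreliable, which is the situation the small-sample corrections mentioned above are meant to address.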