The application and development of methods to combine and infer information from genetic epidemiological studies of cardiovascular and other complex traits
This thesis investigates methods to combine and infer information from genetic
epidemiological studies. Three issues are explored, each in a distinct and self-contained
chapter.
Chapter 1 investigates how best to incorporate treatment information in genetic
analyses of blood pressure. Different approaches to adjusting for treatment are
compared in a number of simulated scenarios, and the approaches that utilise
all the observed data are generally shown to perform best. These approaches
become biased, however, when a genetic variant (or some other factor)
interacts with treatment. This chapter
therefore urges caution in the interpretation of results from these studies, and
suggests some possible approaches to identifying existing interactions with
treatment.
Chapter 2 concerns participant privacy in genome-wide association studies
(GWAS). Recent methods claim to be able to infer whether an individual
participated in a study, using only aggregate statistics from the study such as
allele frequencies. In the past, these statistics have been freely published
online. This chapter explores the full implications of these methods, by
investigating their true capabilities and limitations. In addition, some
modifications are proposed to one particular method, to demonstrate how it can
be adapted for use in practice. This work finds that participant identification is
possible in ideal conditions, but common characteristics of real studies may
prevent any reliable application of these methods in practice.
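As a rough illustration of how aggregate allele frequencies can carry information about participation, the sketch below computes a distance-based statistic of the kind proposed by Homer et al. (2008). This is not necessarily the method modified in this chapter. For each SNP, it compares how far an individual's genotype lies from a reference population frequency with how far it lies from the study's published frequency. The function name, the 0 / 0.5 / 1 genotype coding, and the toy numbers are assumptions made for the example.

```python
def membership_statistic(genotype, reference_freq, study_freq):
    """Mean over SNPs of |y - reference| - |y - study|.

    genotype       : per-SNP allele dosage scaled to 0, 0.5 or 1 (assumed coding)
    reference_freq : allele frequencies in a reference population
    study_freq     : allele frequencies published by the study
    A positive value means the individual is, on average, closer to the
    study frequencies than to the reference frequencies.
    """
    if not (len(genotype) == len(reference_freq) == len(study_freq)):
        raise ValueError("all inputs must cover the same SNPs")
    if not genotype:
        raise ValueError("at least one SNP is required")
    diffs = [
        abs(y - ref) - abs(y - study)
        for y, ref, study in zip(genotype, reference_freq, study_freq)
    ]
    return sum(diffs) / len(diffs)


# Toy example with three SNPs (values invented for illustration only).
genotype = [1.0, 0.0, 0.5]
reference_freq = [0.5, 0.5, 0.5]
study_freq = [0.8, 0.2, 0.5]
stat = membership_statistic(genotype, reference_freq, study_freq)
```

In practice a statistic like this would be assessed against its sampling distribution over many thousands of SNPs. That is where study characteristics such as a poorly matched reference population can undermine the inference.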
Chapter 3 proposes a new approach to synthesising data between studies.
This approach, named “DataSHIELD”, guarantees results identical to those of an
individual-level meta-analysis, while offering greater flexibility than a study-level
meta-analysis. DataSHIELD also potentially circumvents some of the laws
that restrict data use, because it does not involve sharing any individual-level
data between studies. This chapter outlines the principles underpinning
DataSHIELD, and demonstrates its use in a simulated data example.
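The claim of identical results can be illustrated with a deliberately simplified sketch. It assumes a linear regression with a single predictor, for which the least-squares fit depends on the data only through a few sums. Each study shares those sums, not its individual-level records, and the combined fit matches the fit on the pooled data. This is not DataSHIELD's actual interface, and all class and function names below are hypothetical. Models without such closed-form summaries would need repeated rounds of exchanging summaries.

```python
from dataclasses import dataclass


@dataclass
class Summary:
    """Non-identifying sums that one study shares (hypothetical structure)."""
    n: int
    sx: float
    sy: float
    sxx: float
    sxy: float


def summarise(x, y):
    """Computed locally within a study; only the sums leave the study."""
    return Summary(
        n=len(x),
        sx=sum(x),
        sy=sum(y),
        sxx=sum(a * a for a in x),
        sxy=sum(a * b for a, b in zip(x, y)),
    )


def combine(summaries):
    """Add the shared sums across studies."""
    return Summary(
        n=sum(s.n for s in summaries),
        sx=sum(s.sx for s in summaries),
        sy=sum(s.sy for s in summaries),
        sxx=sum(s.sxx for s in summaries),
        sxy=sum(s.sxy for s in summaries),
    )


def fit(s):
    """Least-squares intercept and slope from the sums alone."""
    slope = (s.n * s.sxy - s.sx * s.sy) / (s.n * s.sxx - s.sx ** 2)
    intercept = (s.sy - slope * s.sx) / s.n
    return intercept, slope


# Two toy studies (values invented for illustration only).
study_a = ([0.0, 1.0], [1.0, 3.0])
study_b = ([2.0, 3.0], [5.0, 7.0])

# Fit using shared summaries only.
combined_fit = fit(combine([summarise(*study_a), summarise(*study_b)]))

# Fit on the pooled individual-level data, for comparison.
pooled_fit = fit(summarise(study_a[0] + study_b[0], study_a[1] + study_b[1]))
```

Here `combined_fit` and `pooled_fit` agree, even though no individual-level record crosses a study boundary in the first calculation.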