Investigating the genetic and immunological aetiology of myalgic encephalomyelitis/chronic fatigue syndrome

Abstract

This thesis describes two investigations into the disease Myalgic Encephalomyelitis/Chronic Fatigue Syndrome (ME/CFS), specifically its genetic aetiology and immune system alterations. The first study investigated the genetic basis of ME/CFS using genome-wide association studies (GWAS), attempting to replicate and extend results previously found using UK Biobank cohort data. GWAS attempt to identify associations between DNA variants and phenotypes. This GWAS was novel in that it was conducted on new phenotypes, constructed by combining those in the most up-to-date UK Biobank data release. A previously unreported genome-wide significant association was found for males with ME/CFS, on chromosome 6 within the gene PDE10A. The remaining results were not genome-wide significant, but many were suggestive, so independent replication may justify further research. A previous analysis of the UK Biobank cohort had identified an indicative association, at genome-wide significance, between variants around the SLC25A15 gene and ME/CFS in females. I hypothesised that people carrying the CFS risk variants would have a lower dietary protein intake than those with the alternative alleles, due to potentially reduced production of mitochondrial ornithine transporter 1 (ORNT1). However, this association with dietary protein intake was not supported by UK Biobank data. Additionally, I investigated associations between human leukocyte antigen (HLA) alleles and the ME/CFS phenotype using UK Biobank data. Associations with alleles within the HLA-C and HLA-DQB1 genes had previously been found in a Norwegian cohort of people with ME/CFS, and my goal was to seek replication of these results in a larger dataset. None of the associations found in the UK Biobank proved to be genome-wide significant.

In my second study I investigated the use of T-cell clonal diversity as a potential biomarker for ME/CFS. This project used cells from CureME Biobank samples, in collaboration with Systems Biology Laboratory (SBL). I developed a pipeline to analyse T-cell receptor (TCR) genomic DNA data, based on the best practices currently used in the fields of immunology and mathematical biology. This approach used a mathematical notion of entropy as a measure of the diversity of TCR repertoires, in this way combining all of the most commonly used metrics in mathematical biology. Together, these measures form a profile for each repertoire, and the profiles can be sorted using a machine learning algorithm to partition the repertoires into subgroups. My hypothesis was that T-cell clonal expansion in people with ME/CFS would be greater than in healthy controls, and comparable to that in disease controls (people with multiple sclerosis). Although this method was able to classify TCR chains effectively using simulated data, results from experimentally derived data did not support the hypothesis: the most effective classifications for both CD4+ and CD8+ cells failed to pass correction for multiple hypothesis testing.
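As an illustration of how an entropy-based measure can combine several common diversity metrics into one profile, the following is a minimal sketch. It assumes the Rényi entropy family, which the abstract does not name; the function names and the toy clone counts are hypothetical and are not taken from the thesis pipeline.

```python
import math

def renyi_entropy(counts, alpha):
    """Rényi entropy of order alpha (natural log) for a list of clone counts.

    Assumption: this family stands in for the "mathematical notion of
    entropy" mentioned in the abstract; the thesis may use a different one.
    """
    total = sum(counts)
    freqs = [c / total for c in counts if c > 0]
    if alpha == 1:
        # Limit alpha -> 1 is the Shannon entropy.
        return -sum(p * math.log(p) for p in freqs)
    if math.isinf(alpha):
        # Limit alpha -> infinity depends only on the largest clone
        # (related to the Berger-Parker index).
        return -math.log(max(freqs))
    return math.log(sum(p ** alpha for p in freqs)) / (1 - alpha)

def diversity_profile(counts, alphas=(0, 1, 2, math.inf)):
    """Entropy at several orders: 0 gives log richness, 1 gives Shannon
    entropy, 2 relates to the Simpson index, infinity to clonal dominance."""
    return [renyi_entropy(counts, a) for a in alphas]

# Hypothetical toy repertoires, each with four clones and 40 sequences.
even_repertoire = [10, 10, 10, 10]     # no clonal expansion
expanded_repertoire = [37, 1, 1, 1]    # one dominant, expanded clone

print(diversity_profile(even_repertoire))
print(diversity_profile(expanded_repertoire))
```

For the perfectly even repertoire every order gives log(4), whereas clonal expansion lowers the entropy at every order above 0. Profiles of this kind could then be passed as feature vectors to a clustering or classification algorithm, which is one plausible reading of the partitioning step described above.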
