Investigating the Transcriptome Signature of Depression: Employing Co-expression Network, Candidate Pathways and Machine Learning Approaches

Abstract

Depression is the leading cause of disability worldwide and is one of the major contributors to the overall global burden of disease. Despite significant advances in elucidating the neurobiology of depression in recent years, the molecular factors involved in the pathophysiology of depression remain poorly understood.

Chapter 1: An overview of Major Depressive Disorder (MDD) from epidemiological and clinical perspectives is provided, with a summary of the current knowledge of the underlying biology. A review of the major pathophysiological hypotheses of MDD highlights the need for a more comprehensive approach that allows the complex molecular interactions involved in depression to be studied.

Chapter 2: The transcriptome signature of depression was examined using replication at the individual gene level across different tissues and cell types in both the brain and the periphery. Fifty-seven replicated genes were reported as differentially expressed in the brain and 21 in peripheral tissues. In silico functional characterisation of these genes was provided, implicating shared pathways in a comorbid phenotype of depression and cardiovascular disease.

Chapter 3: The molecular basis of MDD was investigated using co-expression network analysis. Weighted Gene Co-expression Network Analysis (WGCNA) allowed the study of complex interactions between individual genes influencing biological pathways in MDD. With the Sydney Memory and Aging Study (sMAS) and the Older Australian Twin Study (OATS) as discovery and replication cohorts respectively, it was found that the eigengenes of four clusters containing over 3,000 highly co-regulated genes were associated with recurrent MDD, and that these genes are involved in 13 immune- and pathogen-related pathways. However, the findings were not replicated in an independent cohort at the network level.

Chapter 4: Using a machine learning (ML) approach, a predictive model was built to identify genome-wide gene expression markers of recurrent MDD. Fuzzy Forests (FF) is a novel ML algorithm that works in conjunction with WGCNA and was designed to reduce the bias in feature selection caused by the presence of correlated transcripts in transcriptome data. FF correctly classified 63% of recurrently depressed individuals in test data using the single top predictive feature (TFRC, which encodes the transferrin receptor). This suggests that TFRC may represent a putative marker for recurrent MDD.

Chapter 5: Following the finding that immune-related pathways are associated with recurrent MDD in the elderly (Chapter 3), the role of these pathways in recurrent MDD was examined at the individual gene level in an independent cohort (OATS). To target the immune pathways, all known genes annotated to these 13 pathways in KEGG were selected, and a differential expression analysis was conducted on the 1,302 candidates between individuals with recurrent MDD and those without. We found that CD14 was significantly downregulated in recurrent MDD (FDR < 5%). Considering the key role of CD14 in facilitating the innate immune response, we suggest that CD14 can potentially serve as a peripheral marker of immune dysregulation in recurrent MDD.

Chapter 6: A discussion of the findings is provided and future directions are outlined, with a particular focus on how co-expression network and machine learning approaches can enhance the clinical translation of molecular findings.

Thesis (Ph.D.) -- University of Adelaide, Adelaide Medical School, 201
