59 research outputs found

    Network-based analysis of gene expression data

    The methods of molecular biology for the quantitative measurement of gene expression have developed rapidly in the past two decades. High-throughput assays based on microarray and RNA-seq technology now enable whole-genome studies in which several thousands of genes can be measured at a time. However, this has also imposed serious challenges on data storage and analysis, which are the subject of the young but rapidly developing field of computational biology. Explaining observations made on such a large scale requires suitable and accordingly scaled models of gene regulation. Detailed models, as available for single genes, need to be extended and assembled into larger networks of regulatory interactions between genes and gene products. Incorporating such networks into methods for data analysis is crucial for identifying the molecular mechanisms that drive the observed expression. As methods for this purpose emerge in parallel to each other and without a known standard of truth, results need to be critically checked in a competitive setup and in the context of the rich available literature corpus. This work is centered on and contributes to the following subjects, each of which represents an important and distinct research topic in the field of computational biology: (i) construction of realistic gene regulatory network models; (ii) detection of subnetworks that are significantly altered in the data under investigation; and (iii) systematic biological interpretation of detected subnetworks. For the construction of regulatory networks, I review existing methods with a focus on curation and inference approaches. I first describe how literature curation can be used to construct a regulatory network for a specific process, using the well-studied diauxic shift in yeast as an example. In particular, I address the question of how a detailed understanding, as available for the regulation of single genes, can be scaled up to the level of larger systems. 
I subsequently inspect methods for large-scale network inference, showing that they are significantly skewed towards master regulators. A recalibration strategy is introduced and applied, yielding an improved genome-wide regulatory network for yeast. To detect significantly altered subnetworks, I introduce GGEA as a method for network-based enrichment analysis. The key idea is to score regulatory interactions within functional gene sets for consistency with the observed expression. Compared to other recently published methods, GGEA yields results that consistently and coherently align expression changes with known regulation types and that are thus easier to explain. I also suggest and discuss several significant enhancements to the original method that improve its applicability, outcome and runtime. For the systematic detection and interpretation of subnetworks, I have developed the EnrichmentBrowser software package. It implements several state-of-the-art methods besides GGEA, and allows users to combine and explore results across methods. As part of the Bioconductor repository, the package provides unified access to the different methods and thus greatly simplifies their use for biologists. Extensions to this framework that support the automation of biological interpretation routines are also presented. In conclusion, this work contributes substantially to the research field of network-based analysis of gene expression data with respect to regulatory network construction, subnetwork detection, and their biological interpretation. This also includes recent developments as well as areas of ongoing research, which are discussed in the context of current and future questions arising from the new generation of genomic data.
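
The key idea stated in the abstract above, scoring regulatory interactions inside a gene set for consistency with observed expression, can be illustrated with a deliberately simplified sketch. This is not the GGEA implementation: the sign-based scoring rule, the function names and the toy data are all assumptions made for illustration only.

```python
from typing import Dict, List, Set, Tuple

# An edge is (regulator, target, effect); effect is +1 for activation and
# -1 for inhibition. This encoding is an assumption of the sketch.
Edge = Tuple[str, str, int]


def sign(x: float) -> int:
    """Return +1, -1 or 0 depending on the sign of x."""
    return (x > 0) - (x < 0)


def edge_consistency(edge: Edge, fold_change: Dict[str, float]) -> int:
    """+1 if the observed changes agree with the edge type, -1 if they
    contradict it, 0 if regulator or target is unchanged."""
    regulator, target, effect = edge
    return sign(fold_change[regulator]) * effect * sign(fold_change[target])


def gene_set_score(gene_set: Set[str], edges: List[Edge],
                   fold_change: Dict[str, float]) -> float:
    """Mean consistency over edges whose two genes lie in the gene set
    and have a measured fold change."""
    inside = [
        e for e in edges
        if e[0] in gene_set and e[1] in gene_set
        and e[0] in fold_change and e[1] in fold_change
    ]
    if not inside:
        return 0.0
    return sum(edge_consistency(e, fold_change) for e in inside) / len(inside)


# Toy example: hypothetical genes A, B, C with log fold changes.
toy_edges = [("A", "B", 1), ("A", "C", -1), ("B", "C", 1)]
toy_fc = {"A": 2.0, "B": 1.5, "C": -1.0}
toy_score = gene_set_score({"A", "B", "C"}, toy_edges, toy_fc)
```

In the toy data two of the three edges agree with the measured changes and one contradicts them, so the score is positive but well below 1. A real analysis would also need a significance estimate, for example by permutation, which this sketch omits.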

    Identifying a causal link between prolactin signaling pathways and COVID-19 vaccine-induced menstrual changes

    COVID-19 vaccines have been instrumental tools in the fight against SARS-CoV-2, helping to reduce disease severity and mortality. At the same time, just like any other therapeutic, COVID-19 vaccines were associated with adverse events. Women have reported menstrual cycle irregularity after receiving COVID-19 vaccines, and this led to renewed fears concerning COVID-19 vaccines and their effects on fertility. Herein we devised an informatics workflow to explore the causal drivers of menstrual cycle irregularity in response to vaccination with the mRNA COVID-19 vaccine BNT162b2. Our methods relied on gene expression analysis in response to vaccination, followed by network biology analysis to derive testable hypotheses regarding the causal links between BNT162b2 and menstrual cycle irregularity. Five high-confidence transcription factors were identified as causal drivers of BNT162b2-induced menstrual irregularity, namely IRF1, STAT1, RelA (p65 NF-κB subunit), STAT2 and IRF3. Furthermore, some biomarkers of menstrual irregularity, including TNF, IL6R, IL6ST, LIF, BIRC3, FGF2, ARHGDIB, RPS3, RHOU and MIF, were identified as topological genes and predicted as causal drivers of menstrual irregularity. Our network-based mechanism reconstruction results indicated that BNT162b2 exerted biological effects similar to those resulting from prolactin signaling. However, these effects were short-lived and did not raise concerns about long-term infertility issues. This approach can be applied to interrogate the functional links between drugs/vaccines and other side effects.

    DC-ATLAS: a systems biology resource to dissect receptor specific signal transduction in dendritic cells

    BACKGROUND: The advent of Systems Biology has been accompanied by the blooming of pathway databases. Currently pathways are defined generically with respect to the organ or cell type where a reaction takes place. The cell type specificity of the reactions is the foundation of immunological research, and capturing this specificity is of paramount importance when using pathway-based analyses to decipher complex immunological datasets. Here, we present DC-ATLAS, a novel and versatile resource for the interpretation of high-throughput data generated by perturbing the signaling network of dendritic cells (DCs). RESULTS: Pathways are annotated using a novel data model, the Biological Connection Markup Language (BCML), an SBGN-compliant data format developed to store the large amount of information collected. The application of DC-ATLAS to pathway-based analysis of the transcriptional program of DCs stimulated with agonists of the toll-like receptor family allows an integrated description of the flow of information from the cellular sensors to the functional outcome, capturing the temporal series of activation events by grouping sets of reactions that occur at different time points into well-defined functional modules. CONCLUSIONS: The initiative significantly improves our understanding of DC biology and regulatory networks. Developing a systems biology approach for the immune system holds the promise of translating knowledge of the immune system into more successful immunotherapy strategies.

    Understanding pathways

    The challenge with today's microarray experiments is to infer biological conclusions from them. There are two crucial difficulties to be surmounted in this challenge: (1) the lack of a suitable biological repository that can be easily integrated into computational algorithms; (2) contemporary algorithms used to analyze microarray data are unable to draw consistent biological results from diverse datasets of the same disease. To deal with the first difficulty, we believe a core database that unifies available biological repositories is important. Towards this end, we create a unified biological database from three popular biological repositories (KEGG, Ingenuity and Wikipathways). This database gives computer scientists the flexibility of easily integrating biological information using simple API calls or SQL queries. To deal with the second difficulty of deriving consistent biological results from the experiments, we first conceptualize the notion of “subnetworks”, which refers to a connected portion of a biological pathway. Then we propose a method that identifies subnetworks that are consistently expressed by patients of the same disease phenotype. We test our technique on independent datasets of several diseases, including ALL, DMD and lung cancer. For each of these diseases, we obtain two independent microarray datasets produced by distinct labs on distinct platforms. In each case, our technique consistently produces overlapping lists of significant nontrivial subnetworks from two independent sets of microarray data. The gene-level agreement of these significant subnetworks is between 66.67% and 91.87%. In contrast, when the same pairs of microarray datasets were analysed using GSEA and t-test, this percentage fell between 37% and 55.75% (GSEA) and between 2.55% and 19.23% (t-test). Furthermore, the genes selected using GSEA and t-test do not form subnetworks of substantial size. 
Thus it is more probable that the subnetworks selected by our technique can provide the researcher with more descriptive information on the portions of the pathway that are actually associated with the disease. Keywords: pathway analysis, microarra
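
The abstract above reports "gene-level agreement" percentages without defining the measure. As a rough illustration only, the sketch below assumes one plausible definition: the share of genes in the significant subnetworks of one dataset that also occur in the significant subnetworks of the other. The function name and the toy gene identifiers are hypothetical.

```python
from typing import Iterable, Set


def gene_level_agreement(subnets_a: Iterable[Set[str]],
                         subnets_b: Iterable[Set[str]]) -> float:
    """Percentage of genes from dataset A's significant subnetworks that
    are also found in dataset B's significant subnetworks.

    Note: this definition is an assumption of the sketch and is
    asymmetric in A and B.
    """
    genes_a: Set[str] = set().union(*subnets_a)
    genes_b: Set[str] = set().union(*subnets_b)
    if not genes_a:
        return 0.0
    return 100.0 * len(genes_a & genes_b) / len(genes_a)


# Toy example with hypothetical gene identifiers.
lab_one = [{"G1", "G2", "G3"}, {"G4"}]
lab_two = [{"G1", "G2"}, {"G4", "G5"}]
agreement = gene_level_agreement(lab_one, lab_two)
```

Here three of the four genes found in the first dataset's subnetworks recur in the second, giving 75%. A symmetric variant, such as the Jaccard index, would be an equally reasonable choice under a different reading of the abstract.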

    Investigating the Transcriptome Signature of Depression: Employing Co-expression Network, Candidate Pathways and Machine Learning Approaches

    Depression is the leading cause of disability worldwide and is one of the major contributors to the overall global burden of disease. Despite significant advances in elucidating the neurobiology of depression in recent years, the molecular factors involved in the pathophysiology of depression remain poorly understood. Chapter 1: An overview of Major Depressive Disorder (MDD) from epidemiological and clinical perspectives with a summary of the current knowledge of the underlying biology is provided. A review of the major pathophysiological hypotheses of MDD highlights a need for a more comprehensive approach that allows studying complex molecular interactions involved in depression. Chapter 2: The transcriptome signature of depression was examined using the measure of replication at the individual gene level across different tissues and cell types in both brain and periphery. Fifty-seven replicated genes were reported as differentially expressed in the brain and 21 in peripheral tissues. In-silico functional characterisation of these genes was provided, implicating shared pathways in a comorbid phenotype of depression and cardiovascular disease. Chapter 3: The molecular basis of MDD was investigated using co-expression network analysis. Weighted Gene Co-expression Network Analysis (WGCNA) allowed for studying complex interactions between individual genes influencing biological pathways in MDD. Utilising the Sydney Memory and Aging Study (sMAS) and the Older Australian Twin Study (OATS) as discovery and replication cohorts respectively, it was found that the eigengenes of four clusters containing over 3,000 highly co-regulated genes are involved in 13 immune- and pathogen-related pathways and associated with recurrent MDD. However, the findings were not replicated in an independent cohort at the network level. Chapter 4: Using a machine learning (ML) approach, a predictive model was built to identify the genome-wide gene expression markers of recurrent MDD. 
Fuzzy Forests (FF) is a novel ML algorithm that works in conjunction with WGCNA and was designed to reduce the bias in feature selection caused by the presence of correlated transcripts in transcriptome data. FF correctly classified 63% of recurrently depressed individuals in test data using the single top predictive feature (TFRC, which encodes the transferrin receptor). This suggests that TFRC can represent a putative marker for recurrent MDD. Chapter 5: Following the findings on immune-related pathways being associated with recurrent MDD in the elderly (Chapter 3), the role of these pathways in recurrent MDD was examined at the individual gene level in an independent cohort (OATS). To target the immune pathways, all known genes (KEGG) involved in these 13 pathways were selected and a differential expression analysis was conducted on 1,302 candidates between individuals with recurrent MDD and those without. We found that CD14 was significantly downregulated in recurrent MDD (FDR < 5%). Considering the key role of CD14 in facilitating the innate immune response, we suggest that CD14 can potentially serve as a peripheral marker of immune dysregulation in recurrent MDD. Chapter 6: A discussion of the findings is provided and future directions are outlined, with a particular focus on how co-expression network and machine learning approaches can enhance the translation of molecular findings into clinical practice. Thesis (Ph.D.) -- University of Adelaide, Adelaide Medical School, 201
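
The abstract above reports significance at FDR < 5% for a candidate-gene differential expression analysis but does not name the correction procedure. The sketch below shows the widely used Benjamini-Hochberg adjustment as one possibility; that this procedure was the one used in the thesis is an assumption, and the p-values are made up for illustration.

```python
from typing import List


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up p-values for four hypothetical candidate genes.
raw_p = [0.01, 0.04, 0.03, 0.20]
adj_p = benjamini_hochberg(raw_p)
significant = [i for i, q in enumerate(adj_p) if q < 0.05]
```

With these toy values only the first gene stays below the 5% threshold after adjustment, even though three raw p-values are below 0.05. The same effect becomes much stronger when roughly 1,300 candidates are tested at once.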

    Multi-omics characterization of pancreatic neuroendocrine neoplasms

    Pancreatic neuroendocrine neoplasms (PNENs) are biologically and clinically heterogeneous neoplasms in which pathogenic alterations are often indiscernible. Treatments for PNENs are insufficient in part due to lack of alternatives once current options are exhausted. Despite previous efforts to characterize PNENs at the molecular level, there remains a lack of molecular subgroups and molecular features with clinical utility for PNENs. In this work, I describe the identification and characterization of four molecularly distinct subgroups from primary PNEN specimens using whole-exome sequencing, RNA-sequencing and global proteome profiling. A Proliferative subgroup with molecular features of proliferating cells was associated with an inferior overall survival probability. A PDX1-high subgroup consisted of PNENs demonstrating genetic and transcriptomic indications of NRAS or HRAS activation. An Alpha cell-like subgroup, enriched in PNENs with deleterious MEN1 and DAXX mutations, bore transcriptomic similarity to pancreatic α-cells and harbored proteomic cues of dysregulated metabolism involving glutamine and arginine. Lastly, a Stromal/Mesenchymal subgroup exhibited increased expression and activation of the Hippo signaling pathway effectors YAP1 and WWTR1 that are of emerging interest as potentially actionable targets in other cancer types. Whole-genome and whole-transcriptome analysis of PNEN metastases identified novel molecular events likely contributing to pathogenesis, including one case presumably driven by MYCN amplification. In agreement with the findings in primary PNENs, four of the metastatic PNENs displayed a substantial Alpha cell-like subgroup signature and all harboured concurrent mutations in MEN1 and DAXX. Collectively, the identified subgroups present a potential stratification scheme that facilitates the identification of therapeutic vulnerabilities amidst PNEN heterogeneity to improve the effective management of PNENs

    Linking drug target and pathway activation for effective therapy using multi-task learning

    Despite the abundance of large-scale molecular and drug-response data, the insights gained about the mechanisms underlying treatment efficacy in cancer have in general been limited. Machine learning algorithms applied to those datasets are most often used to provide predictions without interpretation, or reveal single drug-gene associations and fail to derive robust insights. We propose to use Macau, a Bayesian multitask multi-relational algorithm, to generalize from individual drugs and genes and explore the interactions between the drug targets and the activation of signaling pathways. A typical insight would be: "Activation of pathway Y will confer sensitivity to any drug targeting protein X". We applied our methodology to the Genomics of Drug Sensitivity in Cancer (GDSC) screening, using gene expression of 990 cancer cell lines, activity scores of 11 signaling pathways derived from the tool PROGENy as cell line input and 228 nominal targets for 265 drugs as drug input. These interactions can guide a tissue-specific combination treatment strategy, for example suggesting to modulate a certain pathway to maximize the drug response for a given tissue. We confirmed in the literature drug combination strategies derived from our results for brain, skin and stomach tissues. Such an analysis of interactions across tissues might help target discovery, drug repurposing and patient stratification strategies. Medicinal Chemistr

    Leishmania mexicana induced perturbations of macrophage metabolism

    The interaction between the Leishmania parasite and the macrophage is bidirectional, and its outcome is important in determining whether disease progresses or regresses. The parasite is able to modulate the host cell at epigenetic, transcript and metabolic levels. In the context of the latter, immune metabolism is a rapidly growing area of research, and its importance in the context of normal immune function and pathology is increasingly being recognised. In this thesis a robust, untargeted metabolomics protocol has been developed in order to profile a classical in vitro model of immune metabolism, the inflammatory M1 macrophage. While previous studies use single or multiple M1 stimuli without dissecting their importance, a combinatorial approach is used here to dissect the contribution and interaction of two key M1 stimuli, interferon γ (IFNγ) and LPS. A clear stimulus-specific response is evident in our data. We next used this untargeted metabolomics protocol in parallel with RNAseq to examine the cost of hosting a parasite to the macrophage's metabolic and transcriptional profile. By using a heat-killed control it was possible to differentiate between general immune responses and responses specific to the live parasite. Additionally, a FACS protocol coupled to untargeted metabolomics was used in order to focus on the infected cell. Furthermore, the inclusion of the above-mentioned M1 control revealed that neither live nor heat-killed Leishmania elicited as strong a response. Finally, stable isotope labelled metabolomics was used to validate key findings. In summary, our untargeted metabolomics protocol has revealed immune-metabolic perturbations that are induced by IFNγ and LPS or their interaction. This information should be considered if targeting these pathways in a therapeutic context. 
Furthermore, by using an integrated metabolic-transcriptomic profiling approach, perturbations in glycerol-phospholipid metabolism, central carbon metabolism and arginine metabolism were found. Using stable isotope labelled metabolomics (U13C-Arginine), the current study has given unprecedented insight into how the parasite utilises this crucial amino acid, as well as confirming novel pathways.