2,485 research outputs found

    A novel algorithm for detecting differentially regulated paths based on gene set enrichment analysis

    Motivation: Deregulated signaling cascades are known to play a crucial role in many pathogenic processes, among them tumor initiation and progression. In the recent past, modern experimental techniques that measure the amount of mRNA transcripts of almost all known human genes in a tissue, or even in a single cell, have opened new avenues for studying the activity of signaling cascades and for understanding the information flow in the networks

    Identifying Causal Genes and Dysregulated Pathways in Complex Diseases

    In complex diseases, various combinations of genomic perturbations often lead to the same phenotype. On a molecular level, combinations of genomic perturbations are assumed to dys-regulate the same cellular pathways. Such a pathway-centric perspective is fundamental to understanding the mechanisms of complex diseases and the identification of potential drug targets. In order to provide an integrated perspective on complex disease mechanisms, we developed a novel computational method to simultaneously identify causal genes and dys-regulated pathways. First, we identified a representative set of genes that are differentially expressed in cancer compared to non-tumor control cases. Assuming that disease-associated gene expression changes are caused by genomic alterations, we determined potential paths from such genomic causes to target genes through a network of molecular interactions. Applying our method to sets of genomic alterations and gene expression profiles of 158 Glioblastoma multiforme (GBM) patients we uncovered candidate causal genes and causal paths that are potentially responsible for the altered expression of disease genes. We discovered a set of putative causal genes that potentially play a role in the disease. Combining an expression Quantitative Trait Loci (eQTL) analysis with pathway information, our approach allowed us not only to identify potential causal genes but also to find intermediate nodes and pathways mediating the information flow between causal and target genes. Our results indicate that different genomic perturbations indeed dys-regulate the same functional pathways, supporting a pathway-centric perspective of cancer. While copy number alterations and gene expression data of glioblastoma patients provided opportunities to test our approach, our method can be applied to any disease system where genetic variations play a fundamental causal role
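The idea of tracing paths from genomic causes to target genes through an interaction network can be illustrated with a minimal sketch. It uses plain breadth-first search on a toy undirected network. This is an assumption made for illustration only: the abstract does not say how paths are scored or selected, and every gene name below is a hypothetical placeholder.

```python
from collections import deque


def shortest_path(network, source, target):
    """Return one shortest path from source to target, or None if unreachable.

    `network` maps each node to a list of its neighbours (undirected graph).
    """
    if source == target:
        return [source]
    parents = {source: None}  # also serves as the visited set
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in network.get(node, ()):
            if neighbour in parents:
                continue
            parents[neighbour] = node
            if neighbour == target:
                # Walk back through the parent links to rebuild the path.
                path = [target]
                while parents[path[-1]] is not None:
                    path.append(parents[path[-1]])
                return path[::-1]
            queue.append(neighbour)
    return None


# Hypothetical toy network; names are placeholders, not genes from the study.
toy_network = {
    "causal_A": ["mediator_1"],
    "causal_B": ["mediator_1", "mediator_2"],
    "mediator_1": ["causal_A", "causal_B", "target_X"],
    "mediator_2": ["causal_B"],
    "target_X": ["mediator_1"],
}

path_a = shortest_path(toy_network, "causal_A", "target_X")
path_b = shortest_path(toy_network, "causal_B", "target_X")
```

In this toy example both candidate causal nodes reach the target through the same intermediate node, which mirrors the abstract's claim that different perturbations can converge on the same pathway.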

    MicroRNA co-expression networks exhibit increased complexity in pancreatic ductal compared to Vater’s papilla adenocarcinoma

    miRNA expression abnormalities in adenocarcinoma arising from the pancreatic ductal system (PDAC) and Vater’s papilla (PVAC) could be associated with distinctive pathologic features and clinical cancer behaviours. Our previous miRNA expression profiling data on PDAC (n=9) and PVAC (n=4) were re-evaluated to define differences/similarities in miRNA expression patterns. Afterwards, in order to uncover target genes and core signalling pathways regulated by specific miRNAs in these two tumour entities, miRNA interaction networks were wired for each tumour entity, and experimentally validated target genes underwent pathway enrichment analysis. One hundred and one miRNAs were altered, mainly over-expressed, in PDAC samples. Twenty-six miRNAs were deregulated in PVAC samples, where more miRNAs were down-expressed in tumours compared to normal tissues. Four miRNAs were significantly altered in both subgroups of patients, while 27 miRNAs were differentially expressed between PDAC and PVAC. Although miRNA interaction networks were more complex and dense in PDAC than in PVAC, pathway enrichment analysis uncovered a functional overlap between PDAC and PVAC. However, shared signalling events were influenced by different miRNAs and/or genes in the two tumour entities. Overall, specific miRNA expression patterns were involved in the regulation of a limited set of core signalling pathways in the biological landscape of PDAC and PVAC

    Transcriptomic signatures of neuronal differentiation and their association with risk genes for autism spectrum and related neuropsychiatric disorders.

    Genes for autism spectrum disorders (ASDs) are also implicated in fragile X syndrome (FXS), intellectual disabilities (ID) or schizophrenia (SCZ), and converge on neuronal function and differentiation. The SH-SY5Y neuroblastoma cell line, the most widely used system to study neurodevelopment, is currently discussed for its applicability to model cortical development. We implemented an optimal neuronal differentiation protocol of this system and evaluated neurodevelopment at the transcriptomic level using the CoNTeXT framework, a machine-learning algorithm based on human post-mortem brain data estimating developmental stage and regional identity of transcriptomic signatures. Our improved model, in contrast to currently used SH-SY5Y models, does capture early neurodevelopmental processes with high fidelity. We applied regression modelling, dynamic time warping analysis, parallel independent component analysis and weighted gene co-expression network analysis to identify activated gene sets and networks. Finally, we tested and compared these sets for enrichment of risk genes for neuropsychiatric disorders. We confirm a significant overlap of genes implicated in ASD with FXS, ID and SCZ. However, counterintuitive to this observation, we report that risk genes affect pathways specific for each disorder during early neurodevelopment. Genes implicated in ASD, ID, FXS and SCZ were enriched among the positive regulators, but only ID-implicated genes were also negative regulators of neuronal differentiation. ASD and ID genes were involved in dendritic branching modules, but only ASD risk genes were implicated in histone modification or axonal guidance. Only ID genes were over-represented among cell cycle modules. We conclude that the underlying signatures are disorder-specific and that the shared genetic architecture results in overlaps across disorders such as ID in ASD. Thus, adding developmental network context to genetic analyses will aid differentiating the pathophysiology of neuropsychiatric disorders
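Testing a gene set for enrichment of risk genes is often done with a one-sided hypergeometric test. The abstract does not state which test was used, so the sketch below is an assumption showing one common choice; the function name and the example counts are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(population, successes, draws, observed):
    """P(X >= observed) for a hypergeometric variable X.

    population: number of background genes
    successes:  number of risk genes in the background
    draws:      size of the gene set (module) being tested
    observed:   number of risk genes found in that gene set
    """
    total = comb(population, draws)
    upper = min(successes, draws)
    tail = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(observed, upper + 1)
    )
    return tail / total


# Hypothetical counts: 10 background genes, 5 risk genes, a module of 5 genes
# that consists entirely of risk genes.
p_value = hypergeom_upper_tail(population=10, successes=5, draws=5, observed=5)
```

A small p-value suggests the module contains more risk genes than random sampling from the background would produce. When several modules and disorders are tested, a multiple-testing correction would normally follow.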

    Differential analysis of biological networks

    In cancer research, the comparison of gene expression or DNA methylation networks inferred from healthy controls and patients can lead to the discovery of biological pathways associated with the disease. As a cancer progresses, its signalling and control networks are subject to some degree of localised re-wiring. Being able to detect disrupted interaction patterns induced by the presence or progression of the disease can lead to the discovery of novel molecular diagnostic and prognostic signatures. Currently there is a lack of scalable statistical procedures for two-network comparisons aimed at detecting localised topological differences. We propose the dGHD algorithm, a methodology for detecting differential interaction patterns in two-network comparisons. The algorithm relies on a statistic, the Generalised Hamming Distance (GHD), for assessing the degree of topological difference between networks and evaluating its statistical significance. dGHD builds on a non-parametric permutation testing framework but achieves computational efficiency through an asymptotic normal approximation. We show that the GHD is able to detect more subtle topological differences compared to a standard Hamming distance between networks. This results in the dGHD algorithm achieving high performance in simulation studies as measured by sensitivity and specificity. An application to the problem of detecting differential DNA co-methylation subnetworks associated with ovarian cancer demonstrates the potential benefits of the proposed methodology for discovering network-derived biomarkers associated with a trait of interest
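As a point of reference, the standard Hamming distance that the abstract uses as a baseline, together with a brute-force permutation test, can be sketched as follows. This is not the GHD statistic or the dGHD algorithm, whose definitions are not given in the abstract; the function names, the node-relabelling null model and the lower-tail p-value are assumptions made for illustration.

```python
import random


def hamming_distance(a, b):
    """Fraction of ordered node pairs (i != j) whose edge status differs."""
    n = len(a)
    differing = sum(
        1
        for i in range(n)
        for j in range(n)
        if i != j and bool(a[i][j]) != bool(b[i][j])
    )
    return differing / (n * (n - 1))


def relabel(b, perm):
    """Apply the same node permutation to the rows and columns of b."""
    n = len(b)
    return [[b[perm[i]][perm[j]] for j in range(n)] for i in range(n)]


def permutation_p_value(a, b, n_perm=1000, seed=0):
    """Lower-tail p-value: are a and b closer than under random relabelling of b?"""
    rng = random.Random(seed)
    observed = hamming_distance(a, b)
    n = len(b)
    hits = 0
    for _ in range(n_perm):
        perm = list(range(n))
        rng.shuffle(perm)
        if hamming_distance(a, relabel(b, perm)) <= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return observed, (hits + 1) / (n_perm + 1)


# Two small undirected graphs given as symmetric 0/1 adjacency matrices.
graph_a = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
graph_b = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
distance, p_value = permutation_p_value(graph_a, graph_b, n_perm=200)
```

The permutation loop costs one full distance computation per relabelling, which is the expense the abstract says dGHD avoids by using an asymptotic normal approximation.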

    Bioinformatic Reconstruction of Gene Regulatory Networks Controlling EMT and Mesoderm Formation

    Embryonic development is a complex multi-stage process, which at the gene expression level requires precise control by gene regulatory networks (GRNs). At each stage of pattern formation and organogenesis, during the transition of precursor cells to their descendants, various sets of signaling molecules and transcription factors (TFs) activate or repress their target genes to determine distinct cell fates. Misregulation of developmental pathways may cause severe diseases or lethality, while their ectopic activation in the adult organism often results in oncogenic transformation. It is therefore of great importance to decode the transcription factors and understand how they interact and form GRNs controlling developmental processes. Mesoderm formation is vital for embryo development. It occurs during gastrulation and depends on the process of epithelial-mesenchymal transition (EMT). In vertebrates, mesoderm gives rise to various tissues, such as axial skeleton, skeletal muscle, heart, kidney, smooth muscles, blood vessels and blood. A plethora of studies has been focused on characterizing the genes that regulate the development of mesoderm. Signaling pathways including WNT, BMP and FGF, along with transcription factors such as Smads, Eomes and T have been reported to play fundamental roles in this process. However, the comprehensive mechanistic characterization of the mesodermal GRNs is still lacking. This study aims at constructing a global gene regulatory network, which describes transcriptional regulatory events occurring dynamically during the course of mesoderm formation in the mouse. We demonstrated that in vitro mesodermal differentiation of mouse embryonic stem cells mimics mesoderm formation in vivo, and therefore chose it as a model system. Firstly, by combining ChIP-seq and RNA-seq techniques, I reconstructed GRNs mediated by the essential mesodermal TFs Smads, Eomes and T. 
Next, to build a global dynamic GRN orchestrating EMT and mesoderm formation, time-series gene expression and TF-target datasets were integrated. The latter was obtained by an original method of discovering functionally active TFs from ATAC-seq data, followed by their association with putative target genes. Combining this method with a bioinformatics tool based on a hidden Markov model allowed me to identify groups of co-expressed genes from time-series transcriptome data and predict TFs that regulate their expression. The predictive power of this approach was validated by comparing its output with the Smads, Eomes and T datasets, demonstrating that it correctly assigned these TFs to their targets. Using this unbiased approach, novel candidate mesodermal TFs and target genes of previously known TFs were identified. This study expands our understanding of genetic regulation mechanisms underlying EMT and mesoderm formation in the mouse and provides a list of novel potential mesoderm regulators for future in-depth characterization. This bioinformatics approach is thus promising for future studies designed to characterize the molecular mechanisms underlying specific developmental processes.

    Discovering cancer-associated transcripts by RNA sequencing

    High-throughput sequencing of poly-adenylated RNA (RNA-Seq) in human cancers shows remarkable potential to identify uncharacterized aspects of tumor biology, including gene fusions with therapeutic significance and disease markers such as long non-coding RNA (lncRNA) species. However, the analysis of RNA-Seq data places unprecedented demands upon computational infrastructures and algorithms, requiring novel bioinformatics approaches. To meet these demands, we present two new open-source software packages - ChimeraScan and AssemblyLine - designed to detect gene fusion events and novel lncRNAs, respectively. RNA-Seq studies utilizing ChimeraScan led to discoveries of new families of recurrent gene fusions in breast cancers and solitary fibrous tumors. Further, ChimeraScan was one of the key components of the repertoire of computational tools utilized in data analysis for MI-ONCOSEQ, a clinical sequencing initiative to identify potentially informative and actionable mutations in cancer patients’ tumors. AssemblyLine, by contrast, reassembles RNA sequencing data into full-length transcripts ab initio. In head-to-head analyses AssemblyLine compared favorably to existing ab initio approaches and unveiled abundant novel lncRNAs, including antisense and intronic lncRNAs disregarded by previous studies. Moreover, we used AssemblyLine to define the prostate cancer transcriptome from a large patient cohort and discovered myriad lncRNAs, including 121 prostate cancer-associated transcripts (PCATs) that could potentially serve as novel disease markers. Functional studies of two PCATs - PCAT-1 and SChLAP1 - revealed cancer-promoting roles for these lncRNAs. PCAT1, a lncRNA expressed from chromosome 8q24, promotes cell proliferation and represses the tumor suppressor BRCA2. SChLAP1, located in a chromosome 2q31 ‘gene desert’, independently predicts poor patient outcomes, including metastasis and cancer-specific mortality. 
Mechanistically, SChLAP1 antagonizes the genome-wide localization and regulatory functions of the SWI/SNF chromatin-modifying complex. Collectively, this work demonstrates the utility of ChimeraScan and AssemblyLine as open-source bioinformatics tools. Our applications of ChimeraScan and AssemblyLine led to the discovery of new classes of recurrent and clinically informative gene fusions, and established a prominent role for lncRNAs in coordinating aggressive prostate cancer, respectively. We expect that the methods and findings described herein will establish a precedent for RNA-Seq-based studies in cancer biology and assist the research community at large in making similar discoveries.
    PhD, Bioinformatics, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/120814/1/mkiyer_1.pd

    Identification of a gene signature for discriminating metastatic from primary melanoma using a molecular interaction network approach

    Understanding the biological factors that are characteristic of metastasis in melanoma remains a key approach to improving treatment. In this study, we seek to identify a gene signature of metastatic melanoma. We configured a new network-based computational pipeline, combined with a machine learning method, to mine publicly available transcriptomic data from melanoma patient samples. Our method is unbiased and scans a genome-wide protein-protein interaction network using a novel formulation for network scoring. Using this, we identify the most influential, differentially expressed nodes in metastatic as compared to primary melanoma. We evaluated the shortlisted genes by a machine learning method to rank them by their discriminatory capacities. From this, we identified a panel of six genes, ALDH1A1, HSP90AB1, KIT, KRT16, SPRR3 and TMEM45B, whose expression values discriminated metastatic from primary melanoma (87% classification accuracy). In an independent transcriptomic data set derived from 703 primary melanomas, we showed that all six genes were significant in predicting melanoma-specific survival (MSS) in a univariate analysis, which was also consistent with AJCC staging. Further, three of these genes, HSP90AB1, SPRR3 and KRT16, remained significant predictors of MSS in a joint analysis (HR = 2.3, P = 0.03), although HSP90AB1 (HR = 1.9, P = 2 × 10⁻⁴) alone remained predictive after adjusting for clinical predictors