389 research outputs found

    Falsifiable Network Models. A Network-based Approach to Predict Treatment Efficacy in Ulcerative Colitis

    This work focuses on understanding treatment efficacy in patients with ulcerative colitis (UC) using a network-based approach. UC is one of the two forms of inflammatory bowel disease (IBD), along with Crohn’s disease. It is a debilitating condition characterized by chronic inflammation and ulceration of the colon and rectum. UC symptoms occur gradually rather than abruptly, and their severity differs across patients. Only around 20% of all UC cases can be explained by known genetic variations, implying a more ambiguous aetiology that is not yet fully understood but is thought to involve a complex interplay between genetic and environmental factors. The available therapy for UC substantially reduces symptoms and achieves long-term remission. However, about one-third of UC patients fail to respond to anti-TNFα therapy and consequently develop long-term side effects due to medication. Non-response to existing antibody-based therapies in subgroups of UC patients is a major challenge and incurs a healthcare burden. Disease markers that predict therapy response are therefore needed to assist individualized therapy decisions. To date, no quantitative computational framework is available to predict treatment response in UC. We developed a quantitative framework that uses gene expression data and existing biological background information on signalling pathways to quantify network connectivity from receptors to transcription factors (TFs) that are involved in UC pathogenesis. Variations in network connectivity in UC patients can be used to identify responders and non-responders to anti-TNFα and anti-integrin treatment. Our findings allow us to summarize the effect of small gene expression changes on the overall connectivity of a signalling network and to estimate the effect this will have on individual patients' responses.
Estimating the network connectivity associated with varied drug responses may provide an understanding of individualized treatment outcomes. Our model could be used to generate testable hypotheses about how individual genes act together in networks to cause inflammation in UC as well as in other immune-inflammatory diseases such as psoriasis, asthma, and rheumatoid arthritis.
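
The abstract does not say how receptor-to-TF connectivity is computed, so the following is only an illustrative sketch of the general idea that small expression changes can alter overall network connectivity. It assumes a toy directed network with hypothetical gene names (`R1`, `A`, `TF1`, ...), a hypothetical expression threshold of 0.5 below which a gene is treated as inactive, and a score defined as the fraction of receptor/TF pairs still joined by a path of active genes. None of these choices come from the thesis.

```python
from collections import deque

# Toy directed signalling network. Gene names are hypothetical placeholders.
GRAPH = {
    "R1": ["A"],
    "R2": ["B"],
    "A": ["TF1"],
    "B": ["TF1", "TF2"],
}
RECEPTORS = ["R1", "R2"]
TFS = ["TF1", "TF2"]


def reachable(graph, start, active):
    """Breadth-first search restricted to genes in the `active` set."""
    if start not in active:
        return set()
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt in active and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def connectivity(graph, expression, receptors, tfs, threshold=0.5):
    """Fraction of (receptor, TF) pairs linked by a path of active genes.

    A gene counts as active when its expression is at least `threshold`
    (an assumed cut-off chosen for this illustration only).
    """
    active = {gene for gene, level in expression.items() if level >= threshold}
    linked = 0
    for receptor in receptors:
        reach = reachable(graph, receptor, active)
        linked += sum(1 for tf in tfs if tf in reach)
    return linked / (len(receptors) * len(tfs))


# Two made-up patients that differ only in the expression of gene B.
patient_a = {"R1": 1.0, "R2": 0.9, "A": 0.8, "B": 0.7, "TF1": 1.0, "TF2": 1.0}
patient_b = dict(patient_a, B=0.2)

score_a = connectivity(GRAPH, patient_a, RECEPTORS, TFS)
score_b = connectivity(GRAPH, patient_b, RECEPTORS, TFS)
print(score_a, score_b)  # prints 0.75 0.25
```

In this toy case a drop in a single intermediate gene removes two of the three receptor-to-TF routes, which is the kind of patient-level difference a connectivity score is meant to summarize.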

    Finding the pathology of major depression through effects on gene interaction networks

    The disease signature of major depressive disorder is distributed across multiple physical scales and investigative specialties, including genes, cells and brain regions. No single mechanism or pathway currently implicated in depression can reproduce its diverse clinical presentation, which compounds the difficulty in finding consistently disrupted molecular functions. We confront these key roadblocks to depression research - multi-scale and multi-factor pathology - by conducting parallel investigations at the levels of genes, neurons and brain regions, using transcriptome networks to identify collective patterns of dysfunction. Our findings highlight how the collusion of multi-system deficits can form a broad-based, yet variable pathology behind the depressed phenotype. For instance, in a variant of the classic lethality-centrality relationship, we show that in neuropsychiatric disorders including major depression, differentially expressed genes are pushed out to the periphery of gene networks. At the level of cellular function, we develop a molecular signature of depression based on cross-species analysis of human and mouse microarrays from depression-affected areas, and show that these genes form a tight module related to oligodendrocyte function and neuronal growth/structure. At the level of brain-region communication, we find a set of genes and hormones associated with the loss of feedback between the amygdala and anterior cingulate cortex, based on a novel assay of interregional expression synchronization termed "gene coordination". These results indicate that in the absence of a single pathology, depression may be created by dysynergistic effects among genes, cell-types and brain regions, in what we term the "floodgate" model of depression. 
Beyond our specific biological findings, these studies indicate that gene interaction networks are a coherent framework in which to understand the faint expression changes found in depression and complex neuropsychiatric disorders.

    Analysis of High-dimensional and Left-censored Data with Applications in Lipidomics and Genomics

    Recently, new kinds of high-throughput measurement techniques have emerged, enabling biological research to focus on the fundamental building blocks of living organisms such as genes, proteins, and lipids. Modern data analysis techniques have emerged in step with this new type of data, referred to as omics data. Much of this research focuses on finding biomarkers for detecting abnormalities in a person's health status and on learning unobservable network structures that represent functional associations in biological regulatory systems. Omics data have certain specific qualities, such as left-censored observations due to the limitations of the measurement instruments, missing data, non-normal observations and very large dimensionality, and the interest often lies in the connections between the large number of variables. There are two major aims in this thesis. The first is to provide efficient methodology for dealing with various types of missing or censored omics data that can be used for visualisation and biomarker discovery based on, for example, regularised regression techniques. A maximum likelihood based covariance estimation method for data with censored values is developed and the algorithms are described in detail. The second major aim is to develop novel approaches for detecting interactions displaying functional associations from large-scale observations. For more complicated data connections, a technique based on partial least squares regression is investigated. The technique is applied to network construction as well as to differential network analyses, both on multiply imputed censored data and on next-generation sequencing count data.
New measurement technologies have made it possible to build a more comprehensive understanding of the molecular-level processes of living organisms. So-called omics technologies, such as genomics, proteomics and lipidomics, can produce vast amounts of measurement data on the expression or concentration levels of individual genes, proteins and lipids with unprecedented accuracy. At the same time, the need to develop new analysis methods has grown. Particular interest has focused on identifying markers that predict the risk or prognosis of certain diseases and on reconstructing biological networks. Omics data sets have several special characteristics that limit the direct and efficient application of conventional methods. The most important of these are left-censored and missing observations and the large number of observed variables. The first aim of this dissertation is to provide tailored analysis methods for the visualisation of incomplete omics data sets and for model selection, using, for example, regularised regression models. We also describe a maximum likelihood estimator of the covariance matrix suited to censored data. The second aim is to develop new methods for examining the association structures of omics data sets. For examining, visualising and comparing more complex structures, we present variations of an algorithm based on the partial least squares method, which can be used to reconstruct association networks for both multiply imputed censored data and count data. Siirretty Doriast
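
To make the idea of maximum likelihood with left-censored values concrete, here is a deliberately simplified, univariate sketch. The thesis concerns covariance estimation and its algorithms are not given in the abstract, so this is not that method. It assumes normally distributed measurements and a single known detection limit: each observed value contributes its log-density, and each censored value contributes the log-probability of falling below the limit. The data, the grid ranges, and the function names are all made up for illustration.

```python
import math


def normal_logpdf(x, mu, sigma):
    """Log-density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return -0.5 * math.log(2 * math.pi) - math.log(sigma) - 0.5 * z * z


def normal_logcdf(x, mu, sigma):
    """Log of P(X <= x) for X ~ N(mu, sigma^2), using erfc for tail accuracy."""
    z = (x - mu) / sigma
    return math.log(0.5 * math.erfc(-z / math.sqrt(2)))


def censored_loglik(mu, sigma, observed, n_censored, limit):
    """Log-likelihood when `n_censored` values are only known to be below `limit`."""
    loglik = sum(normal_logpdf(x, mu, sigma) for x in observed)
    loglik += n_censored * normal_logcdf(limit, mu, sigma)
    return loglik


def fit_by_grid(observed, n_censored, limit, mus, sigmas):
    """Crude grid search for the (mu, sigma) pair with the highest log-likelihood."""
    return max(
        ((mu, sigma) for mu in mus for sigma in sigmas),
        key=lambda p: censored_loglik(p[0], p[1], observed, n_censored, limit),
    )


# Made-up example: five quantified values and three below a detection limit of 1.0.
observed = [1.2, 1.5, 2.0, 2.4, 3.1]
n_censored = 3
limit = 1.0
mus = [i * 0.05 for i in range(61)]           # candidate means from 0.00 to 3.00
sigmas = [0.2 + i * 0.05 for i in range(57)]  # candidate sds from 0.20 to 3.00

mu_hat, sigma_hat = fit_by_grid(observed, n_censored, limit, mus, sigmas)
print(mu_hat, sigma_hat)
```

Because the censored term pulls probability mass below the detection limit, the fitted mean ends up lower than the plain average of the quantified values. A grid search is used here only to keep the sketch dependency-free; a real analysis would use a proper optimizer.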

    Network-based approaches for multi-omic data integration

    The advent of advanced high-throughput biological technologies provides opportunities to measure the whole genome at different molecular levels in biological systems, producing different types of omic data such as the genome, epigenome, transcriptome, translatome, proteome, metabolome and interactome. Biological systems are highly dynamic and complex, involving not only within-level functionality but also between-level regulation. To uncover this complexity, it is desirable to integrate multi-omic data and so transform multi-level data into biological knowledge about the underlying mechanisms. Because multi-omic data are heterogeneous and high-dimensional, effective and efficient methods for their integration are needed. This thesis aims to develop efficient approaches for multi-omic data integration using machine learning methods and network theory. We assume that a biological system can be represented by a network with nodes denoting molecules and edges indicating functional links between molecules, in which multi-omic data can be integrated as attributes of nodes and edges. We propose four network-based approaches for multi-omic data integration using machine learning methods. Firstly, we propose an approach for gene module detection that integrates multi-condition transcriptome data and interactome data using a network overlapping-module detection method. We apply the approach to transcriptome data of human pre-implantation embryos across multiple developmental stages, and identify several stage-specific dynamic functional modules and genes which provide interesting biological insights. We evaluate the reproducibility of the modules by comparison with other widely used methods and show that the intra-module genes overlap significantly between the different methods.
Secondly, we propose an approach for gene module detection that integrates transcriptome, translatome, and interactome data using a multilayer network. We apply the approach to ribosome profiling data of mTOR-perturbed human prostate cancer cells and mine several translation-efficiency-regulated modules associated with mTOR perturbation. We develop an R package, TERM, that implements the proposed approach and offers a useful tool for the research field. Next, we propose an approach for feature selection that integrates transcriptome and interactome data using network-constrained regression. We develop a more efficient network-constrained regression method, eGBL. We evaluate its performance in terms of variable selection and prediction, and show that eGBL outperforms other related regression methods. Applying it to transcriptome data of human blastocysts, we select several genes of interest associated with time-lapse parameters. Finally, we propose an approach for classification that integrates epigenome and transcriptome data using neural networks. We introduce a superlayer neural network (SNN) model which learns DNA methylation and gene expression data in parallel in superlayers, with cross-connections allowing crosstalk between them. We evaluate its performance on human breast cancer classification. The SNN outperforms several other common machine learning methods. The approaches proposed in this thesis offer effective and efficient solutions for integrating heterogeneous high-dimensional datasets and can be easily applied to other datasets with similar structures. They are therefore applicable to many fields including but not limited to Bioinformatics and Computer Science.
EU Commission Marie Curie Actions FP7-PEOPLE-2012-ITN-317146-EpiHealthNe
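
As a rough illustration of what "network-constrained regression" can mean, the sketch below uses one common formulation: least squares plus a penalty on squared coefficient differences across network edges (equivalent to a quadratic form in the unnormalized graph Laplacian). This is an assumption made for illustration and is not the eGBL method, whose details the abstract does not give. The design matrix, the edge list, and the weight `lam` are all made up.

```python
def residual_sum_of_squares(X, y, beta):
    """Sum of squared residuals of the linear model y ~ X @ beta."""
    return sum(
        (yi - sum(xij * bj for xij, bj in zip(row, beta))) ** 2
        for row, yi in zip(X, y)
    )


def network_penalty(beta, edges):
    """Sum over network edges (i, j) of (beta_i - beta_j)^2.

    This equals beta^T L beta for the unnormalized Laplacian L of the graph.
    """
    return sum((beta[i] - beta[j]) ** 2 for i, j in edges)


def objective(X, y, beta, edges, lam):
    """Penalized loss: data fit plus lam times the network smoothness penalty."""
    return residual_sum_of_squares(X, y, beta) + lam * network_penalty(beta, edges)


# Made-up data: genes 0 and 1 are perfectly collinear and linked in the network.
X = [[1.0, 1.0, 0.0], [2.0, 2.0, 1.0]]
y = [2.0, 4.0]
edges = [(0, 1)]

beta_shared = [1.0, 1.0, 0.0]  # effect spread evenly over the two linked genes
beta_single = [2.0, 0.0, 0.0]  # same fit, but all weight on one gene

loss_shared = objective(X, y, beta_shared, edges, lam=0.5)
loss_single = objective(X, y, beta_single, edges, lam=0.5)
print(loss_shared, loss_single)  # prints 0.0 2.0
```

Both coefficient vectors fit the toy data exactly, so the data alone cannot choose between them. The network penalty breaks the tie in favour of the solution that treats interacting genes alike, which is how interactome data can guide feature selection.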