253 research outputs found

    Gene Regulatory Network Reconstruction Using Dynamic Bayesian Networks

    High-content technologies such as DNA microarrays can provide a system-scale overview of how genes interact with each other in a network context. Various mathematical methods and computational approaches have been proposed to reconstruct gene regulatory networks (GRNs), including Boolean networks, information theory, differential equations and Bayesian networks. GRN reconstruction faces huge intrinsic challenges on both experimental and theoretical fronts, because the inputs and outputs of the molecular processes are unclear and the underlying principles are unknown or too complex. In this work, we focused on improving the accuracy and speed of GRN reconstruction with a dynamic Bayesian network (DBN) based method. A commonly used structure-learning algorithm is based on REVEAL (Reverse Engineering Algorithm). However, this method has some limitations when it is used for reconstructing GRNs. For instance, the two-stage temporal Bayes network (2TBN) cannot be well recovered by application of REVEAL; it has low accuracy and speed for high-dimensional networks that have more than a hundred nodes; and it cannot even accomplish the task of reconstructing a network with 400 nodes. We implemented an algorithm for DBN structure learning with Friedman's score function to replace REVEAL, tested it on the reconstruction of both synthetic networks and real yeast networks, and compared it with REVEAL in the absence or presence of a preprocessed network generated by Zou and Conzen's algorithm. The new score metric improved the precision and recall of GRN reconstruction. Networks of gene interactions were reconstructed using the DBN approach and analyzed to identify the mechanism of chemical-induced reversible neurotoxicity in earthworms, using tools that curate relevant genes from the non-model organism's pathways to model organism pathways.

    Constrained expectation-maximization (EM), dynamic analysis, linear quadratic tracking, and nonlinear constrained expectation-maximization (EM) for the analysis of genetic regulatory networks and signal transduction networks

    Despite the immense progress made by molecular biology in cataloging and characterizing molecular elements of life and the success in genome sequencing, there have not been comparable advances in the functional study of complex phenotypes. This is because isolated study of one molecule, or one gene, at a time is not enough by itself to characterize the complex interactions in an organism and to explain the functions that arise out of these interactions. Mathematical modeling of biological systems is one way to meet the challenge. My research formulates the modeling of gene regulation as a control problem and applies systems and control theory to the identification, analysis, and optimal control of genetic regulatory networks. The major contributions of my work include biologically constrained estimation, dynamical analysis, and optimal control of genetic networks. In addition, parameter estimation of nonlinear models of biological networks is also studied, as a parameter estimation problem of a general nonlinear dynamical system. Results demonstrate the superior predictive power of biologically constrained state-space models, and that genetic networks can have differential dynamic properties when subjected to different environmental perturbations. Application of optimal control demonstrates the feasibility of regulating gene expression levels. In the difficult problem of parameter estimation, a generalized EM algorithm is deployed, and a set of explicit formulas based on the extended Kalman filter is derived. Application of the method to synthetic and real-world data shows promising results.

    Dynamical pathway analysis

    Background: Although a great deal is known about one gene or protein and its functions under different environmental conditions, little information is available about the complex behaviour of biological networks subject to different environmental perturbations. Observing differential expressions of one or more genes between normal and abnormal cells has been a mainstream method of discovering pertinent genes in diseases and therefore valuable drug targets. However, to date, no such method exists for elucidating and quantifying the differential dynamical behaviour of genetic regulatory networks, which can have greater impact on phenotypes than individual genes. Results: We propose to redress the deficiency by formulating the functional study of biological networks as a control problem of dynamical systems. We developed mathematical methods to study the stability, the controllability, and the steady-state behaviour, as well as the transient responses of biological networks under different environmental perturbations. We applied our framework to three real-world datasets: the SOS DNA repair network in E. coli under different dosages of radiation, the GSH redox cycle in mice lung exposed to either poisonous air or normal air, and the MAPK pathway in mammalian cell lines exposed to three types of HIV type I Vpr, a wild type and two mutant types; and we found that the three genetic networks exhibited fundamentally different dynamical properties in normal and abnormal cells. Conclusion: Difference in stability, relative stability, degrees of controllability, and transient responses between normal and abnormal cells means considerable difference in dynamical behaviours and different functioning of cells. Therefore differential dynamical properties can be a valuable tool in biomedical research.
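The stability and controllability properties named in this abstract can be illustrated with a minimal sketch. It assumes a discrete-time linear state-space model x[t+1] = A x[t] + B u[t]; the abstract does not state the model form, so this model, the function names, and the toy matrices are all illustrative assumptions rather than the authors' method.

```python
import numpy as np

def spectral_radius(A):
    """Largest eigenvalue magnitude of the state matrix A."""
    return float(np.max(np.abs(np.linalg.eigvals(A))))

def is_stable(A):
    """A discrete-time linear system is asymptotically stable
    when every eigenvalue of A lies strictly inside the unit circle."""
    return spectral_radius(A) < 1.0

def controllability_matrix(A, B):
    """Stack [B, AB, A^2 B, ..., A^(n-1) B] column-wise."""
    n = A.shape[0]
    blocks = [B]
    for _ in range(n - 1):
        blocks.append(A @ blocks[-1])
    return np.hstack(blocks)

def is_controllable(A, B):
    """Kalman rank condition: the controllability matrix has full row rank."""
    return np.linalg.matrix_rank(controllability_matrix(A, B)) == A.shape[0]

# Hypothetical two-gene networks under two conditions (made-up numbers).
A_normal = np.array([[0.5, 0.1],
                     [0.0, 0.8]])
A_perturbed = np.array([[0.5, 0.1],
                        [0.0, 1.1]])
B = np.array([[0.0],
              [1.0]])  # the external input acts on the second gene only

for name, A in [("normal", A_normal), ("perturbed", A_perturbed)]:
    print(name, "stable:", is_stable(A), "controllable:", is_controllable(A, B))
```

Comparing such properties between the matrices estimated for two conditions is one simple way to read "differential dynamical properties"; a continuous-time model would instead require eigenvalues with negative real parts.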

    Efficient and Robust Algorithms for Statistical Inference in Gene Regulatory Networks

    Inferring gene regulatory networks (GRNs) is of profound importance in the field of computational biology and bioinformatics. Understanding the gene-gene and gene-transcription factor (TF) interactions has the potential to provide insight into the complex biological processes taking place in cells. High-throughput genomic and proteomic technologies have enabled the collection of large amounts of data in order to quantify gene expression and map DNA-protein interactions. This dissertation investigates the problem of network component analysis (NCA), which estimates the transcription factor activities (TFAs) and gene-TF interactions by making use of gene expression and ChIP-chip data. Closed-form solutions are provided for estimation of the TF-gene connectivity matrix, which yields an advantage over the existing state-of-the-art methods in terms of lower computational complexity and higher consistency. We present an iterative reweighted ℓ2 norm based algorithm to infer the network connectivity when the prior knowledge about the connections is incomplete. We present an NCA algorithm which has the ability to counteract the presence of outliers in the gene expression data and is therefore more robust. Closed-form solutions are derived for the estimation of TFAs and TF-gene interactions, and the resulting algorithm is comparable to the fastest algorithms proposed so far, with the additional advantages of robustness to outliers and higher reliability in the TFA estimation. Finally, we look at the inference of gene regulatory networks, which essentially reduces to the estimation of only the gene-gene interactions. Gene networks are known to be sparse, and therefore an inference algorithm is proposed which imposes a sparsity constraint while estimating the connectivity matrix. The online estimation lowers the computational complexity and provides superior performance in terms of accuracy and scalability.
This dissertation presents gene regulatory network inference algorithms which provide computationally efficient solutions in some crucial scenarios, offer advantages over the existing algorithms, and therefore provide a means to better understand the underlying cellular network. Hence, it serves as a building block in the accurate estimation of gene regulatory networks, which will pave the way for finding cures to genetic diseases.

    Microarray Data Mining and Gene Regulatory Network Analysis

    The novel molecular biological technology, microarray, makes it feasible to obtain quantitative measurements of the expression of thousands of genes present in a biological sample simultaneously. Genome-wide expression data generated from this technology are promising for uncovering implicit, previously unknown biological knowledge. In this study, several problems in microarray data mining were investigated, including feature (gene) selection, classifier gene identification, generation of a reference genetic interaction network for non-model organisms, and gene regulatory network reconstruction using time-series gene expression data. Most of the existing computational models employed to infer gene regulatory networks suffer from either low accuracy or high computational complexity. To overcome such limitations, the following strategies were proposed to integrate bioinformatics data mining techniques with existing GRN inference algorithms, which enables the discovery of novel biological knowledge. An integrated statistical and machine learning (ISML) pipeline was developed for feature selection and classifier gene identification to address the curse of dimensionality as well as the huge search space. Using the selected classifier genes as seeds, a scale-up technique is applied to search through major databases of genetic interaction networks, metabolic pathways, etc. By curating relevant genes and blasting genomic sequences of non-model organisms against well-studied genetic model organisms, a reference gene regulatory network for less-studied organisms was built and used both as prior knowledge and for model validation in GRN reconstructions. Networks of gene interactions were inferred using a Dynamic Bayesian Network (DBN) approach and were analyzed to elucidate the dynamics caused by perturbations.
Our proposed pipelines were applied to investigate molecular mechanisms for chemical-induced reversible neurotoxicity.

    Data analysis methods for copy number discovery and interpretation

    Copy number variation (CNV) is an important type of genetic variation that can give rise to a wide variety of phenotypic traits. Differences in copy number are thought to play major roles in processes that involve dosage-sensitive genes, providing beneficial, deleterious or neutral modifications to individual phenotypes. Copy number analysis has long been a standard in clinical cytogenetic laboratories. Gene deletions and duplications can often be linked with genetic syndromes, such as the 7q11.23 deletion of Williams-Beuren syndrome, the 22q11 deletion of DiGeorge syndrome and the 17p11.2 duplication of Potocki-Lupski syndrome. Interestingly, copy number based genomic disorders often display reciprocal deletion / duplication syndromes, with the latter frequently exhibiting milder symptoms. Moreover, the study of chromosomal imbalances plays a key role in cancer research. The datasets used for the development of analysis methods during this project were generated as part of the cutting-edge translational project Deciphering Developmental Disorders (DDD). The DDD project is the first of its kind and will directly apply state-of-the-art technologies, in the form of ultra-high resolution microarrays and next generation sequencing (NGS), to real-time genetic clinical practice. It is a collaboration between the Wellcome Trust Sanger Institute (WTSI) and the National Health Service (NHS) involving the 24 regional genetic services across the UK and Ireland. Although the application of DNA microarrays for the detection of CNVs is well established, individual change point detection algorithms often display variable performance. The definition of an optimal set of parameters for achieving a certain level of performance is rarely straightforward, especially where data qualities vary ... [cont.]

    Graphical models for estimating dynamic networks

    Inferring dynamic networks from data is an active area of research, particularly in systems biology. Estimating the structure of a network amounts to determining the presence or absence of a relationship between the vertices of the graph. Graphical models define these relationships through conditional dependence. Gaussian graphical models (GGMs) assume that the vertices follow a normal distribution, which is a major advantage because of the computational tractability of GGMs. Standard GGMs, however, cannot be used to study large networks, i.e. when the number of observations is smaller than the number of vertices of the graph. Penalized maximum likelihood estimators have recently been proposed to handle such high-dimensional settings. We propose using penalized GGMs in several contexts: for structured dynamic models, for slowly changing dynamic models, and for models with a particular structure, such as a "small-world" architecture. Each of these models can be applied in real, high-dimensional settings where the evolution of the network plays an important role. When the underlying process at the vertices consists of binary variables, ordinal variables, counts, or otherwise non-normal data, this thesis proposes defining a general non-Gaussian graphical model via a Gaussian copula. This copula transforms the data either directly, via the marginal distribution function of the variables, or indirectly, via latent normal variables. This approach is very successful, particularly because it makes it easy to model variables of different types jointly in a single graphical model. The problem of estimating a dynamic network becomes even harder when some of the vertices are unobserved. In such cases state-space models are typically used, but here we propose using an extension of our penalized graphical model to estimate the latent part of the network.
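The direct route of the Gaussian copula transformation mentioned in this abstract, mapping each variable through its marginal distribution function, can be sketched as follows. This is an assumption-laden illustration, not the thesis's estimator: it uses the empirical (rank-based) marginal CDF, the function name `gaussian_copula_transform` is hypothetical, and ties, which matter for binary, ordinal and count data, are broken arbitrarily here rather than handled through latent variables.

```python
from statistics import NormalDist

import numpy as np

def gaussian_copula_transform(X):
    """Map each column of X to normal scores via its empirical CDF.

    Each value is replaced by Phi^{-1}(rank / (n + 1)), where Phi is the
    standard normal CDF. Assumption: no ties (continuous margins); tied
    values would receive arbitrary distinct ranks in this sketch.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    Z = np.empty_like(X)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    for j in range(p):
        ranks = X[:, j].argsort().argsort() + 1   # ranks 1..n
        u = ranks / (n + 1)                       # strictly inside (0, 1)
        Z[:, j] = [std_normal.inv_cdf(v) for v in u]
    return Z

# Toy usage on skewed, non-normal data (made-up example).
rng = np.random.default_rng(0)
X = rng.exponential(size=(50, 3))
Z = gaussian_copula_transform(X)
```

The transformed columns could then be passed to a penalized Gaussian graphical model estimator; that step is omitted here because the abstract does not specify the penalty or the algorithm.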