2,975 research outputs found

    Assessing the impact of non-additive noise on modelling transcriptional regulation with Gaussian processes

    In transcriptional regulation, transcription factors (TFs) are often unobservable at the mRNA level or may be controlled outside of the system being modelled. Gaussian processes are a promising approach for dealing with these difficulties, as a prior distribution can be defined over the latent TF activity profiles and the posterior distribution inferred from the observed expression levels of potential target genes. However, previous approaches have been based on the assumption of additive Gaussian noise to maintain analytical tractability. We investigate the influence of a more realistic form of noise on a biologically accurate system based on Michaelis-Menten kinetics.
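
    The setup in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the squared-exponential kernel, the exponential transform that keeps the TF activity positive, the specific ODE form dx/dt = B + S*f/(gamma + f) - D*x, and all parameter values are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Time grid and a squared-exponential (RBF) covariance for the GP prior.
t = np.linspace(0.0, 10.0, 101)
lengthscale = 2.0                                   # assumed value
K = np.exp(-0.5 * (t[:, None] - t[None, :]) ** 2 / lengthscale**2)
L = np.linalg.cholesky(K + 1e-6 * np.eye(t.size))   # jitter for numerical stability

# One draw from the prior; exp() keeps the latent TF activity positive (assumption).
f = np.exp(L @ rng.standard_normal(t.size))

# Assumed kinetic parameters: basal rate B, sensitivity S,
# half-saturation constant gamma, mRNA decay rate D.
B, S, gamma, D = 0.1, 1.0, 1.0, 0.5

# Forward-Euler integration of dx/dt = B + S*f/(gamma + f) - D*x.
dt = t[1] - t[0]
x = np.empty_like(t)
x[0] = B / D                                        # start at the basal steady state
for k in range(t.size - 1):
    production = B + S * f[k] / (gamma + f[k])      # Michaelis-Menten saturation
    x[k + 1] = x[k] + dt * (production - D * x[k])
```

    Because the production term saturates between B and B + S, the simulated expression level x stays between B/D and (B + S)/D, which is a quick sanity check on the integration.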

    Modelling transcriptional regulation with Gaussian processes

    A challenging problem in systems biology is the quantitative modelling of transcriptional regulation. Transcription factors (TFs), which are the key proteins at the centre of the regulatory processes, may be subject to post-translational modification, rendering them unobservable at the mRNA level, or they may be controlled outside of the subsystem being modelled. In both cases, a mechanistic model description of the regulatory system needs to be able to deal with latent activity profiles of the key regulators. A promising approach to deal with these difficulties is based on using Gaussian processes to define a prior distribution over the latent TF activity profiles. Inference is based on the principles of non-parametric Bayesian statistics, consistently inferring the posterior distribution of the unknown TF activities from the observed expression levels of potential target genes. The present work provides explicit solutions to the differential equations needed to model the data in this manner, as well as the derivatives needed for effective optimisation. The work further explores identifiability issues not fully shown in previous work and looks at how these can cause difficulties with inference. We subsequently look at how the method works on two different TFs, including how it performs with a more biologically realistic mechanistic model. Finally, we analyse the effect of more biologically realistic non-Gaussian noise on the biologically realistic model, showing how this can reduce the accuracy of the inference.

    Inferring dynamic genetic networks with low order independencies

    In this paper, we propose a novel inference method for dynamic genetic networks which can cope with a number of time measurements n much smaller than the number of genes p. The approach is based on the concept of a low order conditional dependence graph, which we extend here to the case of Dynamic Bayesian Networks. Most of our results are based on the theory of graphical models associated with Directed Acyclic Graphs (DAGs). In this way, we define a minimal DAG G which describes exactly the full order conditional dependencies given the past of the process. Then, to cope with the large p and small n estimation case, we propose to approximate DAG G by considering low order conditional independencies. We introduce partial qth order conditional dependence DAGs G(q) and analyze their probabilistic properties. In general, DAGs G(q) differ from DAG G but still reflect relevant dependence facts for sparse networks such as genetic networks. By using this approximation, we set out a non-Bayesian inference method and demonstrate the effectiveness of this approach on both simulated and real data analysis. The inference procedure is implemented in the R package 'G1DBN', freely available from the CRAN archive.

    Weighted-Lasso for Structured Network Inference from Time Course Data

    We present a weighted-Lasso method to infer the parameters of a first-order vector auto-regressive model that describes time course expression data generated by directed gene-to-gene regulation networks. These networks are assumed to have a prior internal structure of connectivity which drives the inference method. This prior structure can be either derived from prior biological knowledge or inferred by the method itself. We illustrate the performance of this structure-based penalization both on synthetic data and on two canonical regulatory networks: first, the yeast cell cycle regulation network, by analyzing Spellman et al.'s dataset, and second, the E. coli S.O.S. DNA repair network, by analyzing data from U. Alon's lab.
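
    As a rough illustration of the idea, the sketch below fits a first-order vector auto-regressive model with an entry-wise weighted L1 penalty. It is not the authors' algorithm: the proximal-gradient (ISTA) solver, the function name weighted_lasso_var1, the toy network A_true, and the way the prior structure is encoded (very large weights on edges assumed to be ruled out) are all assumptions made for the example.

```python
import numpy as np

def weighted_lasso_var1(X, weights, lam=0.05, n_iter=500):
    """Fit X[t+1] ~ A @ X[t] with penalty lam * sum(weights * |A|) using ISTA.

    A[i, j] is the influence of gene j at time t on gene i at time t+1.
    """
    past, future = X[:-1], X[1:]
    n, p = past.shape
    gram = past.T @ past / n                  # (p, p) second-moment matrix
    cross = future.T @ past / n               # (p, p), row i belongs to target gene i
    # Step size 1/L, where L is the Lipschitz constant of the smooth part.
    step = 1.0 / max(np.linalg.eigvalsh(gram)[-1], 1e-12)
    A = np.zeros((p, p))
    for _ in range(n_iter):
        grad = A @ gram - cross               # gradient of the least-squares loss
        Z = A - step * grad
        # Entry-wise soft-thresholding with a per-edge threshold.
        A = np.sign(Z) * np.maximum(np.abs(Z) - step * lam * weights, 0.0)
    return A

# Toy data from a hypothetical 3-gene network.
rng = np.random.default_rng(1)
A_true = np.array([[0.5, 0.4, 0.0],
                   [0.0, 0.5, 0.0],
                   [0.0, 0.0, 0.5]])
X = np.zeros((200, 3))
for t in range(199):
    X[t + 1] = A_true @ X[t] + rng.standard_normal(3)

weights = np.ones((3, 3))
weights[2, 0] = weights[2, 1] = 1e6           # edges into gene 2 assumed ruled out a priori
A_hat = weighted_lasso_var1(X, weights)
```

    Edges with a very large weight are shrunk to exactly zero, while edges with a small weight are penalized only lightly, which is how a prior structure can steer the estimate.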

    Genome-wide discovery of modulators of transcriptional interactions in human B lymphocytes

    Transcriptional interactions in a cell are modulated by a variety of mechanisms that prevent their representation as pure pairwise interactions between a transcription factor and its target(s). These include, among others, transcription factor activation by phosphorylation and acetylation, formation of active complexes with one or more co-factors, and mRNA/protein degradation and stabilization processes. This paper presents a first step towards the systematic, genome-wide computational inference of genes that modulate the interactions of specific transcription factors at the post-transcriptional level. The method uses a statistical test based on changes in the mutual information between a transcription factor and each of its candidate targets, conditional on the expression of a third gene. The approach was first validated on a synthetic network model, and then tested in the context of a mammalian cellular system. By analyzing 254 microarray expression profiles of normal and tumor-related human B lymphocytes, we investigated the post-transcriptional modulators of the MYC proto-oncogene, an important transcription factor involved in tumorigenesis. Our method discovered a set of 100 putative modulator genes, responsible for modulating 205 regulatory relationships between MYC and its targets. The set is significantly enriched in molecules with function consistent with their activities as modulators of cellular interactions, recapitulates established MYC regulation pathways, and provides a notable repertoire of novel regulators of MYC function. The approach has broad applicability and can be used to discover modulators of any other transcription factor, provided that adequate expression profile data are available. Comment: 15 pages, 3 figures, 2 tables; minor changes following referees' comments; accepted to RECOMB0
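
    The quantity behind such a test can be sketched as the change in mutual information between a TF and a target when samples are split by the expression of a candidate modulator. The code below is only an illustration under stated assumptions: the histogram-based estimator, the split into lowest and highest thirds, the function names, and the synthetic data are all hypothetical, and no significance test (for example a permutation null) is included.

```python
import numpy as np

def mutual_information(x, y, bins=8):
    """Histogram-based estimate of the mutual information I(x; y) in nats."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)       # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)       # marginal of y, shape (1, bins)
    nz = pxy > 0                              # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def delta_mi(tf, target, modulator, frac=1 / 3):
    """MI(tf; target) in the top fraction of modulator expression minus the bottom."""
    order = np.argsort(modulator)
    k = int(len(order) * frac)
    low, high = order[:k], order[-k:]
    return (mutual_information(tf[high], target[high])
            - mutual_information(tf[low], target[low]))

# Synthetic example: the target follows the TF only when the modulator is high.
rng = np.random.default_rng(2)
n = 3000
tf = rng.standard_normal(n)
modulator = rng.uniform(size=n)
target = tf * (modulator > 0.5) + 0.3 * rng.standard_normal(n)
shift = delta_mi(tf, target, modulator)
```

    A clearly positive shift suggests that the TF-target dependence is stronger when the candidate modulator is highly expressed; in practice the value would be compared against a null distribution before calling a gene a modulator.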

    Identifying stochastic oscillations in single-cell live imaging time series using Gaussian processes

    Multiple biological processes are driven by oscillatory gene expression at different time scales. Pulsatile dynamics are thought to be widespread, and single-cell live imaging of gene expression has led to a surge of dynamic, possibly oscillatory, data for different gene networks. However, the regulation of gene expression at the level of an individual cell involves reactions between finite numbers of molecules, and this can result in inherent randomness in expression dynamics, which blurs the boundaries between aperiodic fluctuations and noisy oscillators. Thus, there is an acute need for an objective statistical method for classifying whether an experimentally derived noisy time series is periodic. Here we present a new data analysis method that combines mechanistic stochastic modelling with the powerful methods of non-parametric regression with Gaussian processes. Our method can distinguish oscillatory gene expression from random fluctuations of non-oscillatory expression in single-cell time series, despite peak-to-peak variability in period and amplitude of single-cell oscillations. We show that our method outperforms the Lomb-Scargle periodogram in successfully classifying cells as oscillatory or non-oscillatory in data simulated from a simple genetic oscillator model and in experimental data. Analysis of bioluminescent live cell imaging shows a significantly greater number of oscillatory cells when luciferase is driven by a Hes1 promoter (10/19), which has previously been reported to oscillate, than the constitutive MoMuLV 5' LTR (MMLV) promoter (0/25). The method can be applied to data from any gene network to both quantify the proportion of oscillating cells within a population and to measure the period and quality of oscillations. It is publicly available as a MATLAB package. Comment: 36 pages, 17 figure

    How to understand the cell by breaking it: network analysis of gene perturbation screens

    Modern high-throughput gene perturbation screens are key technologies at the forefront of genetic research. Combined with rich phenotypic descriptors they enable researchers to observe detailed cellular reactions to experimental perturbations on a genome-wide scale. This review surveys the current state of the art in analyzing perturbation screens from a network point of view. We describe approaches to make the step from the parts list to the wiring diagram by using phenotypes for network inference and integrating them with complementary data sources. The first part of the review describes methods to analyze one- or low-dimensional phenotypes like viability or reporter activity; the second part concentrates on high-dimensional phenotypes showing global changes in cell morphology, transcriptome or proteome. Comment: Review based on ISMB 2009 tutorial; after two rounds of revisio

    Inference of the genetic network regulating lateral root initiation in Arabidopsis thaliana

    Regulation of gene expression is crucial for organism growth, and one of the challenges in Systems Biology is to reconstruct the underlying regulatory biological networks from transcriptomic data. The formation of lateral roots in Arabidopsis thaliana is stimulated by a cascade of regulators of which only the interactions of its initial elements have been identified. Using simulated gene expression data with known network topology, we compare the performance of inference algorithms, based on different approaches, for which ready-to-use software is available. We show that their performance improves with the network size and the inclusion of mutants. We then analyse two sets of genes, whose activity is likely to be relevant to lateral root initiation in Arabidopsis, by integrating sequence analysis with the intersection of the results of the best performing methods on time series and mutants to infer their regulatory network. The methods applied capture known interactions between genes that are candidate regulators at early stages of development. The network inferred from genes significantly expressed during lateral root formation exhibits distinct scale-free, small-world and hierarchical properties, and the nodes with a high out-degree may warrant further investigation.