66 research outputs found
Dynamic Bayesian networks in molecular plant science: inferring gene regulatory networks from multiple gene expression time series
To understand the processes of growth and biomass production in plants, we ultimately need to elucidate the structure of the underlying regulatory networks at the molecular level. The advent of high-throughput postgenomic technologies has spurred substantial interest in reverse engineering these networks from data, and several techniques from machine learning and multivariate statistics have recently been proposed. The present article discusses the problem of inferring gene regulatory networks from gene expression time series, and we focus our exposition on the methodology of Bayesian networks. We describe dynamic Bayesian networks and explain their advantages over other statistical methods. We introduce a novel information sharing scheme, which allows us to infer gene regulatory networks from multiple sources of gene expression data more accurately. We illustrate and test this method on a set of synthetic data, using three different measures to quantify the network reconstruction accuracy. The main application of our method is related to the problem of circadian regulation in plants, where we aim to reconstruct the regulatory networks of nine circadian genes in Arabidopsis thaliana from four gene expression time series obtained under different experimental conditions
Heterogeneous continuous dynamic Bayesian networks with flexible structure and inter-time segment information sharing
Classical dynamic Bayesian networks (DBNs) are based on the homogeneous Markov assumption and cannot deal with heterogeneity and non-stationarity in temporal processes. Various approaches to relax the homogeneity assumption have recently been proposed. The present paper aims to address the shortcomings of three recent versions of heterogeneous DBNs along the following lines: (i) avoiding the need for data discretization, (ii) increasing the flexibility over a time-invariant network structure, (iii) avoiding over-flexibility and overfitting by introducing a regularization scheme based on inter-time segment information sharing. The improved method is evaluated on synthetic data and compared with alternative published methods on gene expression time series from Drosophila melanogaster.
Inference in complex biological systems with Gaussian processes and parallel tempering
Parameter inference in mathematical models of complex biological systems, expressed as coupled ordinary differential equations (ODEs), is a challenging problem. These models depend on kinetic parameters, which cannot all be measured and have to be ascertained in a different way. However, the computational costs associated with repeatedly solving the ODEs are often staggering, making many techniques impractical. Therefore, aimed at reducing this cost, new concepts using gradient matching have been proposed. This paper combines current adaptive gradient matching approaches, using Gaussian processes, with a parallel tempering scheme, in order to compare two different paradigms using the same nonlinear regression method. We use two ODE systems to assess our technique, showing an improvement over the recent method in Calderhead et al. (2008)
ODE parameter inference using adaptive gradient matching with Gaussian processes
Parameter inference in mechanistic models based on systems of coupled differential equations is a topical yet computationally challenging problem, due to the need to follow each parameter adaptation with a numerical integration of the differential equations. Techniques based on gradient matching, which aim to minimize the discrepancy between the slope of a data interpolant and the derivatives predicted from the differential equations, offer a computationally appealing shortcut to the inference problem. The present paper discusses a method based on nonparametric Bayesian statistics with Gaussian processes due to Calderhead et al. (2008), and shows how inference in this model can be substantially improved by consistently inferring all parameters from the joint distribution. We demonstrate the efficiency of our adaptive gradient matching technique on three benchmark systems, and perform a detailed comparison with the method in Calderhead et al. (2008) and the explicit ODE integration approach, both in terms of parameter inference accuracy and in terms of computational efficiency
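The gradient matching idea described in this abstract can be illustrated with a minimal sketch. It is not the method of the paper: the Gaussian process interpolant is replaced here by plain central finite differences, the ODE is an assumed toy system dx/dt = -theta * x, and all function and variable names are hypothetical. The point is only that theta is estimated by matching slopes, with no numerical integration of the ODE.

```python
import math

def finite_difference_slopes(times, values):
    """Central-difference slopes at interior points.
    This is a simple stand-in for the slope of a Gaussian process interpolant."""
    slopes = []
    for i in range(1, len(times) - 1):
        slopes.append((values[i + 1] - values[i - 1]) / (times[i + 1] - times[i - 1]))
    return slopes

def gradient_match_decay_rate(times, values):
    """Estimate theta in the assumed toy ODE dx/dt = -theta * x.
    Minimizes sum_i (slope_i + theta * x_i)^2, which has a closed-form solution,
    so the ODE itself is never solved."""
    slopes = finite_difference_slopes(times, values)
    interior = values[1:-1]
    numerator = sum(s * x for s, x in zip(slopes, interior))
    denominator = sum(x * x for x in interior)
    return -numerator / denominator

# Noise-free synthetic data generated from the exact solution x(t) = exp(-theta * t).
true_theta = 0.5
times = [0.1 * k for k in range(51)]
values = [math.exp(-true_theta * t) for t in times]
theta_hat = gradient_match_decay_rate(times, values)
print(theta_hat)
```

With noisy data the finite-difference slopes become unreliable, which is one reason a smooth interpolant such as a Gaussian process is used in practice.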
Parameter inference in mechanistic models of cellular regulation and signalling pathways using gradient matching
A challenging problem in systems biology is parameter inference in mechanistic models of signalling pathways. In the present article, we investigate an approach based on gradient matching and nonparametric Bayesian modelling with Gaussian processes. We evaluate the method on two biological systems, related to the regulation of PIF4/5 in Arabidopsis thaliana, and the JAK/STAT signal transduction pathway
TOPALi v2: a rich graphical interface for evolutionary analyses of multiple alignments on HPC clusters and multi-core desktops
Summary: TOPALi v2 simplifies and automates the use of several methods for the evolutionary analysis of multiple sequence alignments. Jobs are submitted from a Java graphical user interface as TOPALi web services to either run remotely on high-performance computing clusters or locally (with multiple cores supported). Methods available include model selection and phylogenetic tree estimation using the Bayesian inference and maximum likelihood (ML) approaches, in addition to recombination detection methods. The optimal substitution model can be selected for protein or nucleic acid (standard, or protein-coding using a codon position model) data using accurate statistical criteria derived from ML co-estimation of the tree and the substitution model. Phylogenetic software available includes PhyML, RAxML and MrBayes
Reachability in Parametric Interval Markov Chains using Constraints
Parametric Interval Markov Chains (pIMCs) are a specification formalism that extends Markov Chains (MCs) and Interval Markov Chains (IMCs) by taking into account imprecision in the transition probability values: transitions in pIMCs are labeled with parametric intervals of probabilities. In this work, we study the difference between pIMCs and other Markov Chain abstraction models and investigate the two usual semantics for IMCs: once-and-for-all and at-every-step. In particular, we prove that both semantics agree on the maximal/minimal reachability probabilities of a given IMC. We then investigate solutions to several parameter synthesis problems in the context of pIMCs (consistency, qualitative reachability and quantitative reachability) that rely on constraint encodings. Finally, we propose a prototype implementation of our constraint encodings with promising results
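To make the notion of a maximal reachability probability in an IMC concrete, here is a small sketch. It uses a standard interval value iteration rather than the constraint encodings of the paper, and it handles plain (non-parametric) intervals only. The example chain, the state names and the function names are all assumptions chosen for illustration, and the chain is assumed to be consistent (lower bounds sum to at most 1, upper bounds to at least 1).

```python
def best_distribution(intervals, values):
    """Choose probabilities inside the intervals that maximize the expected value.
    intervals maps each successor state to a (low, high) pair."""
    probs = {s: low for s, (low, high) in intervals.items()}
    remaining = 1.0 - sum(probs.values())
    # Give the leftover probability mass to the most valuable successors first.
    for s in sorted(intervals, key=lambda state: values[state], reverse=True):
        low, high = intervals[s]
        extra = min(high - low, remaining)
        probs[s] += extra
        remaining -= extra
    return probs

def max_reachability(imc, target, iterations=200):
    """Iterate the Bellman-style update for the maximal probability of reaching target."""
    values = {s: (1.0 if s == target else 0.0) for s in imc}
    for _ in range(iterations):
        new_values = {}
        for s, intervals in imc.items():
            if s == target:
                new_values[s] = 1.0
                continue
            probs = best_distribution(intervals, values)
            new_values[s] = sum(p * values[t] for t, p in probs.items())
        values = new_values
    return values

# Hypothetical three-state IMC: from s0 the chain may loop, reach goal, or fail.
imc = {
    "s0": {"goal": (0.2, 0.6), "fail": (0.3, 0.7), "s0": (0.1, 0.3)},
    "goal": {"goal": (1.0, 1.0)},
    "fail": {"fail": (1.0, 1.0)},
}
result = max_reachability(imc, "goal")
print(result["s0"])
```

In this example the maximizing choice puts 0.6 on goal, 0.3 on fail and 0.1 on the self-loop, so the value at s0 solves v = 0.6 + 0.1 v, that is v = 2/3.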
Prediction of cavitation and induced erosion inside a high-pressure fuel pump
The operation of a high-pressure, piston-plunger fuel pump, oriented for use in the common rail circuit of modern Diesel engines for providing fuel to the injectors, is investigated in the present study from a numerical perspective. Both the suction and pressurization phases of the pump stroke were simulated, with the overall flow time being on the order of 12·10⁻³ s. The topology of the cavitating flow within the pump configuration was captured through the use of an Equation of State (EoS) implemented in the framework of a barotropic, homogeneous equilibrium model. Cavitation was found to set in within the pressure chamber as early as 0.2·10⁻³ s in the operating cycle, while the minimum liquid volume fraction detected was on the order of 60% during the second period of the valve opening. Increase of the in-cylinder pressure during the final stages of the pumping stroke led to the collapse of the previously arisen cavitation structures, and three layout locations, namely the piston edge, the valve/valve-seat region and the outlet orifice, were identified as vulnerable to cavitation-induced erosion through the use of cavitation-aggressiveness indicators
Phylogenetic Detection of Recombination with a Bayesian Prior on the Distance between Trees
Genomic regions participating in recombination events may support distinct topologies, and phylogenetic analyses should incorporate this heterogeneity. Existing phylogenetic methods for recombination detection are challenged by the enormous number of possible topologies, even for a moderate number of taxa. If, however, the detection analysis is conducted independently between each putative recombinant sequence and a set of reference parentals, potential recombinations between the recombinants are neglected. In this context, a recombination hotspot can be inferred in phylogenetic analyses if we observe several consecutive breakpoints. We developed a distance measure between unrooted topologies that closely resembles the number of recombinations. By introducing a prior distribution on these recombination distances, a Bayesian hierarchical model was devised to detect phylogenetic inconsistencies occurring due to recombinations. This model relaxes the assumption of known parental sequences, still common in HIV analysis, allowing the entire dataset to be analyzed at once. On simulated datasets with up to 16 taxa, our method correctly detected recombination breakpoints and the number of recombination events for each breakpoint. The procedure is robust to rate and transition∶transversion heterogeneities for simulations with and without recombination. This recombination distance is related to recombination hotspots. Applying this procedure to a genomic HIV-1 dataset, we found evidence for hotspots and de novo recombination