451 research outputs found
Bayesian regularization of hidden Markov models with an application to bioinformatics
This paper discusses a Bayesian approach to regularizing hidden Markov models and demonstrates an application of this scheme to bioinformatics
Reverse engineering of genetic networks with Bayesian networks
This paper provides a brief introduction to learning Bayesian networks from gene-expression data. The method is contrasted with other approaches to the reverse engineering of biochemical networks, and the Bayesian learning paradigm is briefly described. The article demonstrates an application to a simple synthetic toy problem and evaluates the inference performance in terms of ROC (receiver operating characteristic) curves
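The ROC evaluation mentioned above works by ranking candidate network edges by their inferred score against a known gold-standard network and sweeping a threshold. A minimal sketch of that computation (the scores and edge labels below are illustrative, not taken from the paper; ties in scores are not handled):

```python
# Minimal ROC/AUC sketch for network reconstruction: rank candidate
# edges by an inferred score against known true/false edge labels.
# Scores and labels are hypothetical, for illustration only.

def roc_points(scores, labels):
    """Return (FPR, TPR) pairs swept over all score thresholds."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    pos = sum(labels)
    neg = len(labels) - pos
    tp = fp = 0
    points = [(0.0, 0.0)]
    for i in order:
        if labels[i]:
            tp += 1
        else:
            fp += 1
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

# Hypothetical posterior edge probabilities and true edge indicators.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1,   1,   0,   1,   0,   1,   0,   0]
pts = roc_points(scores, labels)
print(auc(pts))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect edge recovery, which is why ROC curves are a standard summary for reverse-engineering benchmarks.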
Modelling transcriptional regulation with Gaussian processes
A challenging problem in systems biology is the quantitative modelling of transcriptional regulation. Transcription factors (TFs), the key proteins at the centre of the regulatory processes, may be subject to post-translational modification, rendering them unobservable at the mRNA level, or they may be controlled outside of the subsystem being modelled. In both cases, a mechanistic model of the regulatory system needs to be able to deal with latent activity profiles of the key regulators. A promising approach to these difficulties is to use Gaussian processes to define a prior distribution over the latent TF activity profiles. Inference is based on the principles of non-parametric Bayesian statistics, consistently inferring the posterior distribution of the unknown TF activities from the observed expression levels of potential target genes. The present work provides explicit solutions to the differential equations needed to model the data in this manner, as well as the derivatives needed for effective optimisation. It further explores identifiability issues not fully addressed in previous work and shows how they can cause difficulties for inference. We subsequently apply the method to two different TFs, including a more biologically realistic mechanistic model, and finally analyse the effect of more realistic non-Gaussian noise on that model, showing how it can reduce the accuracy of the inference
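The core idea of placing a Gaussian-process prior over a latent profile can be illustrated with plain GP regression: an RBF covariance defines the prior, and conditioning on noisy observations yields a posterior mean and covariance for the latent function. This is only a sketch of the generic mechanism, with direct noisy observations of the latent function; in the paper the observations are target-gene expression levels linked to the TF activity through differential equations. All inputs and hyperparameters below are hypothetical:

```python
import numpy as np

def rbf(x1, x2, length=1.0, var=1.0):
    """Squared-exponential covariance between two sets of inputs."""
    d = x1[:, None] - x2[None, :]
    return var * np.exp(-0.5 * (d / length) ** 2)

# Illustrative setup: noisy observations of a latent profile at a few
# time points (not the paper's data or likelihood).
t_obs = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y_obs = np.sin(t_obs) + 0.1 * np.random.default_rng(0).normal(size=t_obs.size)
noise = 0.1 ** 2

# GP posterior mean and covariance on a dense test grid.
t_new = np.linspace(0.0, 4.0, 50)
K = rbf(t_obs, t_obs) + noise * np.eye(t_obs.size)
Ks = rbf(t_new, t_obs)
Kss = rbf(t_new, t_new)
alpha = np.linalg.solve(K, y_obs)
mean = Ks @ alpha                                  # posterior mean
cov = Kss - Ks @ np.linalg.solve(K, Ks.T)          # posterior covariance
```

The posterior variance (the diagonal of `cov`) shrinks near the observations and reverts to the prior variance far from them, which is what makes the non-parametric Bayesian treatment of the latent TF activities coherent.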
Statistical Modelling of Cell Movement
In this paper we demonstrate an application of the unscented Kalman filter in the context of cell movement, using a model defined in terms of stochastic differential equations (SDEs)
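Models defined in terms of SDEs are typically simulated with the Euler-Maruyama scheme before any filtering is applied. A generic sketch of that scheme follows; the drift and diffusion functions below are a standard Ornstein-Uhlenbeck test case, not the paper's cell-movement model:

```python
import numpy as np

def euler_maruyama(drift, diffusion, x0, dt, n_steps, rng):
    """Simulate dX = drift(X) dt + diffusion(X) dW with a fixed step size."""
    x = np.empty(n_steps + 1)
    x[0] = x0
    for k in range(n_steps):
        dw = rng.normal(scale=np.sqrt(dt))   # Brownian increment
        x[k + 1] = x[k] + drift(x[k]) * dt + diffusion(x[k]) * dw
    return x

# Illustrative mean-reverting process with constant noise (hypothetical
# parameters, chosen only to produce a well-behaved sample path).
rng = np.random.default_rng(42)
path = euler_maruyama(drift=lambda x: -0.5 * x,
                      diffusion=lambda x: 0.3,
                      x0=2.0, dt=0.01, n_steps=1000, rng=rng)
```

A filter such as the UKF would then treat sampled, noise-corrupted values of such a path as observations and recursively estimate the latent state and parameters.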
Bayesian regularization of non-homogeneous dynamic Bayesian networks by globally coupling interaction parameters
To relax the homogeneity assumption of classical dynamic Bayesian networks (DBNs), various recent studies have combined DBNs with multiple changepoint processes. The underlying assumption is that the parameters associated with time series segments delimited by multiple changepoints are a priori independent. Under weak regularity conditions, the parameters can be integrated out in the likelihood, leading to a closed-form expression of the marginal likelihood. However, the assumption of prior independence is unrealistic in many real-world applications, where the segment-specific regulatory relationships among the interdependent quantities tend to undergo gradual evolutionary adaptations. We therefore propose a Bayesian coupling scheme to introduce systematic information sharing among the segment-specific interaction parameters. We investigate the effect this model improvement has on the network reconstruction accuracy in a reverse engineering context, where the objective is to learn the structure of a gene regulatory network from temporal gene expression profiles
Gradient matching methods for computational inference in mechanistic models for systems biology: a review and comparative analysis
Parameter inference in mathematical models of biological pathways, expressed as coupled ordinary differential equations (ODEs), is a challenging problem in contemporary systems biology. Conventional methods involve repeatedly solving the ODEs by numerical integration, which is computationally onerous and does not scale up to complex systems. Aimed at reducing the computational costs, new concepts based on gradient matching have recently been proposed in the computational statistics and machine learning literature. In a preliminary smoothing step, the time series data are interpolated; then, in a second step, the parameters of the ODEs are optimised so as to minimise some metric measuring the difference between the slopes of the tangents to the interpolants, and the time derivatives from the ODEs. In this way, the ODEs never have to be solved explicitly. This review provides a concise methodological overview of the current state-of-the-art methods for gradient matching in ODEs, followed by an empirical comparative evaluation based on a set of widely used and representative benchmark data
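The two-step procedure described above can be sketched concretely: smooth the data with an interpolant, then choose the ODE parameters to minimise the discrepancy between the interpolant's slopes and the ODE gradients, without ever integrating the ODE. The system below (simple exponential decay with noise-free data) is an illustrative stand-in, not one of the review's benchmarks:

```python
import numpy as np
from scipy.interpolate import CubicSpline
from scipy.optimize import minimize_scalar

# Step 1: interpolate the time-series data.
# Illustrative system: dx/dt = -theta * x with true theta = 0.5.
t = np.linspace(0.0, 4.0, 20)
x_obs = np.exp(-0.5 * t)          # noise-free here for clarity
spline = CubicSpline(t, x_obs)

# Step 2: optimise theta so the slopes of the tangents to the
# interpolant match the time derivatives given by the ODE.
t_grid = np.linspace(0.0, 4.0, 100)
slopes = spline(t_grid, 1)        # first derivative of the interpolant

def mismatch(theta):
    return np.sum((slopes - (-theta * spline(t_grid))) ** 2)

theta_hat = minimize_scalar(mismatch, bounds=(0.0, 2.0), method="bounded").x
print(theta_hat)                  # close to the true value 0.5
```

Because the objective only compares slopes, the cost of repeated numerical integration disappears; the price, as the review discusses, is sensitivity to the quality of the interpolant when the data are noisy.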
Inference in Nonlinear Systems with Unscented Kalman Filters
An increasing number of scientific disciplines, most notably the life sciences and health care, have become more quantitative, describing complex systems with coupled nonlinear differential equations. While powerful algorithms for numerical simulation of these systems have been developed, statistical inference of the system parameters is still a challenging problem. A promising approach is based on the unscented Kalman filter (UKF), which has seen a variety of recent applications, from soft tissue mechanics to chemical kinetics. The present study investigates how the accuracy of parameter estimation depends on the initialisation. Based on three toy systems that capture typical features of real-world complex systems (limit cycles, chaotic attractors and intrinsic stochasticity), we carry out repeated simulations on a large range of independent data instantiations. Our study allows a quantification of the accuracy of inference, measured in terms of two alternative distance measures in function and parameter space, as a function of the initial deviation from the ground truth
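At the heart of the UKF is the unscented transform: a Gaussian is represented by 2n+1 deterministically chosen sigma points, which are pushed through the nonlinearity, and the transformed mean and covariance are recovered as weighted moments. A minimal sketch using the standard Julier weighting (the map and numbers below are illustrative, not one of the study's toy systems):

```python
import numpy as np

def unscented_transform(f, mean, cov, kappa=1.0):
    """Propagate a Gaussian (mean, cov) through f via 2n+1 sigma points."""
    n = mean.size
    L = np.linalg.cholesky((n + kappa) * cov)   # L @ L.T == (n+kappa)*cov
    pts = ([mean]
           + [mean + L[:, i] for i in range(n)]
           + [mean - L[:, i] for i in range(n)])
    w0 = kappa / (n + kappa)
    wi = 1.0 / (2.0 * (n + kappa))
    weights = np.array([w0] + [wi] * (2 * n))   # weights sum to 1
    ys = np.array([f(p) for p in pts])
    y_mean = weights @ ys
    diffs = ys - y_mean
    y_cov = (weights[:, None] * diffs).T @ diffs
    return y_mean, y_cov

# Sanity check: for a linear map the transform is exact.
A = np.array([[1.0, 2.0], [0.0, 1.0]])
m = np.array([1.0, -1.0])
P = np.array([[0.5, 0.1], [0.1, 0.3]])
ym, yc = unscented_transform(lambda x: A @ x, m, P)
```

For a linear map `ym` equals `A @ m` and `yc` equals `A @ P @ A.T`; for nonlinear dynamics the transform gives a second-order-accurate Gaussian approximation, which is what the UKF exploits at each prediction and update step.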
A non-homogeneous dynamic Bayesian network with sequentially coupled interaction parameters for applications in systems and synthetic biology
An important and challenging problem in systems biology is the inference of gene regulatory networks from short non-stationary time series of transcriptional profiles. A popular approach that has been widely applied to this end is based on dynamic Bayesian networks (DBNs), although traditional homogeneous DBNs fail to model the non-stationarity and time-varying nature of the gene regulatory processes. Various authors have therefore recently proposed combining DBNs with multiple changepoint processes to obtain time-varying dynamic Bayesian networks (TV-DBNs). However, TV-DBNs are not without problems. Gene expression time series are typically short, which leaves the model over-flexible, leading to over-fitting or inflated inference uncertainty. In the present paper, we introduce a Bayesian regularization scheme that addresses this difficulty. Our approach is based on the rationale that changes in gene regulatory processes appear gradually during an organism's life cycle or in response to a changing environment, and we have integrated this notion into the prior distribution of the TV-DBN parameters. We have extensively tested our regularized TV-DBN model on synthetic data, in which we have simulated short non-homogeneous time series produced from a system subject to gradual change. We have then applied our method to real-world gene expression time series, measured during the life cycle of Drosophila melanogaster, under artificially generated constant light conditions in Arabidopsis thaliana, and from a synthetically designed strain of Saccharomyces cerevisiae exposed to a changing environment
Addressing the shortcomings of three recent Bayesian methods for detecting interspecific recombination in DNA sequence alignments
We address a potential shortcoming of three probabilistic models for detecting interspecific recombination in DNA sequence alignments: the multiple change-point model (MCP) of Suchard et al. (2003), the dual multiple change-point model (DMCP) of Minin et al. (2005), and the phylogenetic factorial hidden Markov model (PFHMM) of Husmeier (2005). These models are based on the Bayesian paradigm, which requires the solution of an integral over the space of branch lengths. To render this integration analytically tractable, all three models make the same assumption that the vectors of branch lengths of the phylogenetic tree are independent among sites. While this approximation reduces the computational complexity considerably, we show that it leads to the systematic prediction of spurious topology changes in the Felsenstein zone, that is, the area in the branch lengths configuration space where maximum parsimony consistently infers the wrong topology due to long-branch attraction. We apply two Bayesian hypothesis tests, based on an inter- and an intra-model approach to estimating the marginal likelihood. We then propose a revised model that addresses these shortcomings, and compare it with the aforementioned models on a set of synthetic DNA sequence alignments systematically generated around the Felsenstein zone
- …