Identifying targets of multiple co-regulating transcription factors from expression time-series by Bayesian model comparison
Background: Complete transcriptional regulatory network inference is a huge challenge because of the complexity
of the network and sparsity of available data. One approach to make it more manageable is to focus on the inference
of context-specific networks involving a few interacting transcription factors (TFs) and all of their target genes.
Results: We present a computational framework for Bayesian statistical inference of target genes of multiple
interacting TFs from high-throughput gene expression time-series data. We use ordinary differential equation models
that describe transcription of target genes taking into account combinatorial regulation. The method consists of a
training and a prediction phase. During the training phase we infer the unobserved TF protein concentrations on a
subnetwork of approximately known regulatory structure. During the prediction phase we apply Bayesian model
selection on a genome-wide scale and score all alternative regulatory structures for each target gene. We use our
methodology to identify targets of five TFs regulating Drosophila melanogaster mesoderm development. We find that
confident predicted links between TFs and targets are significantly enriched for supporting ChIP-chip binding events
and annotated TF-gene interactions. Our method outperforms existing alternatives by a statistically significant margin.
Conclusions: Our results show that it is possible to infer regulatory links between multiple interacting TFs and their
target genes even from a single, relatively short time series, in the presence of unmodelled confounders and
unreliable prior knowledge of the training-network connectivity. Introducing data from several different experimental
perturbations significantly increases the accuracy.
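The prediction-phase idea of scoring alternative regulatory structures for each target gene can be sketched with a deliberately simplified stand-in: here candidate TF subsets are scored by BIC from a linear fit, whereas the paper uses ODE transcription models and full Bayesian model comparison. The TF names (`twi`, `mef2`) and the scoring rule are illustrative assumptions, not the paper's implementation.

```python
import itertools
import numpy as np

def bic_score(y, X):
    """BIC for a least-squares fit of y on X (lower is better)."""
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ beta) ** 2)
    return n * np.log(rss / n + 1e-12) + k * np.log(n)

def score_structures(y, tf_profiles):
    """Score every non-empty subset of TFs as a candidate regulatory
    structure for one target gene; return subsets sorted by BIC."""
    names = list(tf_profiles)
    scores = {}
    for r in range(1, len(names) + 1):
        for subset in itertools.combinations(names, r):
            X = np.column_stack([tf_profiles[tf] for tf in subset])
            scores[subset] = bic_score(y, X)
    return sorted(scores.items(), key=lambda kv: kv[1])

# Toy example: a target driven by the TF 'twi' plus noise.
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 30)
tfs = {"twi": np.sin(2 * np.pi * t), "mef2": np.cos(2 * np.pi * t)}
y = 2.0 * tfs["twi"] + 0.05 * rng.standard_normal(30)
best_subset, _ = score_structures(y, tfs)[0]
print(best_subset)  # best-scoring structure (should include 'twi')
```

In the paper's genome-wide setting the same enumerate-and-score loop runs per gene, but with marginal likelihoods of ODE models rather than BIC values.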
Multi-objective optimization using Deep Gaussian Processes: Application to Aerospace Vehicle Design
This paper is focused on the problem of constrained multi-objective design optimization of aerospace vehicles. The design of such vehicles often involves disciplinary legacy models considered as black-box and computationally expensive simulations characterized by a possible non-stationary behavior (an abrupt change in the response or a different smoothness along the design space). The expensive cost of an exact function evaluation makes the use of classical evolutionary multi-objective algorithms not tractable. While Bayesian Optimization based on Gaussian Process regression can handle the expensive cost of the evaluations, the non-stationary behavior of the functions can make it inefficient. A recent approach consisting of coupling Bayesian Optimization with Deep Gaussian Processes showed promising results for single-objective non-stationary problems. This paper presents an extension of this approach to the multi-objective context. The efficiency of the proposed approach is assessed with respect to classical optimization methods on an analytical test case and on an aerospace design problem.
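A basic building block of any multi-objective method, including the Bayesian approach above, is identifying the non-dominated (Pareto-optimal) points among evaluated designs. A minimal, generic sketch for minimization, not tied to the paper's Deep-GP machinery:

```python
import numpy as np

def pareto_front(points):
    """Return the indices of non-dominated points for minimization:
    a point is kept if no other point is <= in every objective and
    strictly < in at least one."""
    pts = np.asarray(points, dtype=float)
    keep = []
    for i, p in enumerate(pts):
        dominated = any(
            np.all(q <= p) and np.any(q < p)
            for j, q in enumerate(pts) if j != i
        )
        if not dominated:
            keep.append(i)
    return keep

objs = [(1.0, 4.0), (2.0, 2.0), (3.0, 3.0), (4.0, 1.0)]
print(pareto_front(objs))  # -> [0, 1, 3]; (3, 3) is dominated by (2, 2)
```

Multi-objective Bayesian optimization wraps such a dominance test inside an acquisition function that proposes new expensive evaluations.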
Fast parameter inference in a biomechanical model of the left ventricle by using statistical emulation
A central problem in biomechanical studies of personalized human left ventricular modelling is estimating the material properties and biophysical parameters from in vivo clinical measurements in a timeframe that is suitable for use within a clinic. Understanding these properties can provide insight into heart function or dysfunction and help to inform personalized medicine. However, finding a solution to the differential equations which mathematically describe the kinematics and dynamics of the myocardium through numerical integration can be computationally expensive. To circumvent this issue, we use the concept of emulation to infer the myocardial properties of a healthy volunteer in a viable clinical timeframe by using in vivo magnetic resonance image data. Emulation methods avoid computationally expensive simulations from the left ventricular model by replacing the biomechanical model, which is defined in terms of explicit partial differential equations, with a surrogate model inferred from simulations generated before the arrival of a patient, vastly improving computational efficiency at the clinic. We compare and contrast two emulation strategies: emulation of the computational model outputs and emulation of the loss between the observed patient data and the computational model outputs. These strategies are tested with two interpolation methods, as well as two loss functions. The best combination of methods is found by comparing the accuracy of parameter inference on simulated data for each combination. This combination, using the output emulation method, with local Gaussian process interpolation and the Euclidean loss function, provides accurate parameter inference in both simulated and clinical data, with a reduction in the computational cost of about three orders of magnitude compared with numerical integration of the differential equations by using finite element discretization techniques.
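The emulation idea — train a surrogate offline on pre-computed simulations, then do fast parameter inference against observed data — can be sketched generically. The "simulator" below is a cheap stand-in for the finite-element model, and the plain RBF-kernel GP and grid search are simplifying assumptions, not the paper's local Gaussian process interpolation:

```python
import numpy as np

def rbf_kernel(A, B, length=0.3):
    """Squared-exponential kernel between row-wise point sets."""
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-0.5 * d2 / length**2)

class GPEmulator:
    """GP-regression surrogate for an expensive simulator,
    trained offline on (parameter, output) pairs."""
    def __init__(self, theta_train, y_train, noise=1e-6):
        self.X = theta_train
        K = rbf_kernel(theta_train, theta_train)
        self.alpha = np.linalg.solve(K + noise * np.eye(len(K)), y_train)

    def predict(self, theta):
        return rbf_kernel(np.atleast_2d(theta), self.X) @ self.alpha

# Stand-in "expensive" simulator (the real one would be a PDE solve).
simulator = lambda th: th[:, 0] ** 2 + 0.5 * th[:, 0]

# Offline phase: run the simulator on a design of parameter values.
theta_train = np.linspace(0, 1, 25)[:, None]
emu = GPEmulator(theta_train, simulator(theta_train))

# Online phase: grid-search the emulated Euclidean loss to the data.
theta_true = np.array([[0.6]])
y_obs = simulator(theta_true)[0]
grid = np.linspace(0, 1, 201)[:, None]
losses = (emu.predict(grid).ravel() - y_obs) ** 2
theta_hat = grid[np.argmin(losses), 0]
print(round(theta_hat, 2))  # close to 0.6
```

The online phase never calls the simulator, which is the source of the reported orders-of-magnitude speed-up at the clinic.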
The Variational Garrote
In this paper, we present a new variational method for sparse regression
using L0 regularization. The variational parameters appear in the
approximate model in a way that is similar to Breiman's Garrote model. We refer
to this method as the variational Garrote (VG). We show that the combination of
the variational approximation and L0 regularization has the effect of making
the problem effectively of maximal rank even when the number of samples is
small compared to the number of variables. The VG is compared numerically with
the Lasso method, ridge regression and the recently introduced paired mean
field method (PMF) (M. Titsias & M. Lázaro-Gredilla, NIPS 2012). Numerical
results show that the VG and PMF yield more accurate predictions and more
accurately reconstruct the true model than the other methods. It is shown that
the VG finds correct solutions when the Lasso solution is inconsistent due to
large input correlations. Globally, VG is significantly faster than PMF and
tends to perform better as the problems become denser and in problems with
strongly correlated inputs. The naive implementation of the VG scales cubically
with the number of features. By introducing Lagrange multipliers we obtain a
dual formulation of the problem that scales cubically in the number of samples,
but close to linearly in the number of features.
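The dual formulation's cost shift — cubic in the number of samples instead of cubic in the number of features — mirrors the standard Woodbury/kernel trick, illustrated here for ridge regression (a generic identity, not the VG's specific dual):

```python
import numpy as np

rng = np.random.default_rng(2)
n, p, lam = 20, 500, 0.1          # many more features than samples
X = rng.standard_normal((n, p))
y = rng.standard_normal(n)

# Primal ridge solution: invert a p x p matrix, O(p^3).
w_primal = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# Dual/Woodbury form: invert an n x n matrix instead, O(n^3).
w_dual = X.T @ np.linalg.solve(X @ X.T + lam * np.eye(n), y)

print(np.allclose(w_primal, w_dual))  # the two solutions agree
```

When p is in the thousands and n in the tens, the dual route turns an intractable linear solve into a trivial one, which is the same lever the VG's dual formulation pulls.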
Scalable inference for a full multivariate stochastic volatility model
We introduce a multivariate stochastic volatility model that imposes no restrictions on the structure of the volatility matrix and treats all its elements as functions of latent stochastic processes. Inference is achieved via a carefully designed, feasible and scalable MCMC scheme that has quadratic, rather than cubic, computational complexity for evaluating the required multivariate normal densities. We illustrate how our model can be applied in macroeconomic settings through a stochastic volatility VAR model, comparing it to competing approaches in the literature. We also demonstrate how our approach can be applied to a large dataset containing the daily returns of 571 stocks from the Euro STOXX index.
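The general mechanism behind cheap repeated multivariate normal evaluations is to pay the cubic factorization cost once, after which each density evaluation needs only quadratic-cost triangular solves. A generic sketch of that cost structure (not the paper's particular MCMC construction):

```python
import numpy as np
from scipy.linalg import solve_triangular

def mvn_logpdf_factory(Sigma):
    """Factor Sigma once (O(d^3)); each later evaluation needs only
    a triangular solve and dot products (O(d^2))."""
    L = np.linalg.cholesky(Sigma)                 # lower-triangular factor
    logdet = 2.0 * np.sum(np.log(np.diag(L)))     # log det(Sigma)
    d = Sigma.shape[0]

    def logpdf(x, mu):
        z = solve_triangular(L, x - mu, lower=True)   # O(d^2) per call
        return -0.5 * (d * np.log(2 * np.pi) + logdet + z @ z)

    return logpdf

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 4))
Sigma = A @ A.T + 4 * np.eye(4)                   # positive definite
logpdf = mvn_logpdf_factory(Sigma)
x, mu = rng.standard_normal(4), np.zeros(4)
print(logpdf(x, mu))
```

Inside an MCMC loop, amortizing the factorization across many likelihood evaluations is what turns a cubic per-iteration cost into a quadratic one.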
Variational Bayes for High-Dimensional Linear Regression With Sparse Priors
We study a mean-field spike-and-slab variational Bayes (VB) approximation to Bayesian model selection priors in sparse high-dimensional linear regression. Under compatibility conditions on the design matrix, oracle inequalities are derived for the mean-field VB approximation, implying that it converges to the sparse truth at the optimal rate and gives optimal prediction of the response vector. The empirical performance of our algorithm is studied, showing that it performs comparably to other state-of-the-art Bayesian variable selection methods. We also numerically demonstrate that the widely used coordinate-ascent variational inference algorithm can be highly sensitive to the parameter updating order, leading to potentially poor performance. To mitigate this, we propose a novel prioritized updating scheme that uses a data-driven updating order and performs better in simulations. The variational algorithm is implemented in the R package sparsevb. Supplementary materials for this article are available online.
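Coordinate-ascent variational inference and its configurable update order can be illustrated on a toy mean-field Gaussian problem. In this convex toy case both orders reach the same fixed point; the paper's observation is that for spike-and-slab posteriors the order can genuinely change the result. Everything below is a generic sketch, not the sparsevb updates:

```python
import numpy as np

def cavi_means(mu, Sigma, order=(0, 1), iters=50):
    """Coordinate-ascent VI for a mean-field Gaussian approximation to
    N(mu, Sigma); returns the variational means after `iters` sweeps,
    updating the coordinates in the given order within each sweep."""
    P = np.linalg.inv(Sigma)        # precision matrix of the target
    m = np.zeros(2)                 # variational means, initialized at 0
    for _ in range(iters):
        for i in order:
            j = 1 - i
            # Closed-form CAVI update for coordinate i.
            m[i] = mu[i] - P[i, j] / P[i, i] * (m[j] - mu[j])
    return m

mu = np.array([1.0, -2.0])
Sigma = np.array([[1.0, 0.8], [0.8, 2.0]])
print(cavi_means(mu, Sigma))          # converges to the true means
print(cavi_means(mu, Sigma, (1, 0)))  # here order changes the path, not the fixed point
```

A prioritized scheme like the one proposed in the paper would replace the fixed `order` tuple with a data-driven ranking recomputed as the algorithm runs.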