
    Identifying targets of multiple co-regulating transcription factors from expression time-series by Bayesian model comparison

    Background: Complete transcriptional regulatory network inference is a huge challenge because of the complexity of the network and the sparsity of available data. One approach to make it more manageable is to focus on the inference of context-specific networks involving a few interacting transcription factors (TFs) and all of their target genes. Results: We present a computational framework for Bayesian statistical inference of the target genes of multiple interacting TFs from high-throughput gene expression time-series data. We use ordinary differential equation models that describe the transcription of target genes, taking combinatorial regulation into account. The method consists of a training phase and a prediction phase. During the training phase we infer the unobserved TF protein concentrations on a subnetwork of approximately known regulatory structure. During the prediction phase we apply Bayesian model selection on a genome-wide scale and score all alternative regulatory structures for each target gene. We use our methodology to identify targets of five TFs regulating Drosophila melanogaster mesoderm development. We find that confidently predicted links between TFs and targets are significantly enriched for supporting ChIP-chip binding events and annotated TF-gene interactions, and that our method outperforms existing alternatives by a statistically significant margin. Conclusions: Our results show that it is possible to infer regulatory links between multiple interacting TFs and their target genes even from a single, relatively short time series and in the presence of unmodelled confounders and unreliable prior knowledge on training network connectivity. Introducing data from several different experimental perturbations significantly increases the accuracy.
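    The scoring step lends itself to a compact illustration. Below is a minimal sketch of the per-gene model-comparison idea: every subset of candidate TFs is scored against a target gene's expression profile and the subsets are ranked. It substitutes a linear model scored by BIC for the paper's ODE models and full Bayesian marginal likelihoods, so all modelling choices here are simplifying assumptions, not the authors' method.

```python
# Simplified stand-in for the paper's Bayesian model selection: rank TF subsets
# per target gene by an approximate model score (BIC of a linear fit).
from itertools import combinations
import numpy as np

def bic_score(X, y):
    """BIC of an ordinary least-squares fit of y on X (with intercept)."""
    n = len(y)
    X1 = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    sigma2 = max(resid @ resid / n, 1e-12)
    loglik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)
    return loglik - 0.5 * X1.shape[1] * np.log(n)

def rank_regulatory_structures(tf_profiles, gene_expr, max_tfs=2):
    """Score all TF subsets up to size max_tfs for one target gene."""
    n_tfs = tf_profiles.shape[1]
    scores = {}
    for k in range(1, max_tfs + 1):
        for subset in combinations(range(n_tfs), k):
            scores[subset] = bic_score(tf_profiles[:, subset], gene_expr)
    return sorted(scores.items(), key=lambda kv: -kv[1])

# Toy usage: 5 TFs, 12 time points, target gene driven by TFs 0 and 3.
rng = np.random.default_rng(0)
tfs = rng.normal(size=(12, 5))
gene = 1.5 * tfs[:, 0] - 2.0 * tfs[:, 3] + 0.1 * rng.normal(size=12)
print(rank_regulatory_structures(tfs, gene)[:3])
```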

    Multi-objective optimization using Deep Gaussian Processes: Application to Aerospace Vehicle Design

    This paper focuses on the problem of constrained multi-objective design optimization of aerospace vehicles. The design of such vehicles often involves disciplinary legacy models treated as black boxes, with computationally expensive simulations characterized by possible non-stationary behavior (an abrupt change in the response, or differing smoothness across the design space). The expensive cost of an exact function evaluation makes classical evolutionary multi-objective algorithms intractable. While Bayesian optimization based on Gaussian process regression can handle the expensive cost of the evaluations, the non-stationary behavior of the functions can make it inefficient. A recent approach coupling Bayesian optimization with Deep Gaussian Processes showed promising results for single-objective non-stationary problems. This paper extends that approach to the multi-objective context. The efficiency of the proposed approach is assessed against classical optimization methods on an analytical test case and on an aerospace design problem.
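    As a rough illustration of the optimization loop, the sketch below runs Bayesian optimization on a toy bi-objective problem with an ordinary Gaussian process surrogate (scikit-learn) standing in for the paper's Deep Gaussian Processes, using ParEGO-style random scalarization with expected improvement. The test function, the pool-based acquisition maximization, and all parameter choices are illustrative assumptions.

```python
# Multi-objective Bayesian optimization sketch: GP surrogate on a random
# Tchebycheff scalarization, expected-improvement acquisition over a pool.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def objectives(x):                      # toy bi-objective problem on [0, 1]
    return np.array([x[0] ** 2, (x[0] - 1.0) ** 2])

rng = np.random.default_rng(1)
X = rng.uniform(size=(5, 1))            # initial design
Y = np.array([objectives(x) for x in X])

for it in range(20):
    w = rng.dirichlet(np.ones(Y.shape[1]))           # random weights
    y = (w * Y).max(axis=1) + 0.05 * (w * Y).sum(axis=1)  # scalarization
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5),
                                  normalize_y=True).fit(X, y)
    cand = rng.uniform(size=(500, 1))                # random candidate pool
    mu, sd = gp.predict(cand, return_std=True)
    best = y.min()
    z = (best - mu) / np.maximum(sd, 1e-9)
    ei = (best - mu) * norm.cdf(z) + sd * norm.pdf(z)  # expected improvement
    x_new = cand[np.argmax(ei)]
    X = np.vstack([X, x_new])
    Y = np.vstack([Y, objectives(x_new)])

# Extract the non-dominated (Pareto-optimal) evaluated points.
dominated = np.array([np.any(np.all(Y <= yi, axis=1) & np.any(Y < yi, axis=1))
                      for yi in Y])
print("Pareto-optimal points found:", (~dominated).sum())
```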

    Fast parameter inference in a biomechanical model of the left ventricle by using statistical emulation

    A central problem in biomechanical studies of personalized human left ventricular modelling is estimating the material properties and biophysical parameters from in vivo clinical measurements in a timeframe suitable for use within a clinic. Understanding these properties can provide insight into heart function or dysfunction and help to inform personalized medicine. However, finding a solution to the differential equations which mathematically describe the kinematics and dynamics of the myocardium through numerical integration can be computationally expensive. To circumvent this issue, we use the concept of emulation to infer the myocardial properties of a healthy volunteer in a viable clinical timeframe by using in vivo magnetic resonance image data. Emulation methods avoid computationally expensive simulations from the left ventricular model by replacing the biomechanical model, which is defined in terms of explicit partial differential equations, with a surrogate model inferred from simulations generated before the arrival of a patient, vastly improving computational efficiency at the clinic. We compare and contrast two emulation strategies: emulation of the computational model outputs, and emulation of the loss between the observed patient data and the computational model outputs. These strategies are tested with two interpolation methods and two loss functions, and the best combination is identified by comparing the accuracy of parameter inference on simulated data. The best combination, output emulation with local Gaussian process interpolation and the Euclidean loss function, provides accurate parameter inference in both simulated and clinical data, with a reduction in computational cost of about three orders of magnitude compared with numerical integration of the differential equations using finite-element discretization techniques.
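    The two-phase workflow can be sketched compactly: an offline phase fits one Gaussian process emulator per model output on pre-computed simulator runs, and a clinic-time phase infers parameters by minimizing the Euclidean loss between patient data and emulator predictions, with no further simulator calls. The toy simulator and parameter ranges below are stand-ins for the finite-element left-ventricular model; treat this as a sketch of the output-emulation strategy, not the authors' pipeline.

```python
# Output-emulation sketch: GP emulators trained offline, fast loss-based
# parameter inference at "clinic time".
import numpy as np
from scipy.optimize import minimize
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def simulator(theta):           # expensive-model stand-in (2 params -> 4 outputs)
    a, b = theta
    t = np.linspace(0, 1, 4)
    return a * np.exp(-b * t)

# Offline phase: design points, simulator runs, one GP per output dimension.
rng = np.random.default_rng(2)
thetas = rng.uniform([0.5, 0.1], [2.0, 3.0], size=(60, 2))
runs = np.array([simulator(th) for th in thetas])
emulators = [GaussianProcessRegressor(RBF(), normalize_y=True)
             .fit(thetas, runs[:, j]) for j in range(runs.shape[1])]

def emulated_loss(theta, data):
    pred = np.array([gp.predict(theta[None, :])[0] for gp in emulators])
    return np.sum((pred - data) ** 2)      # Euclidean loss, as in the paper

# Clinic phase: fast inference for "patient" data, no simulator calls needed.
truth = np.array([1.3, 1.7])
data = simulator(truth) + 0.01 * rng.normal(size=4)
fit = minimize(emulated_loss, x0=[1.0, 1.0], args=(data,),
               bounds=[(0.5, 2.0), (0.1, 3.0)])
print("true:", truth, "estimated:", fit.x)
```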

    The Variational Garrote

    In this paper, we present a new variational method for sparse regression using $L_0$ regularization. The variational parameters appear in the approximate model in a way that is similar to Breiman's Garrote model; we therefore refer to the method as the variational Garrote (VG). We show that the combination of the variational approximation and $L_0$ regularization has the effect of making the problem effectively of maximal rank even when the number of samples is small compared to the number of variables. The VG is compared numerically with the Lasso method, ridge regression and the recently introduced paired mean field method (PMF) (M. Titsias & M. Lázaro-Gredilla, NIPS 2012). Numerical results show that the VG and PMF yield more accurate predictions and more accurately reconstruct the true model than the other methods. The VG finds correct solutions when the Lasso solution is inconsistent due to large input correlations. Globally, the VG is significantly faster than PMF and tends to perform better as the problems become denser and in problems with strongly correlated inputs. The naive implementation of the VG scales cubically with the number of features. By introducing Lagrange multipliers we obtain a dual formulation of the problem that scales cubically in the number of samples but close to linearly in the number of features.
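    A simplified numerical sketch of the idea follows: binary spike variables s_i with an L0 penalty gamma are replaced by variational inclusion probabilities m_i, and a factorized free energy is minimized by alternating a ridge-like solve for the weights with coordinate-wise sigmoid fixed points for m. The updates below are derived for this simplified objective with fixed hyperparameters, so they illustrate the spirit of the VG rather than reproduce the paper's algorithm.

```python
# Simplified variational-Garrote-style sparse regression: alternate a weighted
# least-squares step for w with sigmoid fixed-point updates for the inclusion
# probabilities m, under an L0 penalty gamma and noise precision beta.
import numpy as np

def variational_garrote(X, y, gamma=2.0, beta=10.0, iters=50):
    n, p = X.shape
    xsq = (X ** 2).sum(axis=0)            # ||x_i||^2 per feature
    m = np.full(p, 0.5)                   # variational inclusion probabilities
    w = np.zeros(p)
    for _ in range(iters):
        # w-step: minimize the expected squared error over the weights.
        Xm = X * m
        A = Xm.T @ Xm + np.diag(m * (1 - m) * xsq) + 1e-8 * np.eye(p)
        w = np.linalg.solve(A, Xm.T @ y)
        # m-step: coordinate-wise sigmoid fixed point of the free energy.
        for i in range(p):
            r_i = X[:, i] @ (y - X @ (m * w)) + m[i] * w[i] * xsq[i]
            logit = beta * w[i] * r_i - 0.5 * beta * w[i] ** 2 * xsq[i] - gamma
            m[i] = 1.0 / (1.0 + np.exp(-np.clip(logit, -50, 50)))
    return m, w

# Toy usage: 100 samples, 20 features, 3 truly active.
rng = np.random.default_rng(3)
X = rng.normal(size=(100, 20))
w_true = np.zeros(20)
w_true[[2, 7, 11]] = [1.5, -2.0, 1.0]
y = X @ w_true + 0.1 * rng.normal(size=100)
m, w = variational_garrote(X, y)
print("selected:", np.where(m > 0.5)[0])
```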

    Scalable inference for a full multivariate stochastic volatility model

    We introduce a multivariate stochastic volatility model that imposes no restrictions on the structure of the volatility matrix and treats all of its elements as functions of latent stochastic processes. Inference is achieved via a carefully designed, feasible and scalable MCMC scheme that has quadratic, rather than cubic, computational complexity for evaluating the required multivariate normal densities. We illustrate how our model can be applied to macroeconomic problems through a stochastic volatility VAR model, comparing it to competing approaches in the literature. We also demonstrate how our approach can be applied to a large dataset containing the daily returns of 571 stocks from the Euro STOXX index.
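    To make the model class concrete, the sketch below simulates from a fully unrestricted multivariate stochastic volatility model in which every element of a Cholesky factor of the time-t covariance follows its own latent AR(1) process. The parameterization and parameter values are illustrative assumptions; the paper's contribution, the scalable MCMC sampler, is not reproduced here.

```python
# Simulate a fully flexible multivariate stochastic volatility model: each
# element of the Cholesky factor L_t of Cov(y_t) = L_t L_t^T follows its own
# latent AR(1) process; the diagonal is exponentiated to stay positive.
import numpy as np

def simulate_msv(T=500, d=3, phi=0.98, tau=0.05, seed=4):
    rng = np.random.default_rng(seed)
    k = d * (d + 1) // 2                  # one latent process per element
    h = np.zeros((T, k))
    for t in range(1, T):                 # AR(1) latent dynamics
        h[t] = phi * h[t - 1] + tau * rng.normal(size=k)
    il = np.tril_indices(d)
    y = np.empty((T, d))
    for t in range(T):
        L = np.zeros((d, d))
        L[il] = h[t]
        L[np.diag_indices(d)] = np.exp(L[np.diag_indices(d)])
        y[t] = L @ rng.normal(size=d)     # y_t ~ N(0, L L^T)
    return y, h

returns, latents = simulate_msv()
print(returns.shape, np.cov(returns[-100:].T).round(3))
```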

    Variational Bayes for High-Dimensional Linear Regression With Sparse Priors

    We study a mean-field spike and slab variational Bayes (VB) approximation to Bayesian model selection priors in sparse high-dimensional linear regression. Under compatibility conditions on the design matrix, oracle inequalities are derived for the mean-field VB approximation, implying that it converges to the sparse truth at the optimal rate and gives optimal prediction of the response vector. The empirical performance of our algorithm is studied, showing that it performs comparably to other state-of-the-art Bayesian variable selection methods. We also numerically demonstrate that the widely used coordinate-ascent variational inference algorithm can be highly sensitive to the parameter updating order, leading to potentially poor performance. To mitigate this, we propose a novel prioritized updating scheme that uses a data-driven updating order and performs better in simulations. The variational algorithm is implemented in the R package sparsevb. Supplementary materials for this article are available online.
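    The coordinate-ascent algorithm and the effect of the updating order are easy to sketch. The code below implements standard mean-field updates for spike-and-slab regression with a Gaussian slab and updates coordinates in a data-driven order (largest absolute correlation with the current residual first). The sparsevb package uses a Laplace slab and its own prioritization rule, so both the slab choice and the ordering criterion here are illustrative assumptions.

```python
# Coordinate-ascent VI for spike-and-slab regression with a Gaussian slab:
# q(b_j) = N(mu_j, s2_j), q(z_j = 1) = alpha_j, updated in a data-driven order.
import numpy as np

def ss_cavi(X, y, sigma2=1.0, slab_var=1.0, pi=0.1, sweeps=30):
    n, p = X.shape
    xsq = (X ** 2).sum(axis=0)
    alpha, mu = np.full(p, pi), np.zeros(p)
    s2 = sigma2 / (xsq + sigma2 / slab_var)       # fixed per coordinate
    for _ in range(sweeps):
        resid = y - X @ (alpha * mu)
        order = np.argsort(-np.abs(X.T @ resid))  # prioritized update order
        for j in order:
            resid += X[:, j] * alpha[j] * mu[j]   # remove j's contribution
            mu[j] = s2[j] / sigma2 * (X[:, j] @ resid)
            logit = (np.log(pi / (1 - pi)) + 0.5 * np.log(s2[j] / slab_var)
                     + mu[j] ** 2 / (2 * s2[j]))
            alpha[j] = 1.0 / (1.0 + np.exp(-np.clip(logit, -50, 50)))
            resid -= X[:, j] * alpha[j] * mu[j]   # add back updated term
    return alpha, mu

# Toy usage: 200 samples, 50 features, 3 truly active.
rng = np.random.default_rng(5)
X = rng.normal(size=(200, 50))
beta = np.zeros(50)
beta[[3, 17, 40]] = [2.0, -1.5, 1.0]
y = X @ beta + rng.normal(size=200)
alpha, mu = ss_cavi(X, y)
print("selected:", np.where(alpha > 0.5)[0])
```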