150 research outputs found

Wisdom of crowds for robust gene network inference

Author: DREAM5 Consortium
Holmes Benjamin Ray
Kellis Manolis
Marbach Daniel
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/10/2011

Reconstructing gene regulatory networks from high-throughput data is a long-standing challenge. Through the Dialogue on Reverse Engineering Assessment and Methods (DREAM) project, we performed a comprehensive blind assessment of over 30 network inference methods on Escherichia coli, Staphylococcus aureus, Saccharomyces cerevisiae and in silico microarray data. We characterize the performance, data requirements and inherent biases of different inference approaches, and we provide guidelines for algorithm application and development. We observed that no single inference method performs optimally across all data sets. In contrast, integration of predictions from multiple inference methods shows robust and high performance across diverse data sets. We thereby constructed high-confidence networks for E. coli and S. aureus, each comprising ~1,700 transcriptional interactions at a precision of ~50%. We experimentally tested 53 previously unobserved regulatory interactions in E. coli, of which 23 (43%) were supported. Our results establish community-based methods as a powerful and robust tool for the inference of transcriptional gene regulatory networks. National Institutes of Health (U.S.); National Centers for Biomedical Computing (U.S.) (Roadmap Initiative (U54CA121852)); Howard Hughes Medical Institute; National Institutes of Health (U.S.) (Director's Pioneer Award DPI OD003644); Swiss National Science Foundation (Fellowship

DSpace@MIT

Bayesian Covariate-Dependent Quantile Directed Acyclic Graphical Models for Individualized Inference

Author: Baladandayuthapani Veerabhadran
Bhadra Anindya
Ni Yang
Sagar Ksheera
Publication date: 22/05/2023

We propose an approach termed "qDAGx" for Bayesian covariate-dependent quantile directed acyclic graphs (DAGs) where these DAGs are individualized, in the sense that they depend on individual-specific covariates. The individualized DAG structure of the proposed approach can be uniquely identified at any given quantile, based on purely observational data without strong assumptions such as a known topological ordering. To scale the proposed method to a large number of variables and covariates, we propose for the model parameters a novel parameter expanded horseshoe prior that affords a number of attractive theoretical and computational benefits to our approach. By modeling the conditional quantiles, qDAGx overcomes the common limitations of mean regression for DAGs, which can be sensitive to the choice of likelihood, e.g., an assumption of multivariate normality, as well as to the choice of priors. We demonstrate the performance of qDAGx through extensive numerical simulations and via an application in precision medicine, which infers patient-specific protein-protein interaction networks in lung cancer. Comment: 35 pages, 5 figures

arXiv.org e-Print Archive

Inferring bi-directional interactions between circadian clock genes and metabolism with model ensembles

Author: Aderhold
Ahmed
Andrej Aderhold
Barenco
Beal
Bläsing
Brandt
Chevaleyre
Chib
Ciocchetta
Dalchau
Davies
Dirk Husmeier
Feugier
Flis
Friedman
Geigenberger
Geiger
Gille
Graf
Grzegorczyk
Guerriero
Hanley
Hastie
Haydon
Herrero
Kalaitzis
Kikis
Ko
Kolmos
Kuncheva
Kusano
Lawrence
Locke
Marbach
Marco Grzegorczyk
Margolin
Morrissey
Pokhilko
Polikar
Rasmussen
Rogers
Solak
Strimmer
Tibshirani
Wang
Äijö
Publication venue: 'Walter de Gruyter GmbH'

Crossref

Practical Approaches to Biological Network Discovery

Author: Haynes Brian
Publication venue: Washington University Open Scholarship
Publication date: 24/05/2012

This dissertation addresses a current outstanding problem in the field of systems biology, which is to identify the structure of a transcriptional network from high-throughput experimental data. Understanding of the connectivity of a transcriptional network is an important piece of the puzzle, which relates the genotype of an organism to its phenotypes. An overwhelming number of computational approaches have been proposed to perform integrative analyses on large collections of high-throughput gene expression datasets to infer the structure of transcriptional networks. I put forth a methodology by which these tools can be evaluated and compared against one another to better understand their strengths and weaknesses. Next I undertake the task of utilizing high-throughput datasets to learn new and interesting network biology in the pathogenic fungus Cryptococcus neoformans. Finally I propose a novel computational method for mapping out transcriptional networks that unifies two orthogonal strategies for network inference. I apply this method to map out the transcriptional network of Saccharomyces cerevisiae and demonstrate how network inference results can complement chromatin immunoprecipitation (ChIP) experiments, which directly probe the binding events of transcriptional regulators. Collectively, my contributions improve both the accessibility and practicality of network inference methods

Washington University St. Louis: Open Scholarship

Inferential stability in systems biology

Author: Kirk Paul
Publication venue: Division of Molecular Biosciences, Imperial College London
Publication date: 01/03/2011

The modern biological sciences are fraught with statistical difficulties. Biomolecular stochasticity, experimental noise, and the “large p, small n” problem all contribute to the challenge of data analysis. Nevertheless, we routinely seek to draw robust, meaningful conclusions from observations. In this thesis, we explore methods for assessing the effects of data variability upon downstream inference, in an attempt to quantify and promote the stability of the inferences we make. We start with a review of existing methods for addressing this problem, focusing upon the bootstrap and similar methods. The key requirement for all such approaches is a statistical model that approximates the data generating process. We move on to consider biomarker discovery problems. We present a novel algorithm for proposing putative biomarkers on the strength of both their predictive ability and the stability with which they are selected. In a simulation study, we find our approach to perform favourably in comparison to strategies that select on the basis of predictive performance alone. We then consider the real problem of identifying protein peak biomarkers for HAM/TSP, an inflammatory condition of the central nervous system caused by HTLV-1 infection. We apply our algorithm to a set of SELDI mass spectral data, and identify a number of putative biomarkers. Additional experimental work, together with known results from the literature, provides corroborating evidence for the validity of these putative biomarkers. Having focused on static observations, we then make the natural progression to time course data sets. We propose a (Bayesian) bootstrap approach for such data, and then apply our method in the context of gene network inference and the estimation of parameters in ordinary differential equation models. 
We find that the inferred gene networks are relatively unstable, and demonstrate the importance of finding distributions of ODE parameter estimates, rather than single point estimates

Spiral - Imperial College Digital Repository

Novel methods for biological network inference: an application to circadian Ca2+ signaling network

Author: Jin Junyang
Publication venue: University of Cambridge
Publication date: 16/11/2018

Biological processes involve complex biochemical interactions among a large number of species like cells, RNA, proteins and metabolites. Learning these interactions is essential to interfering artificially with biological processes in order to, for example, improve crop yield, develop new therapies, and predict new cell or organism behaviors to genetic or environmental perturbations. For a biological process, two pieces of information are of most interest. For a particular species, the first step is to learn which other species are regulating it. This reveals topology and causality. The second step involves learning the precise mechanisms of how this regulation occurs. This step reveals the dynamics of the system. Applying this process to all species leads to the complete dynamical network. Systems biology is making considerable efforts to learn biological networks at low experimental costs. The main goal of this thesis is to develop advanced methods to build models for biological networks, taking the circadian system of Arabidopsis thaliana as a case study. A variety of network inference approaches have been proposed in the literature to study dynamic biological networks. However, many successful methods either require prior knowledge of the system or focus more on topology. This thesis presents novel methods that identify both network topology and dynamics, and do not depend on prior knowledge. Hence, the proposed methods are applicable to general biological networks. These methods are initially developed for linear systems, and, at the cost of higher computational complexity, can also be applied to nonlinear systems. Overall, we propose four methods with increasing computational complexity: one-to-one, combined group and element sparse Bayesian learning (GESBL), the kernel method and reversible jump Markov chain Monte Carlo method (RJMCMC). 
All methods are tested with challenging dynamical network simulations (including feedback, random networks, different levels of noise and number of samples), and realistic models of the circadian system of Arabidopsis thaliana. These simulations show that, while the one-to-one method scales to the whole genome, the kernel method and RJMCMC method are superior for smaller networks. They are robust to tuning variables and able to provide stable performance. The simulations also imply the advantage of GESBL and RJMCMC over the state-of-the-art method. We envision that the estimated models can benefit a wide range of research. For example, they can locate biological compounds responsible for human disease through mathematical analysis and help predict the effectiveness of new treatments

Apollo (Cambridge)

Understanding transcriptional regulation through computational analysis of single-cell transcriptomics

Author: Lim Chee Yee
Publication venue: University of Cambridge
Publication date: 08/10/2017

Gene expression is tightly regulated by complex transcriptional regulatory mechanisms to achieve specific expression patterns, which are essential to facilitate important biological processes such as embryonic development. Dysregulation of gene expression can lead to diseases such as cancers. A better understanding of the transcriptional regulation will therefore not only advance the understanding of fundamental biological processes, but also provide mechanistic insights into diseases. The earlier versions of high-throughput expression profiling techniques were limited to measuring average gene expression across large pools of cells. In contrast, recent technological improvements have made it possible to perform expression profiling in single cells. Single-cell expression profiling is able to capture heterogeneity among single cells, which is not possible in conventional bulk expression profiling. In my PhD, I focus on developing new algorithms, as well as benchmarking and utilising existing algorithms to study the transcriptomes of various biological systems using single-cell expression data. I have developed two different single-cell specific network inference algorithms, BTR and SPVAR, which are based on two different formalisms, Boolean and autoregression frameworks respectively. BTR was shown to be useful for improving existing Boolean models with single-cell expression data, while SPVAR was shown to be a conservative predictor of gene interactions using pseudotime-ordered single-cell expression data. In addition, I have obtained novel biological insights by analysing single-cell RNAseq data from the epiblast stem cells reprogramming and the leukaemia systems. Three different driver genes, namely Esrrb, Klf2 and GY118F, were shown to drive reprogramming of epiblast stem cells via different reprogramming routes. 
As for the leukaemia system, FLT3-ITD and IDH1-R132H mutations were shown to interact with each other and potentially predispose some cells for developing acute myeloid leukaemia. Wellcome Trust and Cambridge Trust

Apollo (Cambridge)

Statistical inference in mechanistic models: time warping for improved gradient matching

Author: Filippone Maurizio
Husmeier Dirk
Macdonald Benn
Niu Mu
Rogers Simon
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/06/2018

Inference in mechanistic models of non-linear differential equations is a challenging problem in current computational statistics. Due to the high computational costs of numerically solving the differential equations in every step of an iterative parameter adaptation scheme, approximate methods based on gradient matching have become popular. However, these methods critically depend on the smoothing scheme for function interpolation. The present article adapts an idea from manifold learning and demonstrates that a time warping approach aiming to homogenize intrinsic length scales can lead to a significant improvement in parameter estimation accuracy. We demonstrate the effectiveness of this scheme on noisy data from two dynamical systems with periodic limit cycle, a biopathway, and an application from soft-tissue mechanics. Our study also provides a comparative evaluation on a wide range of signal-to-noise ratios

Enlighten

Penalized estimation in high-dimensional data analysis

Author: Rooi Jacob
Publication venue: Erasmus Universiteit Rotterdam (EUR)
Publication date: 24/10/2013

EUR Research Repository