2,968 research outputs found

    Joint estimation of multiple related biological networks

    Full text link
    Graphical models are widely used to make inferences concerning interplay in multivariate systems. In many applications, data are collected from multiple related but nonidentical units whose underlying networks may differ but are likely to share features. Here we present a hierarchical Bayesian formulation for joint estimation of multiple networks in this nonidentically distributed setting. The approach is general: given a suitable class of graphical models, it uses an exchangeability assumption on networks to provide a corresponding joint formulation. Motivated by emerging experimental designs in molecular biology, we focus on time-course data with interventions, using dynamic Bayesian networks as the graphical models. We introduce a computationally efficient, deterministic algorithm for exact joint inference in this setting. We provide an upper bound on the gains that joint estimation offers relative to separate estimation for each network and empirical results that support and extend the theory, including an extensive simulation study and an application to proteomic data from human cancer cell lines. Finally, we describe approximations that are still more computationally efficient than the exact algorithm and that also demonstrate good empirical performance.Comment: Published in at http://dx.doi.org/10.1214/14-AOAS761 the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Uncovering distinct protein-network topologies in heterogeneous cell populations

    Get PDF
    Background: Cell biology research is fundamentally limited by the number of intracellular components, particularly proteins, that can be co-measured in the same cell. Therefore, cell-to-cell heterogeneity in unmeasured proteins can lead to completely different observed relations between the same measured proteins. Attempts to infer such relations in a heterogeneous cell population can yield uninformative average relations if only one underlying biochemical network is assumed. To address this, we developed a method that recursively couples an iterative unmixing process with a Bayesian analysis of each unmixed subpopulation. Results: Our approach enables to identify the number of distinct cell subpopulations, unmix their corresponding observations and resolve the network structure of each subpopulation. Using simulations of the MAPK pathway upon EGF and NGF stimulations we assess the performance of the method. We demonstrate that the presented method can identify better than clustering approaches the number of subpopulations within a mixture of observations, thus resolving correctly the statistical relations between the proteins. Conclusions: Coupling the unmixing of multiplexed observations with the inference of statistical relations between the measured parameters is essential for the success of both of these processes. Here we present a conceptual and algorithmic solution to achieve such coupling and hence to analyze data obtained from a natural mixture of cell populations. As the technologies and necessity for multiplexed measurements are rising in the systems biology era, this work addresses an important current challenge in the analysis of the derived data.Fil: Wieczorek, Jakob. Universitat Dortmund; AlemaniaFil: Malik Sheriff, Rahuman S.. Institut Max Planck fur Molekulare Physiologie; Alemania. Imperial College London; Reino Unido. European Bioinformatics Institute. European Molecular Biology Laboratory; Reino UnidoFil: Fermin, Yessica. Universitat Dortmund; AlemaniaFil: Grecco, Hernan Edgardo. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Física de Buenos Aires. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Instituto de Física de Buenos Aires; Argentina. Institut Max Planck fur Molekulare Physiologie; AlemaniaFil: Zamir, Eli. Institut Max Planck fur Molekulare Physiologie; AlemaniaFil: Ickstadt, Katja. Universitat Dortmund; Alemani

    Joint Structure Learning of Multiple Non-Exchangeable Networks

    Full text link
    Several methods have recently been developed for joint structure learning of multiple (related) graphical models or networks. These methods treat individual networks as exchangeable, such that each pair of networks are equally encouraged to have similar structures. However, in many practical applications, exchangeability in this sense may not hold, as some pairs of networks may be more closely related than others, for example due to group and sub-group structure in the data. Here we present a novel Bayesian formulation that generalises joint structure learning beyond the exchangeable case. In addition to a general framework for joint learning, we (i) provide a novel default prior over the joint structure space that requires no user input; (ii) allow for latent networks; (iii) give an efficient, exact algorithm for the case of time series data and dynamic Bayesian networks. We present empirical results on non-exchangeable populations, including a real data example from biology, where cell-line-specific networks are related according to genomic features.Comment: To appear in Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics (AISTATS

    Perturbation Detection Through Modeling of Gene Expression on a Latent Biological Pathway Network: A Bayesian hierarchical approach

    Full text link
    Cellular response to a perturbation is the result of a dynamic system of biological variables linked in a complex network. A major challenge in drug and disease studies is identifying the key factors of a biological network that are essential in determining the cell's fate. Here our goal is the identification of perturbed pathways from high-throughput gene expression data. We develop a three-level hierarchical model, where (i) the first level captures the relationship between gene expression and biological pathways using confirmatory factor analysis, (ii) the second level models the behavior within an underlying network of pathways induced by an unknown perturbation using a conditional autoregressive model, and (iii) the third level is a spike-and-slab prior on the perturbations. We then identify perturbations through posterior-based variable selection. We illustrate our approach using gene transcription drug perturbation profiles from the DREAM7 drug sensitivity predication challenge data set. Our proposed method identified regulatory pathways that are known to play a causative role and that were not readily resolved using gene set enrichment analysis or exploratory factor models. Simulation results are presented assessing the performance of this model relative to a network-free variant and its robustness to inaccuracies in biological databases

    Receptor Tyrosine Kinases Fall into Distinct Classes Based on Their Inferred Signaling Networks

    Get PDF
    Although many anticancer drugs that target receptor tyrosine kinases (RTKs) provide clinical benefit, their long-term use is limited by resistance that is often attributed to increased abundance or activation of another RTK that compensates for the inhibited receptor. To uncover common and unique features in the signaling networks of RTKs, we measured time-dependent signaling in six isogenic cell lines, each expressing a different RTK as downstream proteins were systematically perturbed by RNA interference. Network models inferred from the data revealed a conserved set of signaling pathways and RTK-specific features that grouped the RTKs into three distinct classes: (i) an EGFR/FGFR1/c-Met class constituting epidermal growth factor receptor, fibroblast growth factor receptor 1, and the hepatocyte growth factor receptor c-Met; (ii) an IGF-1R/NTRK2 class constituting insulin-like growth factor 1 receptor and neurotrophic tyrosine receptor kinase 2; and (iii) a PDGFRβ class constituting platelet-derived growth factor receptor β. Analysis of cancer cell line data showed that many RTKs of the same class were coexpressed and that increased abundance of an RTK or its cognate ligand frequently correlated with resistance to a drug targeting another RTK of the same class. In contrast, abundance of an RTK or ligand of one class generally did not affect sensitivity to a drug targeting an RTK of a different class. Thus, classifying RTKs by their inferred networks and then therapeutically targeting multiple receptors within a class may delay or prevent the onset of resistance.W. M. Keck FoundationNational Institutes of Health (U.S.) (R21 CA126720)National Institutes of Health (U.S.) (P50 GM068762)National Institutes of Health (U.S.) (RC1 HG005354)National Institutes of Health (U.S.) (U54-CA112967)National Institutes of Health (U.S.) (R01-CA096504)Alfred and Isabel Bader (Fellowship)Jacques-Emile Dubois (fellowship

    Causal network inference using biochemical kinetics

    Get PDF
    Motivation: Networks are widely used as structural summaries of biochemical systems. Statistical estimation of networks is usually based on linear or discrete models. However, the dynamics of biochemical systems are generally non-linear, suggesting that suitable non-linear formulations may offer gains with respect to causal network inference and aid in associated prediction problems. Results: We present a general framework for network inference and dynamical prediction using time course data that is rooted in nonlinear biochemical kinetics. This is achieved by considering a dynamical system based on a chemical reaction graph with associated kinetic parameters. Both the graph and kinetic parameters are treated as unknown; inference is carried out within a Bayesian framework. This allows prediction of dynamical behavior even when the underlying reaction graph itself is unknown or uncertain. Results, based on (i) data simulated from a mechanistic model of mitogen-activated protein kinase signaling and (ii) phosphoproteomic data from cancer cell lines, demonstrate that non-linear formulations can yield gains in causal network inference and permit dynamical prediction and uncertainty quantification in the challenging setting where the reaction graph is unknown. © The Author 2014. Published by Oxford University Press

    Exact reconstruction of gene regulatory networks using compressive sensing.

    Get PDF
    BackgroundWe consider the problem of reconstructing a gene regulatory network structure from limited time series gene expression data, without any a priori knowledge of connectivity. We assume that the network is sparse, meaning the connectivity among genes is much less than full connectivity. We develop a method for network reconstruction based on compressive sensing, which takes advantage of the network's sparseness.ResultsFor the case in which all genes are accessible for measurement, and there is no measurement noise, we show that our method can be used to exactly reconstruct the network. For the more general problem, in which hidden genes exist and all measurements are contaminated by noise, we show that our method leads to reliable reconstruction. In both cases, coherence of the model is used to assess the ability to reconstruct the network and to design new experiments. We demonstrate that it is possible to use the coherence distribution to guide biological experiment design effectively. By collecting a more informative dataset, the proposed method helps reduce the cost of experiments. For each problem, a set of numerical examples is presented.ConclusionsThe method provides a guarantee on how well the inferred graph structure represents the underlying system, reveals deficiencies in the data and model, and suggests experimental directions to remedy the deficiencies

    An empirical Bayesian approach for model-based inference of cellular signaling networks

    Get PDF
    Background A common challenge in systems biology is to infer mechanistic descriptions of biological process given limited observations of a biological system. Mathematical models are frequently used to represent a belief about the causal relationships among proteins within a signaling network. Bayesian methods provide an attractive framework for inferring the validity of those beliefs in the context of the available data. However, efficient sampling of high-dimensional parameter space and appropriate convergence criteria provide barriers for implementing an empirical Bayesian approach. The objective of this study was to apply an Adaptive Markov chain Monte Carlo technique to a typical study of cellular signaling pathways. Results As an illustrative example, a kinetic model for the early signaling events associated with the epidermal growth factor (EGF) signaling network was calibrated against dynamic measurements observed in primary rat hepatocytes. A convergence criterion, based upon the Gelman-Rubin potential scale reduction factor, was applied to the model predictions. The posterior distributions of the parameters exhibited complicated structure, including significant covariance between specific parameters and a broad range of variance among the parameters. The model predictions, in contrast, were narrowly distributed and were used to identify areas of agreement among a collection of experimental studies. Conclusion In summary, an empirical Bayesian approach was developed for inferring the confidence that one can place in a particular model that describes signal transduction mechanisms and for inferring inconsistencies in experimental measurements
    • …
    corecore