91 research outputs found

    Synthesising executable gene regulatory networks in haematopoiesis from single-cell gene expression data

    A fundamental challenge in biology is to understand the complex gene regulatory networks which control tissue development in the mammalian embryo, and maintain homoeostasis in the adult. The cell fate decisions underlying these processes are ultimately made at the level of individual cells. Recent experimental advances in biology allow researchers to obtain gene expression profiles at single-cell resolution over thousands of cells at once. These single-cell measurements provide snapshots of the states of the cells that make up a tissue, instead of the population-level averages provided by conventional high-throughput experiments. The aim of this PhD was to investigate the possibility of using this new high resolution data to reconstruct mechanistic computational models of gene regulatory networks. In this thesis I introduce the idea of viewing single-cell gene expression profiles as states of an asynchronous Boolean network, and frame model inference as the problem of reconstructing a Boolean network from its state space. I then give a scalable algorithm to solve this synthesis problem. In order to achieve scalability, this algorithm works in a modular way, treating different aspects of a graph data structure separately before encoding the search for logical rules as Boolean satisfiability problems to be dispatched to a SAT solver. Together with experimental collaborators, I applied this method to understanding the process of early blood development in the embryo, which is poorly understood due to the small number of cells present at this stage. The emergence of blood from Flk1+ mesoderm was studied by single cell expression analysis of 3934 cells at four sequential developmental time points. A mechanistic model recapitulating blood development was reconstructed from this data set, which was consistent with known biology and the bifurcation of blood and endothelium. 
Several model predictions were validated experimentally, demonstrating that HoxB4 and Sox17 directly regulate the haematopoietic factor Erg, and that Sox7 blocks primitive erythroid development. A general-purpose graphical tool was then developed based on this algorithm, which can be used by biological researchers as new single-cell data sets become available. This tool can deploy computations to the cloud in order to scale up to larger high-throughput data sets. The results in this thesis demonstrate that single-cell analysis of a developing organ coupled with computational approaches can reveal the gene regulatory networks that underpin organogenesis. Rapid technological advances in our ability to perform single-cell profiling suggest that my tool will be applicable to other organ systems and may inform the development of improved cellular programming strategies. Funding: Microsoft Research PhD Scholarship.
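
The idea of treating single-cell profiles as states of an asynchronous Boolean network can be illustrated with a minimal sketch. In an asynchronous network each transition changes one gene, so a natural first step is to connect observed states that differ in exactly one gene. The function name, the three-gene profiles and the binarisation are assumptions made for illustration; this is not the thesis's implementation.

```python
from itertools import combinations


def hamming_one_edges(states):
    """Return pairs of distinct observed states that differ in exactly one gene."""
    edges = []
    for a, b in combinations(sorted(set(states)), 2):
        if sum(x != y for x, y in zip(a, b)) == 1:
            edges.append((a, b))
    return edges


# Hypothetical binarised expression profiles over three genes (g1, g2, g3).
cells = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (1, 1, 1), (0, 0, 1)]
edges = hamming_one_edges(cells)

for a, b in edges:
    print(a, "<->", b)
```

Each edge is a candidate single-gene update that a synthesised rule set would have to explain. The search for the logical rules themselves, which the abstract says is encoded as Boolean satisfiability problems, is not shown here.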

    SCNS: a graphical tool for reconstructing executable regulatory networks from single-cell genomic data.

    Background: Reconstruction of executable mechanistic models from single-cell gene expression data represents a powerful approach to understanding developmental and disease processes. New ambitious efforts like the Human Cell Atlas will soon lead to an explosion of data with potential for uncovering and understanding the regulatory networks which underlie the behaviour of all human cells. In order to take advantage of this data, however, there is a need for general-purpose, user-friendly and efficient computational tools that can be readily used by biologists who do not have specialist computer science knowledge. Results: The Single Cell Network Synthesis toolkit (SCNS) is a general-purpose computational tool for the reconstruction and analysis of executable models from single-cell gene expression data. Through a graphical user interface, SCNS takes single-cell qPCR or RNA-sequencing data collected across a time course, and searches for logical rules that drive transitions from early cell states towards late cell states. Because the resulting reconstructed models are executable, they can be used to make predictions about the effect of specific gene perturbations on the generation of specific lineages. Conclusions: SCNS should be of broad interest to the growing number of researchers working in single-cell genomics and will help facilitate the generation of valuable mechanistic insights into developmental, homeostatic and disease processes. Funding: Research in the Gottgens lab is supported by infrastructure support funding from the Wellcome Trust to the Wellcome Trust and MRC Cambridge Stem Cell Institute. Steven Woodhouse is a postdoctoral researcher supported by Microsoft Research.
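
To make "executable" concrete, the sketch below runs a toy asynchronous Boolean model and asks whether a late marker gene can still switch on after a gene knockout. The four genes, their rules and the helper names are hypothetical and are not taken from SCNS.

```python
from collections import deque

# Hypothetical update rules: each maps the full state (a dict) to a gene's next value.
RULES = {
    "A": lambda s: s["A"],                 # input gene keeps its value
    "B": lambda s: s["A"] and not s["C"],  # A activates B, C represses B
    "C": lambda s: s["C"],                 # input gene keeps its value
    "D": lambda s: s["B"],                 # B activates D
}
GENES = sorted(RULES)


def successors(state, knocked_out=frozenset()):
    """Asynchronous semantics: each successor updates exactly one gene."""
    s = dict(zip(GENES, state))
    for i, gene in enumerate(GENES):
        new = False if gene in knocked_out else bool(RULES[gene](s))
        if new != state[i]:
            yield state[:i] + (new,) + state[i + 1:]


def reachable(start, knocked_out=frozenset()):
    """Breadth-first search over all states reachable from `start`."""
    start = tuple(False if g in knocked_out else v for g, v in zip(GENES, start))
    seen, queue = {start}, deque([start])
    while queue:
        for nxt in successors(queue.popleft(), knocked_out):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


early = (True, False, False, False)  # A on, everything else off
d_on_wild = any(s[3] for s in reachable(early))
d_on_ko = any(s[3] for s in reachable(early, frozenset({"B"})))
print("D reachable (wild type):", d_on_wild)
print("D reachable (B knockout):", d_on_ko)
```

In this toy model D depends on B, so knocking out B makes every D-on state unreachable. That is the kind of perturbation prediction the abstract describes, reduced here to a reachability query.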

    Revealing the vectors of cellular identity with single-cell genomics

    Single-cell genomics has now made it possible to create a comprehensive atlas of human cells. At the same time, it has reopened definitions of a cell's identity and of the ways in which identity is regulated by the cell's molecular circuitry. Emerging computational analysis methods, especially in single-cell RNA sequencing (scRNA-seq), have already begun to reveal, in a data-driven way, the diverse simultaneous facets of a cell's identity, from discrete cell types to continuous dynamic transitions and spatial locations. These developments will eventually allow a cell to be represented as a superposition of 'basis vectors', each determining a different (but possibly dependent) aspect of cellular organization and function. However, computational methods must also overcome considerable challenges, from handling technical noise and data scale to forming new abstractions of biology. As the scale of single-cell experiments continues to increase, new computational approaches will be essential for constructing and characterizing a reference map of cell identities. Funding: National Institutes of Health (U.S.) (grant P50 HG006193); BRAIN Initiative (grant U01 MH105979); National Institutes of Health (U.S.) (BRAIN grant 1U01MH105960-01); National Cancer Institute (U.S.) (grant 1U24CA180922); National Institute of Allergy and Infectious Diseases (U.S.) (grant 1U24AI118672-01)

    Model checking the evolution of gene regulatory networks

    The behaviour of gene regulatory networks (GRNs) is typically analysed using simulation-based statistical testing-like methods. In this paper, we demonstrate that we can replace this approach by a formal verification-like method that gives higher assurance and scalability. We focus on Wagner’s weighted GRN model with varying weights, which is used in evolutionary biology. In the model, weight parameters represent the gene interaction strength that may change due to genetic mutations. For a property of interest, we synthesise the constraints over the parameter space that represent the set of GRNs satisfying the property. We experimentally show that our parameter synthesis procedure computes the mutational robustness of GRNs, an important problem of interest in evolutionary biology, more efficiently than the classical simulation method. We specify the property in linear temporal logic. We employ symbolic bounded model checking and SMT solving to compute the space of GRNs that satisfy the property, which amounts to synthesising a set of linear constraints on the weights.
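
For contrast with the formal method, here is a minimal sketch of the simulation-style baseline the abstract refers to. It assumes the common formulation of Wagner's model in which each gene's next state is the sign of the weighted sum of the current states, and it estimates robustness by sampling single-weight mutations. The function names, the mutation scheme (replacing one weight with a Gaussian draw) and the property checked (reaching a given fixed point) are illustrative assumptions, not the paper's procedure.

```python
import random


def sign(x):
    return 1 if x > 0 else -1 if x < 0 else 0


def step(weights, state):
    """One synchronous update: s_i <- sign(sum_j w_ij * s_j)."""
    return tuple(sign(sum(w * s for w, s in zip(row, state))) for row in weights)


def reaches_fixed_point(weights, state, target, max_steps=20):
    """True if the trajectory from `state` settles on `target` within max_steps."""
    for _ in range(max_steps):
        nxt = step(weights, state)
        if nxt == state:
            return state == target
        state = nxt
    return False


def robustness(weights, start, target, trials=1000, seed=0):
    """Fraction of sampled single-weight mutants that still reach `target`."""
    rng = random.Random(seed)
    n, ok = len(weights), 0
    for _ in range(trials):
        mutant = [list(row) for row in weights]
        i, j = rng.randrange(n), rng.randrange(n)
        mutant[i][j] = rng.gauss(0.0, 1.0)  # assumed mutation model
        ok += reaches_fixed_point(mutant, start, target)
    return ok / trials


# Hypothetical two-gene network: gene 0 activates itself and gene 1.
W = [[1.0, 0.0], [1.0, 0.0]]
start, target = (1, -1), (1, 1)
estimate = robustness(W, start, target)
print("estimated mutational robustness:", estimate)
```

A sampling estimate like this only approximates the fraction of robust mutants. The approach in the abstract instead synthesises constraints on the weights that characterise the whole satisfying set.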

    Tools and techniques for multi-valued networks using rewriting logic

    PhD Thesis. Multi-valued networks (MVNs) are an important, widely used qualitative modelling technique where time and states are discrete. MVNs extend the well-known Boolean networks, providing a more powerful qualitative modelling approach for biological systems by allowing an entity’s state to range over a discrete set of values instead of just 0 and 1. They provide a logical framework for qualitatively modelling and analysing control systems and have been successfully applied to biological systems and circuit design. While a range of support tools for developing and analysing MVNs exist, more work is needed to develop tools to support the practical applications of those techniques. One of the frameworks that has been successfully applied to biological systems is Rewriting Logic (RL), an algebraic specification framework that is capable of modelling and analysing the behaviour of dynamic, concurrent systems. The flexibility of RL techniques, such as the implementation of strategies, has allowed it to be successfully used to model a wide range of different formalisms and systems, such as process algebras, Petri nets, and biological systems. RL specification, programming and computation are supported by a range of powerful analysis tools, which was one of the motivations for choosing RL. We choose Maude as the tool for our work; it is a high-performance reflective language supporting both equational and RL specification. Maude is used throughout this thesis to model and analyse a range of MVNs using RL. In this thesis we aim to investigate the application of RL to modelling and analysing both synchronous and asynchronous MVNs, thus enabling the application of support tools available for RL. We start by constructing an RL model for MVNs using a translation approach that translates an MVN’s set of equations into rewrite rules. We formally show that our translation approach is correct by proving its soundness and completeness. 
We illustrate the techniques and the developed RL framework for MVNs by presenting a range of case studies that demonstrate its practical application. We then introduce an artificial, scalable MVN model in order to allow a range of model sizes to be considered, and we investigate the performance of our RL framework. Finally, we analyse a larger regulatory network from the literature using our RL framework to give some insight into how it copes with a larger case study. Funding: Ministry of Higher Education in Saudi Arabia.
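
The difference between synchronous and asynchronous MVN semantics can be sketched with a toy example. It is written in Python rather than Maude, and the two entities and their next-state equations are hypothetical. Each computed transition plays the role that a single rewrite-rule application would play in an RL model.

```python
# Hypothetical two-entity MVN; each entity takes a value in {0, 1, 2}.
RULES = {
    "x": lambda s: s["y"],            # x copies y
    "y": lambda s: (s["x"] + 1) % 3,  # y is driven one level past x, wrapping
}


def sync_step(state):
    """Synchronous semantics: every entity is updated at once."""
    return {entity: rule(state) for entity, rule in RULES.items()}


def async_successors(state):
    """Asynchronous semantics: one successor per entity whose value would change."""
    out = []
    for entity, rule in RULES.items():
        value = rule(state)
        if value != state[entity]:
            out.append({**state, entity: value})
    return out


start = {"x": 1, "y": 0}
print("synchronous: ", sync_step(start))
print("asynchronous:", async_successors(start))
```

From the same state, the synchronous update yields a single successor while the asynchronous update branches. This is why the two semantics are modelled and analysed separately.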

    Infobiotics : computer-aided synthetic systems biology

    Until very recently Systems Biology has, despite its stated goals, been too reductive in terms of the models being constructed and the methods used have been, on the one hand, unsuited for large scale adoption or integration of knowledge across scales, and on the other hand, too fragmented. The thesis of this dissertation is that better computational languages and seamlessly integrated tools are required by systems and synthetic biologists to enable them to meet the significant challenges involved in understanding life as it is, and by designing, modelling and manufacturing novel organisms, to understand life as it could be. We call this goal, where everything necessary to conduct model-driven investigations of cellular circuitry and emergent effects in populations of cells is available without significant context-switching, “one-pot” in silico synthetic systems biology in analogy to “one-pot” chemistry and “one-pot” biology. Our strategy is to increase the understandability and reusability of models and experiments, thereby avoiding unnecessary duplication of effort, with practical gains in the efficiency of delivering usable prototype models and systems. Key to this endeavour are graphical interfaces that assist novice users by hiding complexity of the underlying tools and limiting choices to only what is appropriate and useful, thus ensuring that the results of in silico experiments are consistent, comparable and reproducible. This dissertation describes the conception, software engineering and use of two novel software platforms for systems and synthetic biology: the Infobiotics Workbench for modelling, in silico experimentation and analysis of multi-cellular biological systems; and DNA Library Designer with the DNALD language for the compact programmatic specification of combinatorial DNA libraries, as the first stage of a DNA synthesis pipeline, enabling methodical exploration of biological problem spaces. 
Infobiotics models are formalised as Lattice Population P systems, a novel framework for the specification of spatially-discrete and multi-compartmental rule-based models, imbued with a stochastic execution semantics. This framework was developed to meet the needs of real systems biology problems: hormone transport and signalling in the root of Arabidopsis thaliana, and quorum sensing in the pathogenic bacterium Pseudomonas aeruginosa. Our tools have also been used to prototype a novel synthetic biological system for pattern formation that has been successfully implemented in vitro. Taken together these novel software platforms provide a complete toolchain, from design to wet-lab implementation, of synthetic biological circuits, enabling a step change in the scale of biological investigations that is orders of magnitude greater than could previously be performed in one in silico “pot”.
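
As an illustration of what a stochastic execution semantics for rule-based models can look like, the sketch below runs a Gillespie-style stochastic simulation in a single compartment. The choice of Gillespie's direct method, the rule encoding and the example rule are assumptions made for illustration. The abstract does not specify the algorithm, and the lattice and multi-compartment aspects are not modelled here.

```python
import math
import random


def gillespie(counts, rules, t_end, seed=0):
    """Simulate rules given as (rate, reactants, change) until t_end or exhaustion."""
    rng = random.Random(seed)
    counts = dict(counts)
    t = 0.0
    while True:
        # Mass-action propensity: rate times the product of reactant counts.
        props = [rate * math.prod(counts[r] for r in reactants)
                 for rate, reactants, _ in rules]
        total = sum(props)
        if total == 0:
            break  # no rule can fire
        t += rng.expovariate(total)  # waiting time to the next event
        if t > t_end:
            break
        # Choose which rule fires, with probability proportional to its propensity.
        _, _, change = rng.choices(rules, weights=props)[0]
        for species, delta in change.items():
            counts[species] += delta
    return counts


# Hypothetical rule: A -> B with rate constant 1.0.
rules = [(1.0, ["A"], {"A": -1, "B": +1})]
final = gillespie({"A": 50, "B": 0}, rules, t_end=1000.0)
print(final)
```

A multi-compartment model would keep one such state per lattice position and add rules that move molecules between neighbouring positions.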
