    Speeding up systems biology simulations of biochemical pathways using Condor

    This is the accepted version of the following article: "Speeding up Systems Biology Simulations of Biochemical Pathways using Condor", Concurrency and Computation: Practice and Experience, Volume 26, Issue 17, pages 2727–2742, 10 December 2014, which has been published in final form at http://onlinelibrary.wiley.com/doi/10.1002/cpe.3161/abstract. Systems biology is a scientific field that uses computational modelling to study biological and biochemical systems. The simulation and analysis of models of these systems typically explore behaviour over a wide range of parameter values; as such, they are usually characterised by the need for nontrivial amounts of computing power. Grid computing provides access to such computational resources. In previous research, we created the grid-enabled biochemical networks simulation environment to speed up systems biology simulations over a grid (the UK National Grid Service and ScotGrid). Following on from this work, we have created the simulation modelling of the epidermal growth factor receptor microtubule-associated protein kinase pathway (SIMAP) utility, a standalone simulation tool dedicated to the modelling and analysis of the epidermal growth factor receptor microtubule-associated protein kinase pathway. This builds on experience with the biochemical networks simulation environment by decoupling the simulation modelling elements from the grid middleware. The new utility enables us to interface with different grid technologies. This paper therefore describes the new SIMAP utility and an empirical investigation of its performance when deployed over a desktop grid based on the high throughput computing middleware Condor. We present our results based on a case study with a model of the mammalian ErbB signalling pathway, a pathway strongly linked to cancer.
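
The abstract above describes simulations that explore a wide range of parameter values. Because each point in such a sweep is independent of the others, the sweep can be cut into batches that run on separate machines. The following is a minimal sketch of that idea only; it is not the SIMAP implementation and does not use any Condor interface. The parameter names (`k_bind`, `k_phos`), their values, and the helper functions are hypothetical.

```python
from itertools import product


def build_sweep(param_ranges):
    """Return one dict per point of the full Cartesian parameter grid."""
    names = sorted(param_ranges)
    return [
        dict(zip(names, values))
        for values in product(*(param_ranges[name] for name in names))
    ]


def split_into_jobs(points, n_jobs):
    """Deal the parameter points round-robin into n_jobs independent batches."""
    return [points[i::n_jobs] for i in range(n_jobs)]


# Hypothetical rate constants; names and values are illustrative only.
ranges = {"k_bind": [0.1, 1.0, 10.0], "k_phos": [0.5, 5.0]}
points = build_sweep(ranges)
jobs = split_into_jobs(points, n_jobs=4)

for job_id, batch in enumerate(jobs):
    # Each batch could be handed to a separate worker; no batch
    # depends on the result of any other.
    print(job_id, len(batch))
```

In a high throughput setting, each batch would typically be written to its own input file and submitted as a separate job, so throughput scales with the number of idle machines rather than with the speed of any single one.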

    Designing Predictive Mathematical Models for the Metabolic Pathways Associated with Polyhydroxybutyrate Synthesis in Escherichia coli

    Polyhydroxybutyrate (PHB) is a polyhydroxyalkanoate that has been extensively studied as a potential biodegradable replacement for petrochemically derived plastics. The synthesis pathway of PHB is native to Ralstonia eutropha, but the genes for the PHB pathway have successfully been introduced into Escherichia coli through plasmids such as the pBHR68 plasmid. However, the production of PHB needs to be more cost-effective before it can be commercially produced. A mathematical model for PHB synthesis was developed to identify target genes that could be genetically engineered to increase PHB production. The major metabolic pathways included in the model were glycolysis, acetyl coenzyme A (acetyl-CoA) synthesis, tricarboxylic acid (TCA) cycle, glyoxylate bypass, and PHB synthesis. Each reaction in the selected metabolic pathways was modeled using the kinetic mechanism identified for the associated enzyme. The promoters and transcription factors for each enzyme were incorporated into the model. The model was validated through comparison with other published models and experimental PHB production data. The predictive model identified 16 enzymes as having no effect on PHB production, 5 enzymes with a slight effect on PHB production, and 9 enzymes with large effects on PHB production. Decreasing the substrate affinity of the enzyme citrate synthase resulted in the largest increase in PHB synthesis. The second largest increase was observed from lowering the substrate affinity of glyceraldehyde-3-phosphate dehydrogenase. The predictive model also indicated that increasing the activity of the lac promoter in the pBHR68 plasmid resulted in the largest increase in the rate of PHB production. The predictive model successfully identified two genes and one promoter as targets for genetic engineering to create an optimized strain of E. coli for PHB production. 
The substrate-binding sites of the enzymes encoded by the genes gltA (citrate synthase) and gapA (glyceraldehyde-3-phosphate dehydrogenase) should be genetically engineered to bind their substrates less effectively. The lac promoter in the pBHR68 plasmid should be genetically engineered to more closely match the consensus sequence for binding to RNA polymerase. The model predicts that an optimized strain of E. coli for PHB production could be achieved by genetically altering gltA, gapA, and the lac promoter.
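
One way to see why lowering the substrate affinity of citrate synthase could favour PHB synthesis is that both routes draw on the same acetyl-CoA pool. The toy sketch below assumes two irreversible Michaelis-Menten enzymes competing for one substrate held at a fixed concentration. That is a strong simplification and not the authors' model, which uses enzyme-specific kinetic mechanisms and gene regulation. All parameter values and function names here are hypothetical.

```python
def michaelis_menten(vmax, km, s):
    """Irreversible Michaelis-Menten rate: v = Vmax * S / (Km + S)."""
    return vmax * s / (km + s)


def phb_flux_fraction(s, km_competitor, km_phb=0.5, vmax=1.0):
    """Fraction of substrate consumption that goes to the PHB branch.

    s             -- shared substrate concentration (assumed constant)
    km_competitor -- Km of the competing enzyme; a higher Km means
                     lower substrate affinity
    km_phb, vmax  -- illustrative values, not taken from the source
    """
    v_competitor = michaelis_menten(vmax, km_competitor, s)
    v_phb = michaelis_menten(vmax, km_phb, s)
    return v_phb / (v_competitor + v_phb)


# Raising the competitor's Km (lowering its affinity) shifts flux to PHB.
for km in (0.1, 1.0, 10.0):
    print(km, round(phb_flux_fraction(s=0.2, km_competitor=km), 3))
```

A full kinetic model would also let the substrate pool respond to the change, so this sketch indicates only the direction of the effect, not its size.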

    Infobiotics : computer-aided synthetic systems biology

    Until very recently, Systems Biology has, despite its stated goals, been too reductive in terms of the models being constructed, and the methods used have been, on the one hand, unsuited to large-scale adoption or the integration of knowledge across scales and, on the other hand, too fragmented. The thesis of this dissertation is that better computational languages and seamlessly integrated tools are required by systems and synthetic biologists to enable them to meet the significant challenges involved in understanding life as it is, and by designing, modelling and manufacturing novel organisms, to understand life as it could be. We call this goal, where everything necessary to conduct model-driven investigations of cellular circuitry and emergent effects in populations of cells is available without significant context-switching, “one-pot” in silico synthetic systems biology in analogy to “one-pot” chemistry and “one-pot” biology. Our strategy is to increase the understandability and reusability of models and experiments, thereby avoiding unnecessary duplication of effort, with practical gains in the efficiency of delivering usable prototype models and systems. Key to this endeavour are graphical interfaces that assist novice users by hiding the complexity of the underlying tools and limiting choices to only what is appropriate and useful, thus ensuring that the results of in silico experiments are consistent, comparable and reproducible. This dissertation describes the conception, software engineering and use of two novel software platforms for systems and synthetic biology: the Infobiotics Workbench for modelling, in silico experimentation and analysis of multi-cellular biological systems; and DNA Library Designer with the DNALD language for the compact programmatic specification of combinatorial DNA libraries, as the first stage of a DNA synthesis pipeline, enabling methodical exploration of biological problem spaces.
Infobiotics models are formalised as Lattice Population P systems, a novel framework for the specification of spatially-discrete and multi-compartmental rule-based models, imbued with a stochastic execution semantics. This framework was developed to meet the needs of real systems biology problems: hormone transport and signalling in the root of Arabidopsis thaliana, and quorum sensing in the pathogenic bacterium Pseudomonas aeruginosa. Our tools have also been used to prototype a novel synthetic biological system for pattern formation that has been successfully implemented in vitro. Taken together, these novel software platforms provide a complete toolchain, from design to wet-lab implementation, of synthetic biological circuits, enabling a step change in the scale of biological investigations that is orders of magnitude greater than could previously be performed in one in silico “pot”.
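
To give a concrete sense of what a stochastic execution semantics over discrete compartments can look like, here is a minimal sketch of Gillespie's direct method applied to molecules hopping between two neighbouring compartments. It is a generic illustration and not the Lattice Population P system formalism or the Infobiotics Workbench simulator. The compartment names, the rate constant `K_HOP`, and the function `gillespie` are assumptions made for this example.

```python
import random


def gillespie(state, reactions, t_end, seed=0):
    """Simulate with Gillespie's direct method until time t_end.

    state     -- dict mapping species name to molecule count
    reactions -- list of (propensity_function, change_dict) pairs
    Returns the final state as a new dict.
    """
    rng = random.Random(seed)
    state = dict(state)
    t = 0.0
    while True:
        propensities = [rate(state) for rate, _ in reactions]
        total = sum(propensities)
        if total <= 0:
            break  # nothing can fire any more
        t += rng.expovariate(total)  # waiting time to the next event
        if t > t_end:
            break
        # Pick one reaction with probability proportional to its propensity.
        threshold = rng.random() * total
        running = 0.0
        for a, (_, change) in zip(propensities, reactions):
            running += a
            if threshold < running:
                for species, delta in change.items():
                    state[species] += delta
                break
    return state


K_HOP = 0.1  # hypothetical per-molecule transport rate
reactions = [
    (lambda s: K_HOP * s["cell_0"], {"cell_0": -1, "cell_1": +1}),
    (lambda s: K_HOP * s["cell_1"], {"cell_1": -1, "cell_0": +1}),
]
final = gillespie({"cell_0": 100, "cell_1": 0}, reactions, t_end=50.0)
print(final)
```

Transport only moves molecules between compartments, so the total count is conserved in every run, while the split between the two compartments varies with the random seed.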

    Rule-based modeling of biochemical systems with BioNetGen

    Rule-based modeling involves the representation of molecules as structured objects and molecular interactions as rules for transforming the attributes of these objects. The approach is notable in that it allows one to systematically incorporate site-specific details about protein-protein interactions into a model for the dynamics of a signal-transduction system, but the method has other applications as well, such as following the fates of individual carbon atoms in metabolic reactions. The consequences of protein-protein interactions are difficult to specify and track with a conventional modeling approach because of the large number of protein phosphoforms and protein complexes that these interactions potentially generate. Here, we focus on how a rule-based model is specified in the BioNetGen language (BNGL) and how a model specification is analyzed using the BioNetGen software tool. We also discuss new developments in rule-based modeling that should enable the construction and analyses of comprehensive models for signal transduction pathways and similarly large-scale models for other biochemical systems. Key Words: Computational systems biology; mathematical modeling; combinatorial complexity; software; formal languages; stochastic simulation; ordinary differential equations; protein-protein interactions; signal transduction; metabolic networks.
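
The combinatorial complexity mentioned above can be illustrated with a small sketch in which a protein is a structured object (a tuple of site states) and a single rule, "any unphosphorylated site may become phosphorylated", is applied repeatedly to generate the reaction network. This is plain Python written for illustration. It is not BNGL syntax and does not reproduce the BioNetGen network-generation algorithm, and the function names are hypothetical.

```python
from collections import deque


def apply_phosphorylation_rule(species):
    """Yield each product of phosphorylating one unphosphorylated site.

    A species is a tuple of site states: "U" (unphosphorylated)
    or "P" (phosphorylated).
    """
    for i, site in enumerate(species):
        if site == "U":
            yield species[:i] + ("P",) + species[i + 1:]


def generate_network(seed_species):
    """Apply the rule breadth-first until no new species appear."""
    seen = {seed_species}
    reactions = []
    queue = deque([seed_species])
    while queue:
        reactant = queue.popleft()
        for product in apply_phosphorylation_rule(reactant):
            reactions.append((reactant, product))
            if product not in seen:
                seen.add(product)
                queue.append(product)
    return seen, reactions


# A hypothetical protein with four independent phosphorylation sites.
species, reactions = generate_network(("U",) * 4)
print(len(species), len(reactions))
```

For a protein with n independent sites, this single rule expands to 2^n phosphoforms and n * 2^(n-1) reactions, which is why writing each reaction out by hand quickly becomes impractical.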

    Methods for construction and analysis of computational models in systems biology: applications to the modelling of the heat shock response and the self-assembly of intermediate filaments

    Systems biology is a new, emerging and rapidly developing, multidisciplinary research field that aims to study biochemical and biological systems from a holistic perspective, with the goal of providing a comprehensive, system-level understanding of cellular behaviour. In this way, it addresses one of the greatest challenges faced by contemporary biology, which is to comprehend the function of complex biological systems. Systems biology combines various methods that originate from scientific disciplines such as molecular biology, chemistry, engineering sciences, mathematics, computer science and systems theory. Systems biology, unlike “traditional” biology, focuses on high-level concepts such as network, component, robustness, efficiency, control, regulation, hierarchical design, synchronization, concurrency, and many others. The very terminology of systems biology is “foreign” to “traditional” biology; it marks a drastic shift in the research paradigm and indicates the close linkage of systems biology to computer science. One of the basic tools utilized in systems biology is the mathematical modelling of life processes, tightly linked to experimental practice. The studies contained in this thesis revolve around a number of challenges commonly encountered in computational modelling in systems biology. The research comprises the development and application of a broad range of methods originating in the fields of computer science and mathematics for the construction and analysis of computational models in systems biology. In particular, the research is set up in the context of two biological phenomena chosen as modelling case studies: 1) the eukaryotic heat shock response and 2) the in vitro self-assembly of intermediate filaments, one of the main constituents of the cytoskeleton.
The range of presented approaches spans from heuristic, through numerical and statistical, to analytical methods applied in the effort to formally describe and analyse the two biological processes. We note, however, that although applied to certain case studies, the presented methods are not limited to them and can be utilized in the analysis of other biological mechanisms as well as complex systems in general. The full range of developed and applied modelling techniques, as well as model analysis methodologies, constitutes a rich modelling framework. Moreover, the presentation of the developed methods, their application to the two case studies and the discussions concerning their potential and limitations point to the difficulties and challenges one encounters in the computational modelling of biological systems. The problems of model identifiability, model comparison, model refinement, model integration and extension, the choice of the proper modelling framework and level of abstraction, and the choice of the proper scope of the model run through this thesis.