987 research outputs found

    Evolution of Metabolic Networks: A Computational Framework

    Background: The metabolic architectures of extant organisms share many key pathways such as the citric acid cycle, glycolysis, or the biosynthesis of most amino acids. Several competing hypotheses for the evolutionary mechanisms that shape metabolic networks have been discussed in the literature, each of which finds support from comparative analysis of extant genomes. Alternatively, the principles of metabolic evolution can be studied by direct computer simulation. This requires, however, an explicit implementation of all pertinent components: a universe of chemical reactions upon which the metabolism is built, an explicit representation of the enzymes that implement the metabolism, a genetic system that encodes these enzymes, and a fitness function that can be selected for. Results: We describe here a simulation environment that implements all these components in a simplified way so that large-scale evolutionary studies are feasible. We employ an artificial chemistry that views chemical reactions as graph rewriting operations and utilizes a toy version of quantum chemistry to derive thermodynamic parameters. Minimalist organisms with simple string-encoded genomes produce model ribozymes whose catalytic activity is determined by an ad hoc mapping between their secondary structure and the transition state graphs that they stabilize. Fitness is computed utilizing the ideas of metabolic flux analysis. We present an implementation of the complete system and first simulation results. Conclusions: The simulation system presented here allows coherent investigations into the evolutionary mechanisms of the first steps of metabolic evolution using a self-consistent toy universe.
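
The idea of treating a chemical reaction as a graph rewriting operation can be illustrated with a minimal sketch. This is not the authors' implementation: the bond-dictionary representation, the `apply_rewrite` helper, and the toy cycloaddition rule are all assumptions made for illustration, and hydrogens are omitted.

```python
from typing import Dict, FrozenSet

Bond = FrozenSet[int]        # an unordered pair of atom indices
Graph = Dict[Bond, int]      # maps each bond to its bond order


def bond(i: int, j: int) -> Bond:
    """Build an unordered bond key from two atom indices."""
    return frozenset((i, j))


def apply_rewrite(bonds: Graph, changes: Dict[Bond, int]) -> Graph:
    """Return a new bond graph with each listed bond order shifted by its delta."""
    result = dict(bonds)
    for key, delta in changes.items():
        order = result.get(key, 0) + delta
        if order < 0:
            raise ValueError(f"bond {sorted(key)} would get a negative order")
        if order == 0:
            result.pop(key, None)   # order zero means the bond is broken
        else:
            result[key] = order
    return result


# Hypothetical example: a cycloaddition between a diene (atoms 0-3)
# and an alkene (atoms 4-5), carbon skeleton only.
reactants: Graph = {bond(0, 1): 2, bond(1, 2): 1, bond(2, 3): 2, bond(4, 5): 2}
rule = {
    bond(0, 1): -1, bond(2, 3): -1, bond(4, 5): -1,   # double bonds weakened
    bond(1, 2): +1, bond(3, 4): +1, bond(5, 0): +1,   # bonds strengthened or formed
}
product = apply_rewrite(reactants, rule)
```

Because the rule only shifts bond orders and its deltas sum to zero, the total bond order is conserved, which is one simple consistency check a rewriting-based chemistry can enforce.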

    Search-based Software Testing Driven by Automatically Generated and Manually Defined Fitness Functions

    Search-based software testing (SBST) typically relies on fitness functions to guide the search exploration toward software failures. There are two main techniques to define fitness functions: (a) automated fitness function computation from the specification of the system requirements and (b) manual fitness function design. Both techniques have advantages. The former uses information from the system requirements to guide the search toward portions of the input domain that are more likely to contain failures. The latter uses the engineers' domain knowledge. We propose ATheNA, a novel SBST framework that combines fitness functions that are automatically generated from requirements specifications and manually defined by engineers. We design and implement ATheNA-S, an instance of ATheNA that targets Simulink models. We evaluate ATheNA-S by considering a large set of models and requirements from different domains. We compare our solution with an SBST baseline tool that supports automatically generated fitness functions, and another one that supports manually defined fitness functions. Our results show that ATheNA-S generates more failure-revealing test cases than the baseline tools and that the difference between the runtime performance of ATheNA-S and the baseline tools is not statistically significant. We also assess whether ATheNA-S could generate failure-revealing test cases when applied to a large case study from the automotive domain. Our results show that ATheNA-S successfully revealed a requirement violation in our case study.
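
One way to picture the combination of an automatically derived and a manually defined fitness function is a weighted sum, sketched below. This is an assumption for illustration only: the abstract does not say how ATheNA combines the two, and `simulate`, the speed requirement, the weight, and all function names are hypothetical. Lower values mean "closer to a failure".

```python
from typing import Callable, List, Sequence

Fitness = Callable[[Sequence[float]], float]


def simulate(throttle: Sequence[float]) -> List[float]:
    """Hypothetical stand-in for a Simulink model: first-order speed response."""
    speed, trace = 0.0, []
    for u in throttle:
        speed = 0.9 * speed + 15.0 * u
        trace.append(speed)
    return trace


def automatic_fitness(throttle: Sequence[float]) -> float:
    """Normalized robustness of the requirement 'speed < 120'; negative if violated."""
    return (120.0 - max(simulate(throttle))) / 120.0


def manual_fitness(throttle: Sequence[float]) -> float:
    """Engineer's hunch (assumed): sustained high throttle is likely to cause failures."""
    return 1.0 - sum(throttle) / len(throttle)


def combine_fitness(automatic: Fitness, manual: Fitness, weight: float = 0.5) -> Fitness:
    """Weighted sum of the two fitness functions (one possible combination)."""
    def combined(test_input: Sequence[float]) -> float:
        return weight * automatic(test_input) + (1.0 - weight) * manual(test_input)
    return combined


combined = combine_fitness(automatic_fitness, manual_fitness)
full_throttle = [1.0] * 30
gentle_throttle = [0.2] * 30
```

A search algorithm minimizing `combined` would prefer `full_throttle` over `gentle_throttle`, since both the requirement-based and the hand-written component point toward it.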

    Computational Studies on the Evolution of Metabolism

    Throughout evolution, living organisms have developed desirable properties, such as the ability to maintain functionality despite changes in the environment or their inner structure, the formation of functional modules, from metabolic pathways to organs, and, most essentially, the capacity to adapt and evolve through natural selection. The metabolic networks of modern organisms show that many key pathways, such as the citric acid cycle, glycolysis, or the biosynthesis of most amino acids, are common to all of them. Understanding the evolutionary mechanisms behind this development of complex biological systems is an intriguing and important task of current research in biology as well as artificial life. Several competing hypotheses for the formation of metabolic pathways and the mechanisms that shape metabolic networks have been discussed in the literature, each of which finds support from comparative analysis of extant genomes. However, while these traditional methods are powerful tools for investigating metabolic evolution, they do not allow one to look back far enough in evolution, to the time when metabolism had to emerge and evolve into the form we observe today. To this end, simulation studies have been introduced to discover the principles of metabolic evolution and the sources of emergent metabolic properties. These approaches differ considerably in the realism and explicitness of the underlying models. A difficult trade-off between realism and computational feasibility has to be made, and further modeling decisions on many scales have to be taken into account, requiring the combination of knowledge from different fields such as chemistry, physics, biology and, last but not least, computer science. In this thesis, a novel computational model for the in silico evolution of early metabolism is introduced.
It comprises all the components on different scales needed to resemble a situation of evolving metabolic protocells in an RNA world. Accordingly, the model contains a minimal RNA-based genetics and an evolving metabolism of catalytic ribozymes that manipulate a rich underlying chemistry. To allow the metabolic organization to escape from the confines of the chemical space set by the initial conditions of the simulation and, more generally, to permit open-ended evolution, an evolvable sequence-to-function map is used. At the heart of the metabolic subsystem is a graph-based artificial chemistry equipped with built-in thermodynamics. The generation of the metabolic reaction network is realized as a rule-based stochastic simulation. The necessary reaction rates are calculated from the chemical graphs of the reactants on the fly. The selection procedure among the population of protocells is based on the optimal metabolic yield of the protocells, which is computed using flux balance analysis. The introduced computational model allows for profound investigations of the evolution of early metabolism and the underlying evolutionary mechanisms. One application in this thesis is the study of the formation of metabolic pathways. To this end, four established hypotheses, namely backward evolution, forward evolution, patchwork evolution and the shell hypothesis, are discussed within the scope of this in silico evolution study. The metabolic pathways of the networks, evolved in various simulation runs, are determined and analyzed in terms of their evolutionary direction. The simulation results suggest that the seemingly mutually exclusive hypotheses may well be compatible when considering that different processes dominate different phases in the evolution of a metabolic system. Further, it is found that forward evolution shapes the metabolic network in the very early steps of evolution.
In later and more complex stages, enzyme recruitment supersedes forward evolution, keeping a core set of pathways from the early phase. Backward evolution can only be observed under conditions of steady environmental change. Additionally, the evolutionary histories of enzymes and metabolites were studied on the network level as well as for single instances, showing a great variety of evolutionary mechanisms at work. The second major focus of the in silico evolutionary study is the emergence of complex system properties, such as robustness and modularity. To this end, several techniques for analyzing the metabolic systems were used. The measures for complex properties stem from the fields of graph theory, steady state analysis and neutral network theory. Some are used in general network analysis and others were developed specifically for the purpose introduced in this work. To discover potential sources for the emergence of system properties, three different evolutionary scenarios were tested and compared. The first two scenarios are the same as for the first part of the investigation: one scenario of evolution under static conditions and one incorporating a steady change in the set of "food" molecules. A third scenario was added that also simulates a static evolution but with an increased mutation rate and regular events of horizontal gene transfer between protocells of the population. The comparison of all three scenarios with real-world metabolic networks shows a significant similarity in structure and properties. Among the three scenarios, the two static evolutions yield the most robust metabolic networks; however, the networks evolved under environmental change exhibit their own strategy for achieving a robustness better suited to their conditions. As expected from theory, horizontal gene transfer and changes in the environment seem to produce higher degrees of modularity in metabolism.
The two scenarios develop rather different kinds of modularity: while horizontal gene transfer provides for more isolated modules, the modules of the second scenario are far more interconnected.

    Explicit Building Block Multiobjective Evolutionary Computation: Methods and Applications

    This dissertation presents principles, techniques, and performance of evolutionary computation optimization methods. Concentration is on concepts, design formulation, and prescription for multiobjective problem solving and explicit building block (BB) multiobjective evolutionary algorithms (MOEAs). Current state-of-the-art explicit BB MOEAs are addressed in the innovative design, execution, and testing of a new explicit BB MOEA. Evolutionary computation concepts examined are algorithm convergence, population diversity and sizing, genotype and phenotype partitioning, archiving, BB concepts, parallel evolutionary algorithm (EA) models, robustness, visualization of the evolutionary process, and performance in terms of effectiveness and efficiency. The main result of this research is the development of a more robust algorithm where MOEA concepts are implicitly employed. Testing shows that the new MOEA can be more effective and efficient than previous state-of-the-art explicit BB MOEAs for selected test suite multiobjective optimization problems (MOPs) and U.S. Air Force applications. Other contributions include the extension of explicit BB definitions to clarify the meanings for good single and multiobjective BBs. A new visualization technique is developed for viewing genotype, phenotype, and the evolutionary process in finding Pareto front vectors while tracking the size of the BBs. The visualization technique is the result of a BB tracing mechanism integrated into the new MOEA that enables one to determine the required BB sizes and assign an approximate epistasis level for solving a particular problem. The culmination of this research is explicit BB state-of-the-art MOEA technology based on the MOEA design, BB classifier type assessment, solution evolution visualization, and insight into MOEA test metric validation and usage as applied to test suite, deception, bioinformatics, unmanned vehicle flight pattern, and digital symbol set design MOPs.

    Computer Science and Technology Series : XV Argentine Congress of Computer Science. Selected papers

    CACIC'09 was the fifteenth Congress in the CACIC series. It was organized by the School of Engineering of the National University of Jujuy. The Congress included 9 Workshops with 130 accepted papers, 1 main Conference, 4 invited tutorials, different meetings related to Computer Science Education (Professors, PhD students, Curricula) and an International School with 5 courses. CACIC 2009 was organized following the traditional Congress format, with 9 Workshops covering a diversity of dimensions of Computer Science Research. Each topic was supervised by a committee of three chairs from different universities. The call for papers attracted a total of 267 submissions. An average of 2.7 review reports were collected for each paper, for a grand total of 720 review reports that involved about 300 different reviewers. A total of 130 full papers were accepted and 20 of them were selected for this book. Red de Universidades con Carreras en Informática (RedUNCI)

    Engineering Novel Rhodopsins for Neuroscience

    The overarching goal of my PhD research has been engineering proteins capable of controlling and reading out neural activity to advance neuroscience research. I engineered light-gated microbial rhodopsins, primarily focusing on the algal-derived, light-gated channel channelrhodopsin (ChR), which can be used to modulate neuronal activity with light. This work has required overcoming three major challenges. First, rhodopsins are transmembrane proteins, which are inherently difficult to engineer because the sequence and structural determinants of membrane protein expression and plasma membrane localization are highly constrained and poorly understood (Chapters 3-5). Second, protein properties of interest for neuroscience applications are assayed using very low-throughput patch-clamp electrophysiology, preventing the use of the high-throughput assays required for directed evolution experiments (Chapters 2, 5-6). And third, in vivo application of these improved tools requires either retention or optimization of multiple protein properties in a single protein tool; for example, we must optimize expression and localization of these algal membrane proteins in mammalian cells while at the same time optimizing kinetic and functional properties (Chapters 5-6). These challenges restricted the field to low-throughput, conservative methods for discovery of improved ChRs, e.g., structure-guided mutagenesis and testing of natural ChR variants. I used an alternative approach: data-driven machine learning to model the fitness landscape of ChRs for different properties of interest, and application of these models to select ChR sequences with optimal combinations of properties (Chapters 5-6). ChR variants identified from this work have unprecedented conductance properties and light sensitivity that could enable non-invasive activation of populations of cells throughout the nervous system. These ChRs have the potential to change how optogenetics experiments are done.
This work is a convincing demonstration of the power of machine learning guided protein engineering for a class of proteins that present multiple engineering challenges. A component of the novel application of these new ChR tools relies on recent advances in gene delivery throughout the nervous system facilitated by engineered AAVs (Chapter 7). And finally, I developed a behavioral tracking system to monitor behavior and demonstrate sleep behavior in the jellyfish Cassiopea, the most primitive organism to have this behavior formally characterized (Chapter 8).

    Synthesis of Biological and Mathematical Methods for Gene Network Control

    Synthetic biology is an emerging field which melds genetics, molecular biology, network theory, and mathematical systems to understand, build, and predict gene network behavior. As an engineering discipline, developing a mathematical understanding of the genetic circuits being studied is of fundamental importance. In this dissertation, mathematical concepts for understanding, predicting, and controlling gene transcriptional networks are presented and applied to two synthetic gene network contexts. First, this engineering approach is used to improve the function of the guide ribonucleic acid (gRNA)-targeted, dCas9-regulated transcriptional cascades through analysis and targeted modification of the RNA transcript. In so doing, a fluorescent guide RNA (fgRNA) is developed to more clearly observe gRNA dynamics and aid design. It is shown that through careful optimization, RNA Polymerase II (Pol II) driven gRNA transcripts can be strong enough to exhibit measurable cascading behavior, previously only shown in RNA Polymerase III (Pol III) circuits. Second, inherent gene expression noise is used to achieve precise fractional differentiation of a population. Mathematical methods are employed to predict and understand the observed behavior, and metrics for analyzing and quantifying similar differentiation kinetics are presented. Through careful mathematical analysis and simulation, coupled with experimental data, two methods for achieving ratio control are presented, with the optimal schema for any application being dependent on the noisiness of the system under study. Together, these studies push the boundaries of gene network control, with potential applications in stem cell differentiation, therapeutics, and bio-production. Dissertation/Thesis: Doctoral Dissertation, Biomedical Engineering 201

    Utilising restricted for-loops in genetic programming

    Genetic programming is an approach that utilises the power of evolution to allow computers to evolve programs. While loops are natural components of most programming languages and appear in every reasonably-sized application, they are rarely used in genetic programming. This work investigates a number of restricted looping constructs to determine whether any significant benefits can be obtained in genetic programming. Possible benefits include solving problems which cannot be solved without loops, evolving smaller solutions which can be more easily understood by human programmers, and solving existing problems more quickly by using fewer evaluations. In this thesis, a number of explicit restricted loop formats were formulated and tested on the Santa Fe ant problem, a modified ant problem, a sorting problem, a visit-every-square problem and a difficult object classification problem. The experimental results showed that these explicit loops can be successfully used in genetic programming. The evolutionary process can decide when, where and how to use them. Runs with these loops tended to generate smaller solutions in fewer evaluations. Solutions with loops were found to some problems that could not be solved without loops. The results and analysis of this thesis have established that there are significant benefits in using loops in genetic programming. Restricted loops can avoid the difficulties of evolving consistent programs and the infinite iterations problem. Researchers and other users of genetic programming should not be afraid of loops.
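
How a restricted loop sidesteps the infinite iterations problem can be shown with a small interpreter sketch. The node names (`move`, `seq`, `loop`), the cap `MAX_ITERATIONS`, and the one-dimensional food trail are assumptions for illustration; they are not the loop formats defined in the thesis.

```python
MAX_ITERATIONS = 8  # assumed hard cap on any single loop's iteration count


def run(program, state):
    """Execute a program tree (nested tuples) on a simple ant-on-a-line state."""
    op = program[0]
    if op == "move":
        # Step one cell forward and eat any food found there.
        state["pos"] += 1
        state["eaten"] += state["food"].pop(state["pos"], 0)
    elif op == "seq":
        for child in program[1:]:
            run(child, state)
    elif op == "loop":
        # The evolved count is clamped, so every loop is guaranteed to terminate.
        count = max(0, min(int(program[1]), MAX_ITERATIONS))
        for _ in range(count):
            run(program[2], state)
    else:
        raise ValueError(f"unknown node type: {op!r}")
    return state


def fresh_state():
    """Six pieces of food on cells 1..6, ant starting at cell 0."""
    return {"pos": 0, "eaten": 0, "food": {i: 1 for i in range(1, 7)}}


compact = run(("loop", 6, ("move",)), fresh_state())
runaway = run(("loop", 1000, ("move",)), fresh_state())
```

The single `loop` node replaces six `move` nodes in sequence, which hints at why looping programs can be smaller, and the `runaway` program still halts after `MAX_ITERATIONS` steps despite its evolved count of 1000.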

    Probabilistic Protein Engineering

    Machine learning-guided protein engineering is a new paradigm that enables the optimization of complex protein functions. Machine-learning methods use data to predict protein function without requiring a detailed model of the underlying physics or biological pathways. They accelerate protein engineering by learning from information contained in all measured variants and using it to select variants that are likely to be improved. We begin with a review of the basics of machine learning with a focus on applications to protein engineering and protein sequence-function datasets (Chapter 1). We used the entire machine learning-guided engineering paradigm to engineer the algal-derived light-gated channel channelrhodopsin (ChR), which can be used to modulate neuronal activity with light. We build models that discover ChRs with strong plasma membrane localization in mammalian cells (Chapter 2) and unprecedented light sensitivity and photocurrents for optogenetic applications (Chapter 3). Machine learning-guided evolution requires a machine-learning model that learns the relationship between sequence and function. For machine-learning models to learn about protein sequences, protein sequences must be represented as vectors or matrices of numbers. How each protein sequence is represented determines what can be learned. We learn continuous vector encodings of sequences from patterns in unlabeled sequences (Chapter 4). Learned encodings are low-dimensional, do not require alignments, and may improve performance by transferring information in unlabeled sequences to specific prediction tasks. Alternatively, we demonstrate an interpretable Gaussian process kernel tailored to biological sequences (Chapter 6). In addition to a model to predict function from sequence, engineering requires a method to use the model to choose sequences for the next round of evolution. Most machine learning-guided engineering strategies assume that selected sequences can be queried directly.
However, in directed evolution it is common to design a library of sequences and then sample stochastic batches from that library. We propose a batched stochastic Bayesian optimization algorithm for iteratively designing and screening site-saturation mutagenesis libraries (Chapter 5).
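
The requirement that protein sequences be represented as vectors or matrices of numbers can be made concrete with the simplest such representation, a one-hot encoding. This is only a baseline sketch; it is not the learned continuous encoding the thesis describes, and the names `AMINO_ACIDS`, `INDEX`, and `one_hot` are assumptions for illustration.

```python
from typing import List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"   # the 20 standard residues, one-letter codes
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot(sequence: str) -> List[List[int]]:
    """Encode a protein sequence as a (length x 20) matrix of 0/1 entries."""
    rows = []
    for residue in sequence.upper():
        if residue not in INDEX:
            raise ValueError(f"unsupported residue: {residue!r}")
        row = [0] * len(AMINO_ACIDS)
        row[INDEX[residue]] = 1        # mark the column for this residue
        rows.append(row)
    return rows


encoded = one_hot("MKV")
```

A one-hot matrix grows with sequence length and carries no notion of similarity between residues, which is part of the motivation for low-dimensional learned encodings.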