
    The riddle of togelby

    © 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. At the 2017 Artificial and Computational Intelligence in Games meeting at Dagstuhl, Julian Togelius asked how to make spaces where every way of filling in the details yielded a good game. This study examines the possibility of enriching search spaces so that they contain very high rates of interesting objects, specifically game elements. While we do not answer the full challenge of finding good games throughout the space, this study highlights a number of potential avenues. These include naturally rich spaces, a simple technique for modifying a representation to search only rich parts of a larger search space, and representations that are highly expressive and so exhibit highly restricted and consequently enriched search spaces. We treat the creation of plausible road systems, useful graphics, highly expressive room placement for maps, generation of cavern-like maps, and combinatorial puzzle spaces. Final Accepted Version.

    Using signal processing, evolutionary computation, and machine learning to identify transposable elements in genomes

    About half of the human genome consists of transposable elements (TE's), sequences that have many copies of themselves distributed throughout the genome. All genomes, from bacterial to human, contain TE's. TE's affect genome function by either creating proteins directly or affecting genome regulation. They serve as molecular fossils, giving clues to the evolutionary history of the organism. TE's are often challenging to identify because they are fragmentary or heavily mutated. In this thesis, novel features for the detection and study of TE's are developed. These features are of two types. The first type are statistical features based on the Fourier transform used to assess reading frame use. These features measure how different the reading frame use is from that of a random sequence, which reading frames the sequence is using, and the proportion of use of the active reading frames. The second type of feature, called side effect machine (SEM) features, are generated by finite state machines augmented with counters that track the number of times the state is visited. These counters then become features of the sequence. The number of possible SEM features is super-exponential in the number of states. New methods for selecting useful feature subsets that incorporate a genetic algorithm and a novel clustering method are introduced. The features produced reveal structural characteristics of the sequences of potential interest to biologists. A detailed analysis of the genetic algorithm, its fitness functions, and its fitness landscapes is performed. The features are used, together with features used in existing exon finding algorithms, to build classifiers that distinguish TE's from other genomic sequences in humans, fruit flies, and ciliates. The classifiers achieve high accuracy (> 85%) on a variety of TE classification problems. The classifiers are used to scan large genomes for TE's. 
In addition, the features are used to describe the TE's in the newly sequenced ciliate Tetrahymena thermophila, giving biologists information for forming experimentally testable hypotheses about the role of these TE's and the mechanisms that govern them.
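
The side effect machine idea in the abstract above can be illustrated with a minimal sketch. It assumes a hypothetical two-state machine over the DNA alphabet; the transition table, the function name `sem_features`, the choice to count the state entered after each symbol, and the normalisation by sequence length are all illustrative assumptions, not details taken from the thesis.

```python
from typing import Dict, List, Tuple

# Hypothetical transition table: (current state, symbol) -> next state.
# Illustrative only: state 1 is entered after C or G, state 0 after A or T.
TRANSITIONS: Dict[Tuple[int, str], int] = {
    (0, "A"): 0, (0, "C"): 1, (0, "G"): 1, (0, "T"): 0,
    (1, "A"): 0, (1, "C"): 1, (1, "G"): 1, (1, "T"): 0,
}


def sem_features(sequence: str, n_states: int = 2, start: int = 0) -> List[float]:
    """Run the machine over the sequence and return normalised visit counts."""
    counts = [0] * n_states          # one counter per state (the "side effect")
    state = start
    for symbol in sequence:
        state = TRANSITIONS[(state, symbol)]
        counts[state] += 1           # count the state entered on this symbol
    total = len(sequence)
    # Assumed normalisation: divide by length so sequences of different
    # lengths give comparable feature vectors.
    return [c / total for c in counts] if total else [0.0] * n_states


if __name__ == "__main__":
    print(sem_features("ACGTGGCA"))
```

With this particular table the second feature is simply the fraction of C and G symbols; a machine with more states and a less regular table would give counters that depend on longer-range sequence context.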

    Digital Ecosystems: Ecosystem-Oriented Architectures

    We view Digital Ecosystems as the digital counterparts of biological ecosystems. Here, we are concerned with the creation of these Digital Ecosystems, exploiting the self-organising properties of biological ecosystems to evolve high-level software applications. Therefore, we created the Digital Ecosystem, a novel optimisation technique inspired by biological ecosystems, where the optimisation works at two levels: a first optimisation, the migration of agents distributed in a decentralised peer-to-peer network, operates continuously in time; this process feeds a second optimisation, based on evolutionary computing, that operates locally on single peers and aims to find solutions that satisfy locally relevant constraints. The Digital Ecosystem was then measured experimentally through simulations, with measures originating from theoretical ecology, evaluating its likeness to biological ecosystems. This included its responsiveness to requests for applications from the user base, as a measure of the ecological succession (ecosystem maturity). Overall, we have advanced the understanding of Digital Ecosystems, creating Ecosystem-Oriented Architectures where the word ecosystem is more than just a metaphor. Comment: 39 pages, 26 figures

    Machine Learning Guided Exploration of an Empirical Ribozyme Fitness Landscape

    Okinawa Institute of Science and Technology Graduate University. Doctor of Philosophy. The fitness landscape of a biomolecule is a representation of its activity as a function of its sequence. Properties of a fitness landscape determine how evolution proceeds. Therefore, the distribution of functional variants and, more importantly, the connectivity of these variants within the sequence space are important scientific questions. Exploration of these spaces, however, is impeded by the combinatorial explosion of the sequence space. High-throughput experimental methods have recently reduced this impediment, but only modestly. Better computational methods are needed to fully utilize the rich information from these experimental data to better understand the properties of the fitness landscape. In this work, I seek to improve this exploration process by combining data from a massively parallel experimental assay with smart library design using advanced computational techniques. I focus on an artificial RNA enzyme, or ribozyme, that can catalyze a ligation reaction between two RNA fragments. This chemistry is analogous to that of modern RNA polymerase enzymes and therefore represents an important reaction in the origin of life. In the first chapter, I discuss the background to this work in the context of the evolutionary theory of fitness landscapes and its implications in biotechnology. In chapter 2, I explore the use of processes borrowed from the field of evolutionary computation to solve optimization problems using real experimental sequence-activity data. In chapter 3, I investigate the use of supervised machine learning models to extract information on epistatic interactions from the dataset collected during multiple rounds of directed evolution. I investigate and experimentally validate the extent to which a deep learning model can be used to guide a completely computational evolutionary algorithm towards distant regions of the fitness landscape.
In the final chapter, I perform a comprehensive experimental assay of the combinatorial region explored by the deep learning-guided evolutionary algorithm. Using this dataset, I analyze higher-order epistasis and attempt to explain the increased predictability of the region sampled by the algorithm. Finally, I provide the first experimental evidence of a large RNA ‘neutral network’. Altogether, this work represents the most comprehensive experimental and computational study of the RNA ligase ribozyme fitness landscape to date, providing important insights into the evolutionary search space possibly explored during the earliest stages of life. Doctoral thesis.

    Biological evolution through mutation, selection, and drift: An introductory review

    Motivated by present activities in (statistical) physics directed towards biological evolution, we review the interplay of three evolutionary forces: mutation, selection, and genetic drift. The review addresses itself to physicists and intends to bridge the gap between the biological and the physical literature. We first clarify the terminology and recapitulate the basic models of population genetics, which describe the evolution of the composition of a population under the joint action of the various evolutionary forces. Building on these foundations, we specify the ingredients explicitly, namely, the various mutation models and fitness landscapes. We then review recent developments concerning models of mutational degradation. These predict upper limits for the mutation rate above which mutation can no longer be controlled by selection, the most important phenomena being error thresholds, Muller's ratchet, and mutational meltdowns. Error thresholds are deterministic phenomena, whereas Muller's ratchet requires the stochastic component brought about by finite population size. Mutational meltdowns additionally rely on an explicit model of population dynamics, and describe the extinction of populations. Special emphasis is put on the mutual relationship between these phenomena. Finally, a few connections with the process of molecular evolution are established. Comment: 62 pages, 6 figures, many references

    Evolution of new regulatory functions on biophysically realistic fitness landscapes

    Regulatory networks consist of interacting molecules with a high degree of mutual chemical specificity. How can these molecules evolve when their function depends on maintenance of interactions with cognate partners and simultaneous avoidance of deleterious "crosstalk" with non-cognate molecules? Although physical models of molecular interactions provide a framework in which co-evolution of network components can be analyzed, most theoretical studies have focused on the evolution of individual alleles, neglecting the network. In contrast, we study the elementary step in the evolution of gene regulatory networks: duplication of a transcription factor (TF) followed by selection for TFs to specialize their inputs as well as the regulation of their downstream genes. We show how to coarse grain the complete, biophysically realistic genotype-phenotype map for this process into macroscopic functional outcomes and quantify the probability of attaining each. We determine which evolutionary and biophysical parameters bias evolutionary trajectories towards fast emergence of new functions and show that this can be greatly facilitated by the availability of "promiscuity-promoting" mutations that affect TF specificity.

    A biophysical protein folding model accounts for most mutational fitness effects in viruses

    Fitness effects of mutations fall on a continuum ranging from lethal to deleterious to beneficial. The distribution of fitness effects (DFE) among random mutations is an essential component of every evolutionary model and a mathematical portrait of robustness. Recent experiments on five viral species all revealed a characteristic bimodal DFE, featuring peaks at neutrality and lethality. However, the phenotypic causes underlying observed fitness effects are still unknown, and are presumed to vary unpredictably from one mutation to another. By combining population genetics simulations with a simple biophysical protein folding model, we show that protein thermodynamic stability accounts for a large fraction of observed mutational effects. We assume that moderately destabilizing mutations inflict a fitness penalty proportional to the reduction in folded protein, which depends continuously on folding free energy (ΔG). Most mutations in our model affect fitness by altering ΔG, while, based on simple estimates, ≈10% abolish activity and are unconditionally lethal. Mutations pushing ΔG > 0 are also considered lethal. Contrary to neutral network theory, we find that, in mutation/selection/drift steady-state, high mutation rates (m) lead to less stable proteins and a more dispersed DFE, i.e. less mutational robustness. Small population size (N) also decreases stability and robustness. In our model, a continuum of non-lethal mutations reduces fitness by ≈2% on average, while ≈10-35% of mutations are lethal, depending on N and m. Compensatory mutations are common in small populations with high mutation rates. More broadly, we conclude that interplay between biophysical and population genetic forces shapes the DFE. Comment: Main text: 12 pages, 5 figures. Supplementary Information: 10 pages, 5 figures.
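
The dependence of fitness on folding free energy described in the abstract above can be sketched as follows. This is not the authors' model: it assumes the textbook two-state expression for the folded fraction, 1 / (1 + exp(ΔG / kT)), an assumed thermal energy of about 0.5925 kcal/mol (roughly 298 K), and hypothetical function names. The only rule taken directly from the abstract is that mutations pushing ΔG above zero are treated as lethal.

```python
import math

# Assumed thermal energy kT in kcal/mol at roughly 298 K (not from the abstract).
KT = 0.5925


def fraction_folded(delta_g: float, kt: float = KT) -> float:
    """Two-state folded fraction; more negative delta_g means more stable."""
    return 1.0 / (1.0 + math.exp(delta_g / kt))


def fitness(delta_g: float, kt: float = KT) -> float:
    """Illustrative fitness: lethal if delta_g > 0, else the folded fraction."""
    if delta_g > 0:
        return 0.0  # rule stated in the abstract: delta_g > 0 is lethal
    return fraction_folded(delta_g, kt)


if __name__ == "__main__":
    # Compare a stable protein with a destabilised mutant.
    for dg in (-5.0, -1.0, 0.5):
        print(dg, round(fitness(dg), 4))
```

Under these assumptions a destabilising mutation costs little while the protein is very stable and progressively more as ΔG approaches zero, which is one way to read "depends continuously on folding free energy".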

    The Complex Role of Sequence and Structure in the Stability and Function of the TIM Barrel Proteins

    Sequence divergence of orthologous proteins enables adaptation to a plethora of environmental stresses and promotes evolution of novel functions. As one of the most common motifs in biology capable of diverse enzymatic functions, the TIM barrel represents an ideal model system for mapping the phenotypic manifestations of protein sequence. Limits on evolution imposed by constraints on sequence and structure were investigated using a model TIM barrel protein, indole-3-glycerol phosphate synthase (IGPS). Exploration of fitness landscapes of phylogenetically distant orthologs provides a strategy for elucidating the complex interrelationship in the context of a protein fold. Fitness effects of point mutations in three phylogenetically divergent IGPS proteins during adaptation to temperature stress were probed by auxotrophic complementation of yeast with prokaryotic, thermophilic IGPS. Significant correlations between the fitness landscapes of distant orthologs implicate both sequence and structure as primary forces in defining the TIM barrel fitness landscape. These results suggest that fitness landscapes of point mutants can be successfully translocated in sequence space, where knowledge of one landscape may be predictive for the landscape of another ortholog. Analysis of a surprising class of beneficial mutations in all three IGPS orthologs pointed to a long-range allosteric pathway towards the active site of the protein. Biophysical and biochemical analyses provided insights into the molecular mechanism of these beneficial fitness effects. Epistatic interactions suggest that the helical shell may be involved in the observed allostery. Taken together, knowledge of the fundamental properties of the TIM protein architecture will provide new strategies for de novo protein design of a highly targeted protein fold.

    Adaptive evolution of hybrid bacteria by horizontal gene transfer

    Horizontal gene transfer is an important factor in bacterial evolution that can act across species boundaries. Yet, we know little about the rate and genomic targets of cross-lineage gene transfer, and about its effects on the recipient organism's physiology and fitness. Here, we address these questions in a parallel evolution experiment with two Bacillus subtilis lineages of 7% sequence divergence. We observe rapid evolution of hybrid organisms: gene transfer swaps ~12% of the core genome in just 200 generations, and 60% of core genes are replaced in at least one population. By genomics, transcriptomics, fitness assays, and statistical modeling, we show that transfer generates adaptive evolution and functional alterations in hybrids. Specifically, our experiments reveal a strong, repeatable fitness increase of evolved populations in the stationary growth phase. By genomic analysis of the transfer statistics across replicate populations, we infer that selection on HGT has a broad genetic basis: 40% of the observed transfers are adaptive. At the level of functional gene networks, we find signatures of negative and positive selection, consistent with hybrid incompatibilities and adaptive evolution of network functions. Our results suggest that gene transfer navigates a complex cross-lineage fitness landscape, bridging epistatic barriers along multiple high-fitness paths. Comment: The first three authors are joint first authors. Corresponding authors are Lassig and Maier.