11,372 research outputs found

    Adaptive evolution of transcription factor binding sites

    Get PDF
    The regulation of a gene depends on the binding of transcription factors to specific sites located in the regulatory region of the gene. The generation of these binding sites and of cooperativity between them are essential building blocks in the evolution of complex regulatory networks. We study a theoretical model for the sequence evolution of binding sites by point mutations. The approach is based on biophysical models for the binding of transcription factors to DNA. Hence we derive empirically grounded fitness landscapes, which enter a population genetics model including mutations, genetic drift, and selection. We show that the selection for factor binding generically leads to specific correlations between nucleotide frequencies at different positions of a binding site. We demonstrate the possibility of rapid adaptive evolution generating a new binding site for a given transcription factor by point mutations. The evolutionary time required is estimated in terms of the neutral (background) mutation rate, the selection coefficient, and the effective population size. The efficiency of binding site formation is seen to depend on two joint conditions: the binding site motif must be short enough and the promoter region must be long enough. These constraints on promoter architecture are indeed seen in eukaryotic systems. Furthermore, we analyse the adaptive evolution of genetic switches and of signal integration through binding cooperativity between different sites. Experimental tests of this picture involving the statistics of polymorphisms and phylogenies of sites are discussed.Comment: published versio

    The genetic basis for adaptation of model-designed syntrophic co-cultures.

    Get PDF
    Understanding the fundamental characteristics of microbial communities could have far reaching implications for human health and applied biotechnology. Despite this, much is still unknown regarding the genetic basis and evolutionary strategies underlying the formation of viable synthetic communities. By pairing auxotrophic mutants in co-culture, it has been demonstrated that viable nascent E. coli communities can be established where the mutant strains are metabolically coupled. A novel algorithm, OptAux, was constructed to design 61 unique multi-knockout E. coli auxotrophic strains that require significant metabolite uptake to grow. These predicted knockouts included a diverse set of novel non-specific auxotrophs that result from inhibition of major biosynthetic subsystems. Three OptAux predicted non-specific auxotrophic strains-with diverse metabolic deficiencies-were co-cultured with an L-histidine auxotroph and optimized via adaptive laboratory evolution (ALE). Time-course sequencing revealed the genetic changes employed by each strain to achieve higher community growth rates and provided insight into mechanisms for adapting to the syntrophic niche. A community model of metabolism and gene expression was utilized to predict the relative community composition and fundamental characteristics of the evolved communities. This work presents new insight into the genetic strategies underlying viable nascent community formation and a cutting-edge computational method to elucidate metabolic changes that empower the creation of cooperative communities

    An evolutionary behavioral model for decision making

    Get PDF
    For autonomous agents the problem of deciding what to do next becomes increasingly complex when acting in unpredictable and dynamic environments pursuing multiple and possibly conflicting goals. One of the most relevant behavior-based model that tries to deal with this problem is the one proposed by Maes, the Bbehavior Network model. This model proposes a set of behaviors as purposive perception-action units which are linked in a nonhierarchical network, and whose behavior selection process is orchestrated by spreading activation dynamics. In spite of being an adaptive model (in the sense of self-regulating its own behavior selection process), and despite the fact that several extensions have been proposed in order to improve the original model adaptability, there is not a robust model yet that can self-modify adaptively both the topological structure and the functional purpose\ud of the network as a result of the interaction between the agent and its environment. Thus, this work proffers an innovative hybrid model driven by gene expression programming, which makes two main contributions: (1) given an initial set of meaningless and unconnected units, the evolutionary mechanism is able to build well-defined and robust behavior networks which are adapted and specialized to concrete internal agent's needs and goals; and (2)\ud the same evolutionary mechanism is able to assemble quite\ud complex structures such as deliberative plans (which operate in the long-term) and problem-solving strategies

    Knowledge management overview of feature selection problem in high-dimensional financial data: Cooperative co-evolution and Map Reduce perspectives

    Get PDF
    The term big data characterizes the massive amounts of data generation by the advanced technologies in different domains using 4Vs volume, velocity, variety, and veracity-to indicate the amount of data that can only be processed via computationally intensive analysis, the speed of their creation, the different types of data, and their accuracy. High-dimensional financial data, such as time-series and space-Time data, contain a large number of features (variables) while having a small number of samples, which are used to measure various real-Time business situations for financial organizations. Such datasets are normally noisy, and complex correlations may exist between their features, and many domains, including financial, lack the al analytic tools to mine the data for knowledge discovery because of the high-dimensionality. Feature selection is an optimization problem to find a minimal subset of relevant features that maximizes the classification accuracy and reduces the computations. Traditional statistical-based feature selection approaches are not adequate to deal with the curse of dimensionality associated with big data. Cooperative co-evolution, a meta-heuristic algorithm and a divide-And-conquer approach, decomposes high-dimensional problems into smaller sub-problems. Further, MapReduce, a programming model, offers a ready-To-use distributed, scalable, and fault-Tolerant infrastructure for parallelizing the developed algorithm. This article presents a knowledge management overview of evolutionary feature selection approaches, state-of-The-Art cooperative co-evolution and MapReduce-based feature selection techniques, and future research directions

    A Field Guide to Genetic Programming

    Get PDF
    xiv, 233 p. : il. ; 23 cm.Libro ElectrónicoA Field Guide to Genetic Programming (ISBN 978-1-4092-0073-4) is an introduction to genetic programming (GP). GP is a systematic, domain-independent method for getting computers to solve problems automatically starting from a high-level statement of what needs to be done. Using ideas from natural evolution, GP starts from an ooze of random computer programs, and progressively refines them through processes of mutation and sexual recombination, until solutions emerge. All this without the user having to know or specify the form or structure of solutions in advance. GP has generated a plethora of human-competitive results and applications, including novel scientific discoveries and patentable inventions. The authorsIntroduction -- Representation, initialisation and operators in Tree-based GP -- Getting ready to run genetic programming -- Example genetic programming run -- Alternative initialisations and operators in Tree-based GP -- Modular, grammatical and developmental Tree-based GP -- Linear and graph genetic programming -- Probalistic genetic programming -- Multi-objective genetic programming -- Fast and distributed genetic programming -- GP theory and its applications -- Applications -- Troubleshooting GP -- Conclusions.Contents xi 1 Introduction 1.1 Genetic Programming in a Nutshell 1.2 Getting Started 1.3 Prerequisites 1.4 Overview of this Field Guide I Basics 2 Representation, Initialisation and GP 2.1 Representation 2.2 Initialising the Population 2.3 Selection 2.4 Recombination and Mutation Operators in Tree-based 3 Getting Ready to Run Genetic Programming 19 3.1 Step 1: Terminal Set 19 3.2 Step 2: Function Set 20 3.2.1 Closure 21 3.2.2 Sufficiency 23 3.2.3 Evolving Structures other than Programs 23 3.3 Step 3: Fitness Function 24 3.4 Step 4: GP Parameters 26 3.5 Step 5: Termination and solution designation 27 4 Example Genetic Programming Run 4.1 Preparatory Steps 29 4.2 Step-by-Step Sample Run 31 4.2.1 Initialisation 31 4.2.2 Fitness Evaluation Selection, Crossover and Mutation Termination and Solution Designation Advanced Genetic Programming 5 Alternative Initialisations and Operators in 5.1 Constructing the Initial Population 5.1.1 Uniform Initialisation 5.1.2 Initialisation may Affect Bloat 5.1.3 Seeding 5.2 GP Mutation 5.2.1 Is Mutation Necessary? 5.2.2 Mutation Cookbook 5.3 GP Crossover 5.4 Other Techniques 32 5.5 Tree-based GP 39 6 Modular, Grammatical and Developmental Tree-based GP 47 6.1 Evolving Modular and Hierarchical Structures 47 6.1.1 Automatically Defined Functions 48 6.1.2 Program Architecture and Architecture-Altering 50 6.2 Constraining Structures 51 6.2.1 Enforcing Particular Structures 52 6.2.2 Strongly Typed GP 52 6.2.3 Grammar-based Constraints 53 6.2.4 Constraints and Bias 55 6.3 Developmental Genetic Programming 57 6.4 Strongly Typed Autoconstructive GP with PushGP 59 7 Linear and Graph Genetic Programming 61 7.1 Linear Genetic Programming 61 7.1.1 Motivations 61 7.1.2 Linear GP Representations 62 7.1.3 Linear GP Operators 64 7.2 Graph-Based Genetic Programming 65 7.2.1 Parallel Distributed GP (PDGP) 65 7.2.2 PADO 67 7.2.3 Cartesian GP 67 7.2.4 Evolving Parallel Programs using Indirect Encodings 68 8 Probabilistic Genetic Programming 8.1 Estimation of Distribution Algorithms 69 8.2 Pure EDA GP 71 8.3 Mixing Grammars and Probabilities 74 9 Multi-objective Genetic Programming 75 9.1 Combining Multiple Objectives into a Scalar Fitness Function 75 9.2 Keeping the Objectives Separate 76 9.2.1 Multi-objective Bloat and Complexity Control 77 9.2.2 Other Objectives 78 9.2.3 Non-Pareto Criteria 80 9.3 Multiple Objectives via Dynamic and Staged Fitness Functions 80 9.4 Multi-objective Optimisation via Operator Bias 81 10 Fast and Distributed Genetic Programming 83 10.1 Reducing Fitness Evaluations/Increasing their Effectiveness 83 10.2 Reducing Cost of Fitness with Caches 86 10.3 Parallel and Distributed GP are Not Equivalent 88 10.4 Running GP on Parallel Hardware 89 10.4.1 Master–slave GP 89 10.4.2 GP Running on GPUs 90 10.4.3 GP on FPGAs 92 10.4.4 Sub-machine-code GP 93 10.5 Geographically Distributed GP 93 11 GP Theory and its Applications 97 11.1 Mathematical Models 98 11.2 Search Spaces 99 11.3 Bloat 101 11.3.1 Bloat in Theory 101 11.3.2 Bloat Control in Practice 104 III Practical Genetic Programming 12 Applications 12.1 Where GP has Done Well 12.2 Curve Fitting, Data Modelling and Symbolic Regression 12.3 Human Competitive Results – the Humies 12.4 Image and Signal Processing 12.5 Financial Trading, Time Series, and Economic Modelling 12.6 Industrial Process Control 12.7 Medicine, Biology and Bioinformatics 12.8 GP to Create Searchers and Solvers – Hyper-heuristics xiii 12.9 Entertainment and Computer Games 127 12.10The Arts 127 12.11Compression 128 13 Troubleshooting GP 13.1 Is there a Bug in the Code? 13.2 Can you Trust your Results? 13.3 There are No Silver Bullets 13.4 Small Changes can have Big Effects 13.5 Big Changes can have No Effect 13.6 Study your Populations 13.7 Encourage Diversity 13.8 Embrace Approximation 13.9 Control Bloat 13.10 Checkpoint Results 13.11 Report Well 13.12 Convince your Customers 14 Conclusions Tricks of the Trade A Resources A.1 Key Books A.2 Key Journals A.3 Key International Meetings A.4 GP Implementations A.5 On-Line Resources 145 B TinyGP 151 B.1 Overview of TinyGP 151 B.2 Input Data Files for TinyGP 153 B.3 Source Code 154 B.4 Compiling and Running TinyGP 162 Bibliography 167 Inde

    Knowledge Rich Natural Language Queries over Structured Biological Databases

    Full text link
    Increasingly, keyword, natural language and NoSQL queries are being used for information retrieval from traditional as well as non-traditional databases such as web, document, image, GIS, legal, and health databases. While their popularity are undeniable for obvious reasons, their engineering is far from simple. In most part, semantics and intent preserving mapping of a well understood natural language query expressed over a structured database schema to a structured query language is still a difficult task, and research to tame the complexity is intense. In this paper, we propose a multi-level knowledge-based middleware to facilitate such mappings that separate the conceptual level from the physical level. We augment these multi-level abstractions with a concept reasoner and a query strategy engine to dynamically link arbitrary natural language querying to well defined structured queries. We demonstrate the feasibility of our approach by presenting a Datalog based prototype system, called BioSmart, that can compute responses to arbitrary natural language queries over arbitrary databases once a syntactic classification of the natural language query is made

    Evolutionary Algorithms

    Full text link
    Evolutionary algorithms (EAs) are population-based metaheuristics, originally inspired by aspects of natural evolution. Modern varieties incorporate a broad mixture of search mechanisms, and tend to blend inspiration from nature with pragmatic engineering concerns; however, all EAs essentially operate by maintaining a population of potential solutions and in some way artificially 'evolving' that population over time. Particularly well-known categories of EAs include genetic algorithms (GAs), Genetic Programming (GP), and Evolution Strategies (ES). EAs have proven very successful in practical applications, particularly those requiring solutions to combinatorial problems. EAs are highly flexible and can be configured to address any optimization task, without the requirements for reformulation and/or simplification that would be needed for other techniques. However, this flexibility goes hand in hand with a cost: the tailoring of an EA's configuration and parameters, so as to provide robust performance for a given class of tasks, is often a complex and time-consuming process. This tailoring process is one of the many ongoing research areas associated with EAs.Comment: To appear in R. Marti, P. Pardalos, and M. Resende, eds., Handbook of Heuristics, Springe
    corecore