689 research outputs found

    Probabilistic Clustering of Sequences: Inferring new bacterial regulons by comparative genomics

    Genome-wide comparisons between enteric bacteria yield large sets of conserved putative regulatory sites on a gene-by-gene basis that need to be clustered into regulons. Using the assumption that regulatory sites can be represented as samples from weight matrices we derive a unique probability distribution for assignments of sites into clusters. Our algorithm, 'PROCSE' (probabilistic clustering of sequences), uses Monte-Carlo sampling of this distribution to partition and align thousands of short DNA sequences into clusters. The algorithm internally determines the number of clusters from the data, and assigns significance to the resulting clusters. We place theoretical limits on the ability of any algorithm to correctly cluster sequences drawn from weight matrices (WMs) when these WMs are unknown. Our analysis suggests that the set of all putative sites for a single genome (e.g. E. coli) is largely inadequate for clustering. When sites from different genomes are combined and all the homologous sites from the various species are used as a block, clustering becomes feasible. We predict 50-100 new regulons as well as many new members of existing regulons, potentially doubling the number of known regulatory sites in E. coli. Comment: 27 pages including 9 figures and 3 tables
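
The weight-matrix assumption in the abstract above can be illustrated with a minimal sketch. This is not the PROCSE algorithm; it only shows what "a site is a sample from a weight matrix" means when each position is drawn independently. The function name `site_log_likelihood` and the matrix `toy_wm` are hypothetical and chosen for illustration only.

```python
import math

def site_log_likelihood(site, wm):
    """Log-probability of `site` if position i is drawn independently from column wm[i]."""
    if len(site) != len(wm):
        raise ValueError("site and weight matrix must have the same length")
    return sum(math.log(wm[i][base]) for i, base in enumerate(site))

# Hypothetical 3-column weight matrix (each column is a distribution over A, C, G, T).
toy_wm = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
]

# The consensus site scores higher than a site that mismatches every column.
consensus_score = site_log_likelihood("AGT", toy_wm)
mismatch_score = site_log_likelihood("CCC", toy_wm)
```

A clustering method built on this assumption would compare such likelihoods across candidate clusters; how PROCSE does so is described in the paper itself, not here.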

    Coupled Replicator Equations for the Dynamics of Learning in Multiagent Systems

    Starting with a group of reinforcement-learning agents we derive coupled replicator equations that describe the dynamics of collective learning in multiagent systems. We show that, although agents model their environment in a self-interested way without sharing knowledge, a game dynamics emerges naturally through environment-mediated interactions. An application to rock-scissors-paper game interactions shows that the collective learning dynamics exhibits a diversity of competitive and cooperative behaviors. These include quasiperiodicity, stable limit cycles, intermittency, and deterministic chaos--behaviors that should be expected in heterogeneous multiagent systems described by the general replicator equations we derive. Comment: 4 pages, 3 figures, http://www.santafe.edu/projects/CompMech/papers/credlmas.html; updated references, corrected typos, changed content
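
As a rough illustration of the kind of dynamics the abstract above refers to, the sketch below integrates the textbook two-population replicator equations, dx_i/dt = x_i [(A y)_i - x·A y], for a zero-sum rock-scissors-paper game. This is an assumption made for illustration: the equations derived in the paper may contain further terms, and the payoff matrix `A`, the Euler scheme, and the function names are not taken from the source.

```python
# Zero-sum rock-scissors-paper payoffs; rows and columns ordered rock, scissors, paper.
A = [[0, 1, -1],
     [-1, 0, 1],
     [1, -1, 0]]

def replicator_step(x, y, payoff, dt):
    """One Euler step of dx_i/dt = x_i * ((payoff y)_i - x . payoff y) for an agent x facing y."""
    fitness = [sum(payoff[i][j] * y[j] for j in range(len(y))) for i in range(len(x))]
    mean_fitness = sum(xi * fi for xi, fi in zip(x, fitness))
    return [xi + dt * xi * (fi - mean_fitness) for xi, fi in zip(x, fitness)]

def simulate(x, y, steps=1000, dt=0.01):
    """Advance both agents' mixed strategies simultaneously for `steps` Euler steps."""
    for _ in range(steps):
        # Both updates use the strategies from the previous step.
        x, y = replicator_step(x, y, A, dt), replicator_step(y, x, A, dt)
    return x, y

x_final, y_final = simulate([0.5, 0.3, 0.2], [0.2, 0.5, 0.3])
```

Each update preserves the total probability of a strategy vector, so both agents stay on the simplex, and the uniform mixed strategy is a fixed point of these equations.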

    Evolutionary games and quasispecies

    We discuss a population of sequences subject to mutations and frequency-dependent selection, where the fitness of a sequence depends on the composition of the entire population. This type of dynamics is crucial to understanding the evolution of genomic regulation. Mathematically, it takes the form of a reaction-diffusion problem that is nonlinear in the population state. In our model system, the fitness is determined by a simple mathematical game, the hawk-dove game. The stationary population distribution is found to be a quasispecies with properties different from those which hold in fixed fitness landscapes. Comment: 7 pages, 2 figures. Typos corrected, references updated. An exact solution for the hawk-dove game is provided

    On the Neutrality of Flowshop Scheduling Fitness Landscapes

    Efficiently solving complex problems with metaheuristics, and local search in particular, requires incorporating knowledge about the problem being solved. In this paper, the permutation flowshop problem is studied. It is well known that in such problems, several solutions may have the same fitness value. As this neutrality property is an important one, it should be taken into account during the design of optimization methods. In the context of the permutation flowshop, a deep landscape analysis focused on the neutrality property is carried out, and proposals are given for how to use this neutrality to guide the search efficiently. Comment: Learning and Intelligent OptimizatioN Conference (LION 5), Rome, Italy (2011)

    Joint scaling laws in functional and evolutionary categories in prokaryotic genomes

    We propose and study a class-expansion/innovation/loss model of genome evolution taking into account biological roles of genes and their constituent domains. In our model, numbers of genes in different functional categories are coupled to each other. For example, an increase in the number of metabolic enzymes in a genome is usually accompanied by addition of new transcription factors regulating these enzymes. Such coupling can be thought of as a proportional "recipe" for genome composition of the type "a spoonful of sugar for each egg yolk". The model jointly reproduces two known empirical laws: the distribution of family sizes and the nonlinear scaling of the number of genes in certain functional categories (e.g. transcription factors) with genome size. In addition, it allows us to derive a novel relation between the exponents characterising these two scaling laws, establishing a direct quantitative connection between evolutionary and functional categories. It predicts that functional categories that grow faster than linearly with genome size are characterised by flatter-than-average family size distributions. This relation is confirmed by our bioinformatics analysis of prokaryotic genomes. This proves that the joint quantitative trends of functional and evolutionary classes can be understood in terms of evolutionary growth with proportional recipes. Comment: 39 pages, 21 figures

    Co-Evolution of quasispecies: B-cell mutation rates maximize viral error catastrophes

    Co-evolution of two coupled quasispecies is studied, motivated by the competition between viral evolution and adapting immune response. In this co-adaptive model, besides the classical error catastrophe for high virus mutation rates, a second "adaptation" catastrophe occurs, when virus mutation rates are too small to escape immune attack. Maximizing both regimes of viral error catastrophes is a possible strategy for an optimal immune response, reducing the range of allowed viral mutation rates to a minimum. From this requirement one obtains constraints on B-cell mutation rates and receptor lengths, yielding an estimate of somatic hypermutation rates in the germinal center in accordance with observation. Comment: 4 pages RevTeX including 2 figures

    Does the Red Queen reign in the kingdom of digital organisms?

    In competition experiments between two RNA viruses of equal or almost equal fitness, often both strains gain in fitness before one eventually excludes the other. This observation has been linked to the Red Queen effect, which describes a situation in which organisms have to constantly adapt just to keep their status quo. I carried out experiments with digital organisms (self-replicating computer programs) in order to clarify how the competing strains' location in fitness space influences the Red-Queen effect. I found that gains in fitness during competition were prevalent for organisms that were taken from the base of a fitness peak, but absent or rare for organisms that were taken from the top of a peak or from a considerable distance away from the nearest peak. In the latter two cases, either neutral drift and loss of the fittest mutants or the waiting time to the first beneficial mutation were more important factors. Moreover, I found that the Red-Queen dynamic in general led to faster exclusion than the other two mechanisms. Comment: 10 pages, 5 eps figures

    The Error and Repair Catastrophes: A Two-Dimensional Phase Diagram in the Quasispecies Model

    This paper develops a two gene, single fitness peak model for determining the equilibrium distribution of genotypes in a unicellular population which is capable of genetic damage repair. The first gene, denoted by $\sigma_{via}$, yields a viable organism with first order growth rate constant $k > 1$ if it is equal to some target "master" sequence $\sigma_{via,0}$. The second gene, denoted by $\sigma_{rep}$, yields an organism capable of genetic repair if it is equal to some target "master" sequence $\sigma_{rep,0}$. This model is analytically solvable in the limit of infinite sequence length, and gives an equilibrium distribution which depends on $\mu \equiv L\epsilon$, the product of sequence length and per base pair replication error probability, and $\epsilon_r$, the probability of repair failure per base pair. The equilibrium distribution is shown to exist in one of three possible "phases." In the first phase, the population is localized about the viability and repairing master sequences. As $\epsilon_r$ exceeds the fraction of deleterious mutations, the population undergoes a "repair" catastrophe, in which the equilibrium distribution is still localized about the viability master sequence, but is spread ergodically over the sequence subspace defined by the repair gene. Below the repair catastrophe, the distribution undergoes the error catastrophe when $\mu$ exceeds $\ln k/\epsilon_r$, while above the repair catastrophe, the distribution undergoes the error catastrophe when $\mu$ exceeds $\ln k/f_{del}$, where $f_{del}$ denotes the fraction of deleterious mutations. Comment: 14 pages, 3 figures. Submitted to Physical Review
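
The phase boundaries stated in the abstract above can be written as a short sketch. It encodes only the three conditions quoted there (repair catastrophe when the repair failure probability exceeds the fraction of deleterious mutations; error catastrophe when mu exceeds ln k divided by the relevant quantity). The function name `classify_phase`, the phase labels, and the handling of a zero denominator are assumptions made for illustration.

```python
import math

def classify_phase(k, mu, eps_r, f_del):
    """Name the phase implied by the thresholds quoted in the abstract.

    k     -- first order growth rate constant of viable organisms (k > 1)
    mu    -- sequence length times per base pair replication error probability
    eps_r -- probability of repair failure per base pair
    f_del -- fraction of deleterious mutations
    """
    if k <= 1:
        raise ValueError("the model assumes k > 1")
    repair_lost = eps_r > f_del  # repair catastrophe condition
    denominator = f_del if repair_lost else eps_r
    # Assumption: with a zero denominator the error threshold is never reached.
    threshold = math.log(k) / denominator if denominator > 0 else math.inf
    if mu > threshold:
        return "error catastrophe"
    return "repair catastrophe" if repair_lost else "viable and repairing"

# Hypothetical parameter values, chosen only to land in each phase.
phase_a = classify_phase(k=10, mu=1, eps_r=0.1, f_del=0.5)
phase_b = classify_phase(k=10, mu=1, eps_r=0.8, f_del=0.5)
phase_c = classify_phase(k=10, mu=30, eps_r=0.1, f_del=0.5)
```

Sweeping `mu` and `eps_r` over a grid with this function would trace out the two-dimensional phase diagram named in the title, under the same assumptions.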