Probabilistic Clustering of Sequences: Inferring new bacterial regulons by comparative genomics
Genome wide comparisons between enteric bacteria yield large sets of
conserved putative regulatory sites on a gene by gene basis that need to be
clustered into regulons. Using the assumption that regulatory sites can be
represented as samples from weight matrices we derive a unique probability
distribution for assignments of sites into clusters. Our algorithm, 'PROCSE'
(probabilistic clustering of sequences), uses Monte-Carlo sampling of this
distribution to partition and align thousands of short DNA sequences into
clusters. The algorithm internally determines the number of clusters from the
data, and assigns significance to the resulting clusters. We place theoretical
limits on the ability of any algorithm to correctly cluster sequences drawn
from weight matrices (WMs) when these WMs are unknown. Our analysis suggests
that the set of all putative sites for a single genome (e.g. E. coli) is
largely inadequate for clustering. When sites from different genomes are
combined and all the homologous sites from the various species are used as a
block, clustering becomes feasible. We predict 50-100 new regulons as well as
many new members of existing regulons, potentially doubling the number of known
regulatory sites in E. coli.
Comment: 27 pages including 9 figures and 3 tables
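The idea of treating sites as samples from an unknown weight matrix can be illustrated with a small scoring function. This is only a sketch under stated assumptions: it uses a uniform Dirichlet prior over the base frequencies of each column and the standard Dirichlet-multinomial marginal likelihood. The abstract does not say whether PROCSE uses exactly this prior, and the names `log_marginal_likelihood` and `log_merge_ratio` are hypothetical.

```python
import math
from collections import Counter

ALPHABET = "ACGT"


def log_marginal_likelihood(sites):
    """Log probability that equal-length sites all came from one unknown WM.

    Assumption (not from the abstract): each WM column has a uniform prior
    over base frequencies, so the column probability integrates to
    Gamma(4) * prod_a Gamma(n_a + 1) / Gamma(n + 4).
    """
    n = len(sites)
    width = len(sites[0])
    if any(len(s) != width for s in sites):
        raise ValueError("all sites must have the same length")
    total = 0.0
    for pos in range(width):
        counts = Counter(s[pos] for s in sites)  # missing bases count as 0
        total += math.lgamma(4) - math.lgamma(n + 4)
        total += sum(math.lgamma(counts[a] + 1) for a in ALPHABET)
    return total


def log_merge_ratio(cluster_a, cluster_b):
    """Log odds that two clusters share one WM rather than having two WMs."""
    merged = log_marginal_likelihood(cluster_a + cluster_b)
    separate = (log_marginal_likelihood(cluster_a)
                + log_marginal_likelihood(cluster_b))
    return merged - separate
```

A positive `log_merge_ratio` favours placing the two groups of sites in the same cluster. A sampler over cluster assignments could use such ratios as move probabilities, but that step is not shown here.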
Coupled Replicator Equations for the Dynamics of Learning in Multiagent Systems
Starting with a group of reinforcement-learning agents we derive coupled
replicator equations that describe the dynamics of collective learning in
multiagent systems. We show that, although agents model their environment in a
self-interested way without sharing knowledge, a game dynamics emerges
naturally through environment-mediated interactions. An application to
rock-scissors-paper game interactions shows that the collective learning
dynamics exhibits a diversity of competitive and cooperative behaviors. These
include quasiperiodicity, stable limit cycles, intermittency, and deterministic
chaos--behaviors that should be expected in heterogeneous multiagent systems
described by the general replicator equations we derive.
Comment: 4 pages, 3 figures,
http://www.santafe.edu/projects/CompMech/papers/credlmas.html; updated
references, corrected typos, changed content
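To make the notion of coupled replicator dynamics concrete, here is a minimal sketch for two agents playing rock-scissors-paper. It integrates the textbook two-population replicator equations with a forward Euler step. It is a simplified stand-in: any additional terms in the equations derived in the paper are not given in the abstract and are not modelled here. The function names and the payoff matrix `RSP` are illustrative assumptions.

```python
# Payoff to the row action against the column action.
# Order: rock, scissors, paper (rock beats scissors, scissors beats paper,
# paper beats rock). Win = +1, loss = -1, tie = 0.
RSP = [
    [0, 1, -1],
    [-1, 0, 1],
    [1, -1, 0],
]


def matvec(matrix, vector):
    """Plain-Python matrix-vector product."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def replicator_step(x, y, payoff_x, payoff_y, dt):
    """One Euler step of dx_i/dt = x_i * (f_i - mean f), and likewise for y."""
    fx = matvec(payoff_x, y)  # payoff of each action of agent X against y
    fy = matvec(payoff_y, x)  # payoff of each action of agent Y against x
    mean_x = sum(xi * fi for xi, fi in zip(x, fx))
    mean_y = sum(yi * fi for yi, fi in zip(y, fy))
    new_x = [xi + dt * xi * (fi - mean_x) for xi, fi in zip(x, fx)]
    new_y = [yi + dt * yi * (fi - mean_y) for yi, fi in zip(y, fy)]
    return new_x, new_y


def simulate(x0, y0, steps, dt=0.01):
    """Iterate the coupled dynamics from initial mixed strategies x0, y0."""
    x, y = list(x0), list(y0)
    for _ in range(steps):
        x, y = replicator_step(x, y, RSP, RSP, dt)
    return x, y
```

Starting both agents at the uniform strategy leaves them there, since every action then earns the mean payoff. Other starting points move along orbits on the product of the two strategy simplices.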
Evolutionary games and quasispecies
We discuss a population of sequences subject to mutations and
frequency-dependent selection, where the fitness of a sequence depends on the
composition of the entire population. This type of dynamics is crucial for
understanding the evolution of genomic regulation. Mathematically, it takes the
form of a reaction-diffusion problem that is nonlinear in the population state.
In our model system, the fitness is determined by a simple mathematical game,
the hawk-dove game. The stationary population distribution is found to be a
quasispecies with properties different from those which hold in fixed fitness
landscapes.
Comment: 7 pages, 2 figures. Typos corrected, references updated. An exact
solution for the hawk-dove game is provided
On the Neutrality of Flowshop Scheduling Fitness Landscapes
Solving complex problems efficiently with metaheuristics, and in particular
local searches, requires incorporating knowledge about the problem at hand. In
this paper, the permutation flowshop problem is studied. It is well known that
in such problems, several solutions may have the same fitness value. As this
neutrality property is an important one, it should be taken into account during
the design of optimization methods. Accordingly, in the context of the
permutation flowshop, an in-depth landscape analysis focused on the neutrality
property is carried out, and proposals are given on how to use this neutrality
to guide the search efficiently.
Comment: Learning and Intelligent OptimizatioN Conference (LION 5), Rome,
Italy (2011)
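Neutrality in the permutation flowshop can be seen on a toy instance. The sketch below computes the makespan of a job permutation and lists the neighbours, under a pairwise-swap neighbourhood, that have exactly the same makespan. The choice of makespan as the fitness, the swap neighbourhood, and the function names are assumptions for illustration, not details taken from the paper.

```python
from itertools import combinations


def makespan(perm, p):
    """Completion time of the last job on the last machine.

    p[job][machine] is the processing time; every job visits the machines
    in the same order and every machine sees the jobs in the order `perm`.
    """
    machines = len(p[0])
    finish = [0] * machines  # finish[m]: time machine m becomes free
    for job in perm:
        for m in range(machines):
            ready = finish[m - 1] if m > 0 else 0  # job done on machine m-1
            finish[m] = max(finish[m], ready) + p[job][m]
    return finish[-1]


def neutral_neighbors(perm, p):
    """Swap-neighbours of `perm` whose makespan equals that of `perm`."""
    base = makespan(perm, p)
    result = []
    for i, j in combinations(range(len(perm)), 2):
        neighbor = list(perm)
        neighbor[i], neighbor[j] = neighbor[j], neighbor[i]
        if makespan(neighbor, p) == base:
            result.append(tuple(neighbor))
    return result


# Toy instance: 3 jobs, 2 machines.
times = [[3, 2], [1, 4], [2, 1]]
```

On this instance the permutations (1, 0, 2) and (1, 2, 0) share a makespan of 8, so each is a neutral neighbour of the other. The ratio of neutral neighbours to all neighbours gives a simple neutrality degree for a solution.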
Joint scaling laws in functional and evolutionary categories in prokaryotic genomes
We propose and study a class-expansion/innovation/loss model of genome
evolution taking into account biological roles of genes and their constituent
domains. In our model numbers of genes in different functional categories are
coupled to each other. For example, an increase in the number of metabolic
enzymes in a genome is usually accompanied by addition of new transcription
factors regulating these enzymes. Such coupling can be thought of as a
proportional "recipe" for genome composition of the type "a spoonful of sugar
for each egg yolk". The model jointly reproduces two known empirical laws: the
distribution of family sizes and the nonlinear scaling of the number of genes
in certain functional categories (e.g. transcription factors) with genome size.
In addition, it allows us to derive a novel relation between the exponents
characterising these two scaling laws, establishing a direct quantitative
connection between evolutionary and functional categories. It predicts that
functional categories that grow faster than linearly with genome size are
characterised by flatter-than-average family size distributions. This relation
is confirmed by our bioinformatics analysis of prokaryotic genomes. This proves
that the joint quantitative trends of functional and evolutionary classes can
be understood in terms of evolutionary growth with proportional recipes.
Comment: 39 pages, 21 figures
Co-Evolution of quasispecies: B-cell mutation rates maximize viral error catastrophes
Co-evolution of two coupled quasispecies is studied, motivated by the
competition between viral evolution and adapting immune response. In this
co-adaptive model, besides the classical error catastrophe for high virus
mutation rates, a second "adaptation" catastrophe occurs when virus
mutation rates are too small to escape immune attack. Maximizing both regimes
of viral error catastrophes is a possible strategy for an optimal immune
response, reducing the range of allowed viral mutation rates to a minimum. From
this requirement one obtains constraints on B-cell mutation rates and receptor
lengths, yielding an estimate of somatic hypermutation rates in the germinal
center in accordance with observation.
Comment: 4 pages RevTeX including 2 figures
Does the Red Queen reign in the kingdom of digital organisms?
In competition experiments between two RNA viruses of equal or almost equal
fitness, often both strains gain in fitness before one eventually excludes the
other. This observation has been linked to the Red Queen effect, which
describes a situation in which organisms have to constantly adapt just to keep
their status quo. I carried out experiments with digital organisms
(self-replicating computer programs) in order to clarify how the competing
strains' location in fitness space influences the Red-Queen effect. I found
that gains in fitness during competition were prevalent for organisms that were
taken from the base of a fitness peak, but absent or rare for organisms that
were taken from the top of a peak or from a considerable distance away from the
nearest peak. In the latter two cases, either neutral drift and loss of the
fittest mutants or the waiting time to the first beneficial mutation were more
important factors. Moreover, I found that the Red-Queen dynamic in general led
to faster exclusion than the other two mechanisms.
Comment: 10 pages, 5 eps figures
The Error and Repair Catastrophes: A Two-Dimensional Phase Diagram in the Quasispecies Model
This paper develops a two gene, single fitness peak model for determining the
equilibrium distribution of genotypes in a unicellular population which is
capable of genetic damage repair. The first gene, denoted by $\sigma_{via}$,
yields a viable organism with first-order growth rate constant $k > 1$ if it
is equal to some target "master" sequence $\sigma_{via,0}$. The second
gene, denoted by $\sigma_{rep}$, yields an organism capable of genetic repair
if it is equal to some target "master" sequence $\sigma_{rep,0}$. This
model is analytically solvable in the limit of infinite sequence length, and
gives an equilibrium distribution which depends on $\mu \equiv L\epsilon$, the
product of sequence length and per base pair replication error probability, and
$\epsilon_r$, the probability of repair failure per base pair. The equilibrium
distribution is shown to exist in one of three possible "phases." In the
first phase, the population is localized about the viability and repairing
master sequences. As $\epsilon_r$ exceeds the fraction of deleterious mutations,
the population undergoes a "repair" catastrophe, in which the equilibrium
distribution is still localized about the viability master sequence, but is
spread ergodically over the sequence subspace defined by the repair gene. Below
the repair catastrophe, the distribution undergoes the error catastrophe when
$\mu$ exceeds $\ln k/\epsilon_r$, while above the repair catastrophe, the
distribution undergoes the error catastrophe when $\mu$ exceeds
$\ln k/f_{del}$, where $f_{del}$ denotes the fraction of deleterious mutations.
Comment: 14 pages, 3 figures. Submitted to Physical Review
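The phase boundaries stated in this abstract can be written as a small classifier. This sketch encodes only what the abstract says: the repair catastrophe occurs when the repair failure probability exceeds the fraction of deleterious mutations, and the error catastrophe occurs when mu exceeds ln k divided by whichever of the two quantities applies. The names `f_del`, `error_threshold` and `classify_phase`, the phase labels, and the treatment of exact equality at a boundary are assumptions for illustration.

```python
import math


def error_threshold(eps_r, k, f_del):
    """Value of mu = L * eps above which the error catastrophe sets in.

    eps_r : probability of repair failure per base pair (must be > 0)
    k     : first-order growth rate constant of viable organisms (must be > 1)
    f_del : fraction of deleterious mutations (must be > 0)
    """
    if k <= 1 or eps_r <= 0 or f_del <= 0:
        raise ValueError("need k > 1, eps_r > 0 and f_del > 0")
    if eps_r > f_del:
        # Above the repair catastrophe the threshold is ln k / f_del.
        return math.log(k) / f_del
    # Below the repair catastrophe the threshold is ln k / eps_r.
    return math.log(k) / eps_r


def classify_phase(mu, eps_r, k, f_del):
    """Return a label for the equilibrium phase at the given parameters."""
    if mu > error_threshold(eps_r, k, f_del):
        return "error catastrophe"
    if eps_r > f_del:
        return "repair catastrophe"  # localized on the viability gene only
    return "localized"  # localized on both master sequences
```

The two threshold expressions agree when `eps_r` equals `f_del`, so the error-catastrophe boundary is continuous across the repair catastrophe line in the two-dimensional phase diagram.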