Waiting for regulatory sequences to appear
One possible explanation for the substantial organismal differences between
humans and chimpanzees is that there have been changes in gene regulation.
Given what is known about transcription factor binding sites, this motivates
the following probability question: given a 1000 nucleotide region in our
genome, how long does it take for a specified six to nine letter word to appear
in that region in some individual? Stone and Wray [Mol. Biol. Evol. 18 (2001)
1764--1770] computed 5,950 years as the answer for six letter words. Here, we
will show that for words of length 6, the average waiting time is 100,000
years, while for words of length 8, the waiting time has mean 375,000 years
when there is a 7 out of 8 letter match in the population consensus sequence
(an event of probability roughly 5/16) and has mean 650 million years when
there is not. Fortunately, in biological reality, the match to the target word
does not have to be perfect for binding to occur. If we model this by saying
that a 7 out of 8 letter match is good enough, the mean reduces to about 60,000
years. Comment: Published at http://dx.doi.org/10.1214/105051606000000619 in the
Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute
of Mathematical Statistics (http://www.imstat.org)
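The "roughly 5/16" figure quoted in the abstract above can be sanity-checked with a back-of-the-envelope Poisson approximation. The sketch below is not the authors' calculation. It assumes independent, uniformly distributed nucleotides, ignores the dependence between overlapping windows, and counts windows that differ from the target word in exactly one position. The function name `prob_near_match` and its parameters are hypothetical, chosen only for illustration.

```python
import math


def prob_near_match(region_len=1000, word_len=8, mismatches=1, alphabet=4):
    """Poisson approximation to the chance that at least one window of the
    region differs from a fixed target word in exactly `mismatches` positions.

    Assumptions (not from the paper): i.i.d. uniform letters, and windows
    treated as independent even though they overlap.
    """
    # Number of starting positions for a word of this length.
    windows = region_len - word_len + 1
    # Probability that one random window has exactly `mismatches` wrong letters.
    p_window = (
        math.comb(word_len, mismatches)
        * (alphabet - 1) ** mismatches
        / alphabet ** word_len
    )
    # Expected number of near-matching windows in the region.
    expected = windows * p_window
    # Poisson approximation: P(at least one) = 1 - exp(-expected).
    return 1.0 - math.exp(-expected)


if __name__ == "__main__":
    p = prob_near_match()
    print(f"approximate probability: {p:.3f}, compared with 5/16 = {5 / 16:.4f}")
```

Under these simplifying assumptions the result lands close to 5/16, which suggests the quoted probability is of the right order. The exact figure in the paper may rest on a different model.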
Phase transitions for a planar quadratic contact process
We study a two dimensional version of Neuhauser's long range sexual
reproduction model and prove results that give bounds on the critical values
λ_f for the process to survive from a finite set and λ_e for
the existence of a nontrivial stationary distribution. Our first result comes
from a standard block construction, while the second involves a comparison with
the "generic population model" of Bramson and Gray (1991). An interesting new
feature of our work is the suggestion that, as in the one dimensional contact
process, edge speeds characterize critical values. We are able to prove the
following for our quadratic contact process when the range is large, but suspect
it is true for two dimensional finite range attractive particle systems that
are symmetric with respect to reflection in each axis. There is a speed c(θ)
for the expansion of the process in each direction. If c(θ) > 0 in all directions, then λ > λ_f, while if at least one speed
is positive, then λ > λ_e. It is a challenging open problem to
show that if some speed is negative, then the system dies out from any finite
set.
The stepping stone model. II: Genealogies and the infinite sites model
This paper extends earlier work by Cox and Durrett, who studied the
coalescence times for two lineages in the stepping stone model on the
two-dimensional torus. We show that the genealogy of a sample of size n is
given by a time change of Kingman's coalescent. With DNA sequence data in mind,
we investigate mutation patterns under the infinite sites model, which assumes
that each mutation occurs at a new site. Our results suggest that the spatial
structure of the human population contributes to the haplotype structure and a
slower than expected decay of genetic correlation with distance revealed by
recent studies of the human genome. Comment: Published at http://dx.doi.org/10.1214/105051604000000701 in the
Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute
of Mathematical Statistics (http://www.imstat.org)
Voter Model Perturbations and Reaction Diffusion Equations
We consider particle systems that are perturbations of the voter model and
show that when space and time are rescaled the system converges to a solution
of a reaction diffusion equation in dimensions d ≥ 3. Combining this result
with properties of the PDE, some methods arising from a low density
super-Brownian limit theorem, and a block construction, we give general, and
often asymptotically sharp, conditions for the existence of non-trivial
stationary distributions, and for extinction of one type. As applications, we
describe the phase diagrams of three systems when the parameters are close to
the voter model: (i) a stochastic spatial Lotka-Volterra model of Neuhauser and
Pacala, (ii) a model of the evolution of cooperation of Ohtsuki, Hauert,
Lieberman, and Nowak, and (iii) a continuous time version of the non-linear
voter model of Molofsky, Durrett, Dushoff, Griffeath, and Levin. The first
application confirms a conjecture of Cox and Perkins and the second confirms a
conjecture of Ohtsuki et al. in the context of certain infinite graphs. An
important feature of our general results is that they do not require the
process to be attractive. Comment: 106 pages, 7 figures
Simple models of genomic variation in human SNP density
Background: Descriptive hierarchical Poisson models and population-genetic coalescent mixture models are used to describe the observed variation in single-nucleotide polymorphism (SNP) density from samples of size two across the human genome. Results: Using empirical estimates of recombination rate across the human genome and the observed SNP density distribution, we produce a maximum likelihood estimate of the genomic heterogeneity in the scaled mutation rate θ. Such models produce significantly better fits to the observed SNP density distribution than those that ignore the empirically observed recombinational heterogeneities. Conclusion: Accounting for mutational and recombinational heterogeneities can allow for empirically sound null distributions in genome scans for "outliers", when the alternative hypotheses include fundamentally historical and unobserved phenomena.
Duality and perfect probability spaces
Abstract. Given probability spaces (Xi, Ai, Pi), i = 1, 2, let M(P1, P2) denote the set of all probabilities on the product space with marginals P1 and P2 and let h be a measurable function on (X1 × X2, A1 ⊗ A2). Continuous versions of linear programming stemming from the works of Monge (1781) and Kantorovich-Rubinštein (1958) for the case of compact metric spaces are concerned with the validity of the duality sup { ∫ h dP : P ∈ M(P1, P2)
- …