The process of most recent common ancestors in an evolving coalescent
Consider a haploid population which has evolved through an exchangeable
reproduction dynamics, and in which all individuals alive at a given time have a
most recent common ancestor (MRCA) who lived at some earlier time. As time goes
on, not only the population but also its genealogy evolves: some families will
get lost from the population and eventually a new MRCA will be established. For
a time-stationary situation and in the limit of infinite population size,
with time measured in units of the population size in generations, i.e. in the scaling of population
genetics which leads to Fisher-Wright diffusions and Kingman's coalescent, we
study the MRCA process, whose jumps form the point process of
time pairs recording when new MRCAs are established and when they lived. By
representing these pairs as the entrance and exit time of particles whose
trajectories are embedded in the look-down graph of Donnelly and Kurtz (1999)
we can show by exchangeability arguments that the times of establishment as well as the
times at which the MRCAs lived each form a Poisson process. Furthermore, the particle representation
helps to compute various features of the MRCA process, such as the distribution
of the coalescent at the instant when a new MRCA is established, and the
distribution of the number of MRCAs to come that live in today's past.
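As a rough illustration of the coalescent time scale mentioned in this abstract, the sketch below simulates the time to the MRCA of a finite sample of n lineages under Kingman's coalescent, where k lineages merge at rate k(k-1)/2 in coalescent time units. It is a finite-sample toy and does not reproduce the infinite-population MRCA process or the look-down construction; the function names are hypothetical.

```python
import random

def expected_tmrca(n):
    """Expected time to the MRCA of n lineages: sum over k of 2/(k(k-1)) = 2(1 - 1/n)."""
    return 2.0 * (1.0 - 1.0 / n)

def simulate_tmrca(n, rng):
    """Draw one time to the MRCA: while k lineages remain, wait an Exp(k(k-1)/2) time."""
    t = 0.0
    for k in range(n, 1, -1):
        t += rng.expovariate(k * (k - 1) / 2.0)
    return t

def mean_tmrca(n, reps, seed=0):
    """Average simulated time to the MRCA over `reps` independent genealogies."""
    rng = random.Random(seed)
    return sum(simulate_tmrca(n, rng) for _ in range(reps)) / reps
```

Comparing `mean_tmrca(10, 20000)` with `expected_tmrca(10)` gives a quick sanity check that the simulated depth of the genealogy matches the closed-form expectation.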
Inference of historical population-size changes with allele-frequency data
With up to millions of nearly neutral polymorphisms now being routinely sampled in population-genomic surveys, it is possible to estimate the site-frequency spectrum of such sites with high precision. Each frequency class reflects a mixture of potentially unique demographic histories, which can be revealed using theory for the probability distributions of the starting and ending points of branch segments over all possible coalescence trees. Such distributions are completely independent of past population history, which only influences the segment lengths, providing the basis for estimating average population sizes separating tree-wide coalescence events. The history of population-size change experienced by a sample of polymorphisms can then be dissected in a model-flexible fashion, and extension of this theory allows estimation of the mean and full distribution of long-term effective population sizes and ages of alleles of specific frequencies. Here, we outline the basic theory underlying the conceptual approach, develop and test an efficient statistical procedure for parameter estimation, and apply this to multiple population-genomic datasets for the microcrustacean Daphnia pulex.
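For orientation, the sketch below computes the standard neutral baseline against which such inference is usually read: under a constant population size and the infinite-sites model, the expected number of sites at which the derived allele appears i times in a sample of n is theta / i, with theta the population-scaled mutation rate. This is the textbook constant-size expectation, not the estimation procedure of the abstract; the function names are hypothetical.

```python
def expected_sfs(n, theta):
    """Expected site-frequency spectrum for derived-allele counts i = 1..n-1 (theta / i each)."""
    return [theta / i for i in range(1, n)]

def expected_segregating_sites(n, theta):
    """Expected total number of segregating sites: theta times the (n-1)-th harmonic number."""
    return sum(expected_sfs(n, theta))
```

An observed spectrum that departs from this 1/i shape is the kind of signal that methods for inferring population-size change try to interpret.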
The diversity of a distributed genome in bacterial populations
The distributed genome hypothesis states that the set of genes in a
population of bacteria is distributed over all individuals that belong to the
specific taxon. It implies that certain genes can be gained and lost from
generation to generation. We use the random genealogy given by a Kingman
coalescent in order to superimpose events of gene gain and loss along ancestral
lines. Gene gains occur at a constant rate along ancestral lines. We assume
that gained genes have never been present in the population before. Gene losses
occur at a rate proportional to the number of genes present along the ancestral
line. In this infinitely many genes model we derive moments for several
statistics within a sample: the average number of genes per individual, the
average number of genes differing between individuals, the number of
incongruent pairs of genes, the total number of different genes in the sample
and the gene frequency spectrum. We demonstrate that the model gives a
reasonable fit with gene frequency data from marine cyanobacteria.
Comment: Published in the Annals of Applied Probability (http://www.imstat.org/aap/)
at http://dx.doi.org/10.1214/09-AAP657 by the Institute of
Mathematical Statistics (http://www.imstat.org)
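The gain and loss mechanism described in this abstract can be sketched along a single ancestral line: with gains at a constant rate and each present gene lost independently at a fixed rate, the gene count is an immigration-death process whose stationary mean is the gain rate divided by the per-gene loss rate. The sketch below ignores the coalescent genealogy entirely, and the parameter names are hypothetical and need not match the paper's parametrisation.

```python
import random

def time_averaged_gene_count(gain_rate, loss_rate, total_time, seed=0):
    """Time-averaged number of genes on one line, starting from zero genes."""
    rng = random.Random(seed)
    t, genes, area = 0.0, 0, 0.0
    while True:
        # Total event rate: constant gain plus loss proportional to genes present.
        total_rate = gain_rate + loss_rate * genes
        wait = rng.expovariate(total_rate)
        if t + wait >= total_time:
            area += genes * (total_time - t)
            break
        area += genes * wait
        t += wait
        # Choose the event type in proportion to its rate.
        if rng.random() < gain_rate / total_rate:
            genes += 1
        else:
            genes -= 1
    return area / total_time
```

For a long run, `time_averaged_gene_count(5.0, 1.0, 2000.0)` should lie close to the stationary mean of 5.0 genes.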
Single-crossover dynamics: finite versus infinite populations
Populations evolving under the joint influence of recombination and
resampling (traditionally known as genetic drift) are investigated. First, we
summarise and adapt a deterministic approach, as valid for infinite
populations, which assumes continuous time and single crossover events. The
corresponding nonlinear system of differential equations permits a closed
solution, both in terms of the type frequencies and via linkage disequilibria
of all orders. To include stochastic effects, we then consider the
corresponding finite-population model, the Moran model with single crossovers,
and examine it both analytically and by means of simulations. Particular
emphasis is on the connection with the deterministic solution. If there is only
recombination and every pair of recombined offspring replaces their pair of
parents (i.e., there is no resampling), then the expected type
frequencies in the finite population, of arbitrary size, equal the type
frequencies in the infinite population. If resampling is included, the
stochastic process converges, in the infinite-population limit, to the
deterministic dynamics, which turns out to be a good approximation already for
populations of moderate size.
Comment: 21 pages, 4 figures
Estimating Parameters of Speciation Models Based on Refined Summaries of the Joint Site-Frequency Spectrum
Understanding the processes and conditions under which populations diverge to give rise to distinct species is a central question in evolutionary biology. Since recently diverged populations have high levels of shared polymorphisms, it is challenging to distinguish between recent divergence with no (or very low) inter-population gene flow and older splitting events with subsequent gene flow. Recently published methods to infer speciation parameters under the isolation-migration framework are based on summarizing polymorphism data at multiple loci in two species using the joint site-frequency spectrum (JSFS). We have developed two improvements of these methods based on a more extensive use of the JSFS classes of polymorphisms for species with high intra-locus recombination rates. First, using a likelihood-based method, we demonstrate that taking into account low-frequency polymorphisms shared between species significantly improves the joint estimation of the divergence time and gene flow between species. Second, we introduce a local linear regression algorithm that considerably reduces the computational time and allows for the estimation of unequal rates of gene flow between species. We also investigate which summary statistics from the JSFS allow the greatest estimation accuracy for divergence time and migration rates for low (around 10) and high (around 100) numbers of loci. Focusing on cases with low numbers of loci and high intra-locus recombination rates, we show that our methods for the estimation of divergence time and migration rates are more precise than existing approaches.
Generalized Poisson Summation Formulas for Continuous Functions of Polynomial Growth
The Poisson summation formula (PSF) describes the equivalence between the sampling of an analog signal and the periodization of its frequency spectrum. In engineering textbooks, the PSF is usually stated formally without explicit conditions on the signal for the formula to hold. By contrast, in the mathematics literature, the PSF is commonly stated and proven in the pointwise sense for various types of signals. This assumption is, however, too restrictive for many signal-processing tasks that demand the sampling of possibly growing signals. In this paper, we present two generalized versions of the PSF for d-dimensional signals of polynomial growth. In the first generalization, we show that the PSF holds in the space of tempered distributions for every continuous and polynomially growing signal. In the second generalization, the PSF holds in a particular negative-order Sobolev space if we further require that d∕2 + ε derivatives of the signal are bounded by some polynomial in the sense