The Expected Sample Allele Frequencies from Populations of Changing Size via Orthogonal Polynomials
In this article, discrete and stochastic changes in (effective) population
size are incorporated into the spectral representation of a biallelic diffusion
process for drift and small mutation rates. A forward algorithm inspired by
Hidden-Markov-Model (HMM) literature is used to compute exact sample allele
frequency spectra for three demographic scenarios: single changes in
(effective) population size, boom-bust dynamics, and stochastic fluctuations in
(effective) population size. An approach for fully agnostic demographic
inference from these sample allele spectra is explored, and sufficient
statistics for step-wise changes in population size are found. Further,
convergence behaviours of the polymorphic sample spectra for population size
changes on different time scales are examined and discussed within the context
of inference of the effective population size. Joint visual assessment of the
sample spectra and the temporal coefficients of the spectral decomposition of
the forward diffusion process is found to be important in determining departure
from equilibrium. Stochastic changes in (effective) population size are shown
to shape sample spectra particularly strongly.
The expected sample allele frequencies from populations of changing size via orthogonal polynomials
Funding: CV’s research was supported by the Austrian Science Fund (FWF): DK W1225-B20; LCM’s by the School of Biology at the University of St. Andrews.
In this article, discrete and stochastic changes in (effective) population size are incorporated into the spectral representation of a biallelic diffusion process for drift and small mutation rates. A forward algorithm inspired by Hidden-Markov-Model (HMM) literature is used to compute exact sample allele frequency spectra for three demographic scenarios: single changes in (effective) population size, boom-bust dynamics, and stochastic fluctuations in (effective) population size. An approach for fully agnostic demographic inference from these sample allele spectra is explored, and sufficient statistics for stepwise changes in population size are found. Further, convergence behaviours of the polymorphic sample spectra for population size changes on different time scales are examined and discussed within the context of inference of the effective population size. Joint visual assessment of the sample spectra and the temporal coefficients of the spectral decomposition of the forward diffusion process is found to be important in determining departure from equilibrium. Stochastic changes in (effective) population size are shown to shape sample spectra particularly strongly.
Peer reviewed
Maximum likelihood (ML) estimators for scaled mutation parameters with a strand symmetric mutation model in equilibrium
With the multiallelic parent-independent mutation-drift model, the
equilibrium proportions of alleles are known to be Dirichlet distributed. A
special case is the biallelic model, in which the proportions are beta
distributed. A sample taken from these models is then Dirichlet-multinomially
or beta-binomially distributed, respectively. Maximum likelihood (ML)
estimators for the mutation parameters of the biallelic parent-independent
mutation model are available via an expectation maximization algorithm.
Assuming small scaled mutation rates, the distribution of a sample of given size
can be expanded in a first-order Taylor series. Then the ML estimators for
the two parameters in the biallelic model can be expressed using the site
frequency spectrum. In this article, we go beyond parent-independent mutation
and analyse a strand-symmetric mutation model with six scaled mutation
parameters that deviates from parent-independent mutation and, generally, from
detailed balance. We derive ML estimators for these six parameters assuming
mutation-drift equilibrium and small scaled mutation rates. This is the first
time that ML estimators are provided for a mutation model more complex than
parent-independent mutation.
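The biallelic case described in this abstract can be illustrated with a minimal sketch. It assumes a beta equilibrium distribution with shape parameters `alpha` and `beta` (hypothetical names standing in for the two scaled mutation parameters) and a sample of size `M` (also a name chosen here). The sketch computes the exact beta-binomial sampling probabilities and compares them with the standard first-order expansion for the polymorphic classes, alpha*beta/(alpha+beta) * (1/i + 1/(M-i)). It is not the article's estimator, only an illustration of why the polymorphic part of the site frequency spectrum carries the information at small mutation rates.

```python
from math import lgamma, exp

def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def beta_binomial_pmf(i, M, alpha, beta):
    """Exact probability of observing i copies of the focal allele in a
    sample of size M when the population proportion is Beta(alpha, beta).
    alpha and beta are assumed (hypothetical) shape parameters."""
    log_choose = lgamma(M + 1) - lgamma(i + 1) - lgamma(M - i + 1)
    return exp(log_choose
               + log_beta(i + alpha, M - i + beta)
               - log_beta(alpha, beta))

def first_order_polymorphic(i, M, alpha, beta):
    """First-order approximation for a polymorphic class 0 < i < M,
    valid when alpha and beta are both small."""
    return alpha * beta / (alpha + beta) * (1.0 / i + 1.0 / (M - i))

if __name__ == "__main__":
    M, alpha, beta = 10, 0.001, 0.001  # illustrative values only
    for i in range(1, M):
        exact = beta_binomial_pmf(i, M, alpha, beta)
        approx = first_order_polymorphic(i, M, alpha, beta)
        print(i, exact, approx)
```

With small `alpha` and `beta`, the two columns should agree closely for every polymorphic class, while almost all probability mass sits on the monomorphic classes `i = 0` and `i = M`.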
Maximum likelihood estimators for scaled mutation rates in an equilibrium mutation-drift model
The stationary sampling distribution of a neutral decoupled Moran or
Wright-Fisher diffusion with neutral mutations is known to first order for a
general rate matrix with small but otherwise unconstrained mutation rates.
Using this distribution as a starting point we derive results for maximum
likelihood estimates of scaled mutation rates from site frequency data under
three model assumptions: a twelve-parameter general rate matrix, a
nine-parameter reversible rate matrix, and a six-parameter strand-symmetric
rate matrix. The site frequency spectrum is assumed to be sampled from a fixed
size population in equilibrium, and to consist of allele frequency data at a
large number of unlinked sites evolving with a common mutation rate matrix
without selective bias. We correct an error in a previous treatment of the same
problem (Burden and Tang, 2017) affecting the estimators for the general and
strand-symmetric rate matrices. The method is applied to a biological dataset
consisting of a site frequency spectrum extracted from short autosomal introns
in a sample of Drosophila melanogaster individuals.
Comment: 39 pages, 4 figures, simulation to test accuracy of the model added
Nuclear and plastid haplotypes suggest rapid diploid and polyploid speciation in the N Hemisphere Achillea millefolium complex (Asteraceae)
Background: Species complexes or aggregates consist of a set of closely related species, often of different ploidy levels, whose relationships are difficult to reconstruct. The Northern Hemisphere Achillea millefolium aggregate exhibits complex morphological and genetic variation and a broad ecological amplitude. To understand its evolutionary history, we study sequence variation at two nuclear genes and three plastid loci across the natural distribution of this species complex and compare the patterns of variation to the species tree inferred earlier from AFLP data.
Results: Among the diploid species of A. millefolium agg., gene trees of the two nuclear loci, ncpGS and SBP, and the combined plastid fragments are incongruent with each other and with the AFLP tree, likely due to incomplete lineage sorting or secondary introgression. In spite of the large distributional range, no isolation by distance is found. Furthermore, there is evidence for intragenic recombination in the ncpGS gene. An analysis using a probabilistic model for population demographic history indicates large ancestral effective population sizes and short intervals between speciation events. Such a scenario explains the incongruence of the gene trees and species tree we observe. The relationships are particularly complex in the polyploid members of A. millefolium agg.
Conclusions: The present study indicates that the diploid members of A. millefolium agg. share a large part of their molecular genetic variation. The findings of little lineage sorting and lack of isolation by distance are likely due to short intervals between speciation events and close proximity of ancestral populations. While previous AFLP data provide species trees congruent with earlier morphological classification and phylogeographic considerations, the present sequence data are not suited to recover the relationships of diploid species in A. millefolium agg. For the polyploid taxa, many hybrid links and introgression from the diploids are suggested.
Allopolyploid speciation and ongoing backcrossing between diploid progenitor and tetraploid progeny lineages in the Achillea millefolium species complex: analyses of single-copy nuclear genes and genomic AFLP
Background: In the flowering plants, many polyploid species complexes display evolutionary radiation. This could be facilitated by gene flow between otherwise separate evolutionary lineages in contact zones. Achillea collina is a widespread tetraploid species within the Achillea millefolium polyploid complex (Asteraceae-Anthemideae). It is morphologically intermediate between the relic diploids, A. setacea-2x in xeric and A. asplenifolia-2x in humid habitats, and often grows in close contact with either of them. By analyzing DNA sequences of two single-copy nuclear genes and the genomic AFLP data, we assess the allopolyploid origin of A. collina-4x from ancestors corresponding to A. setacea-2x and A. asplenifolia-2x, and the ongoing backcross introgression between these diploid progenitor and tetraploid progeny lineages.
Results: In both the ncpGS and the PgiC gene tree, haplotype sequences of the diploid A. setacea-2x and A. asplenifolia-2x group into two clades corresponding to the two species, though lineage sorting seems incomplete for the PgiC gene. In contrast, A. collina-4x and its suspected backcross plants show homeologous gene copies: sequences from the same tetraploid individual plant are placed in both diploid clades. Semi-congruent splits of an AFLP Neighbor Net link not only A. collina-4x to both diploid species, but also some 4x individuals in a polymorphic population with mixed ploidy levels to A. setacea-2x on one hand and to A. collina-4x on the other, indicating allopolyploid speciation as well as hybridization across ploidal levels.
Conclusions: The findings of this study clearly demonstrate the hybrid origin of Achillea collina-4x and the ongoing backcrossing between the diploid progenitors and their tetraploid progeny lineages. Such repeated hybridizations are likely the cause of the great genetic and phenotypic variation and ecological differentiation of the polyploid taxa in Achillea millefolium agg.
- …