4 research outputs found
Simultaneous Inference of Past Demography and Selection from the Ancestral Recombination Graph under the Beta Coalescent
The reproductive mechanism of a species is a key driver of genome evolution. The standard Wright-Fisher model for the reproduction of individuals in a population assumes that each individual produces a number of offspring negligible compared to the total population size. Yet many species of plants, invertebrates, prokaryotes or fish exhibit neutrally skewed offspring distributions or strong selection events in which a few individuals produce a number of offspring of up to the same magnitude as the population size. As a result, the genealogy of a sample is characterized by multiple individuals (more than two) coalescing simultaneously to the same common ancestor. The current methods developed to detect such multiple merger events do not account for complex demographic scenarios or recombination, and require large sample sizes. We tackle these limitations by developing two novel and different approaches to infer multiple merger events from sequence data or the ancestral recombination graph (ARG): a sequentially Markovian coalescent (SMβC) and a graph neural network (GNNcoal). We first demonstrate the accuracy of our methods in estimating the multiple merger parameter and past demographic history using simulated data under the β-coalescent model. Secondly, we show that our approaches can also recover the effect of positive selective sweeps along the genome. Finally, we are able to distinguish skewed offspring distribution from selection while simultaneously inferring the past variation of population size. Our findings stress the aptitude of neural networks to leverage information from the ARG for inference but also the urgent need for more accurate ARG inference approaches.
Population Genomic Evidence for a Repeated Introduction and Rapid Expansion of the Fungal Maize Pathogen Setosphaeria turcica in Europe
Modern agricultural practices, climate change, and globalization foster the rapid spread of plant pathogens, such as the maize fungal pathogen Setosphaeria turcica, which causes Northern corn leaf blight and expanded into Central Europe during the twentieth century. To investigate the rapid expansion of S. turcica, we sequenced 121 isolates from Europe and Kenya. Population genomic inference revealed a single genetically diverse cluster in Kenya and three clonal lineages with low diversity, as well as one cluster of multiple clonal sublineages in Europe. Phylogenetic dating suggests that all European lineages originated through sexual reproduction outside Europe and were subsequently introgressed multiple times. Unlike isolates from Kenya, European isolates did not show sexual recombination, despite the presence of both MAT1-1 and MAT1-2 mating types. For the clonal lineages, coalescent model selection supported a selectively neutral model with strong exponential population growth, rather than models with pervasive positive selection caused by host defense resistance or environmental adaptation. Within clonal lineages, phenotypic variation in virulence to different monogenic resistances, which defines the pathogen races, suggests that these races may originate from repeated mutations in virulence genes. Association testing based on k-mers did not identify genomic regions linked to pathogen races, but it did uncover strongly differentiated genomic regions between clonal lineages, which harbor genes with putative roles in pathogenicity. In conclusion, the expansion and population growth of S. turcica in Europe are mainly driven by an expansion of the maize cultivation area and not by rapid adaptation.
Interpreting the pervasive observation of U-shaped Site Frequency Spectra.
The standard neutral model of molecular evolution has traditionally been used as the null model for population genomics. We gathered a collection of 45 genome-wide site frequency spectra from a diverse set of species, most of which display an excess of low and high frequency variants compared to the expectation of the standard neutral model, resulting in U-shaped spectra. We show that multiple merger coalescent models often provide a better fit to these observations than the standard Kingman coalescent. Hence, in many circumstances these under-utilized models may serve as the more appropriate reference for genomic analyses. We further discuss the underlying evolutionary processes that may result in the widespread U-shape of frequency spectra.
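To illustrate what "the expectation of the standard neutral model" means for a site frequency spectrum, the following minimal sketch compares an observed unfolded spectrum with the classical Kingman expectation, in which the expected number of sites with i derived copies in a sample of n sequences is θ/i. The function names and the toy counts are hypothetical and are not taken from the paper; this is only one simple way to visualize a U-shape, not the authors' method.

```python
def expected_neutral_sfs(n, theta=1.0):
    """Expected unfolded SFS under the standard neutral (Kingman) model.

    For a sample of n sequences, the expected number of sites with
    i derived copies is theta / i, for i = 1 .. n-1.
    """
    return [theta / i for i in range(1, n)]


def observed_over_expected(observed, n):
    """Ratio of observed to expected proportions in each frequency class.

    Values above 1 mark an excess relative to the neutral expectation.
    A U-shaped spectrum shows ratios above 1 at both ends.
    """
    if len(observed) != n - 1:
        raise ValueError("observed must have n - 1 frequency classes")
    expected = expected_neutral_sfs(n)
    obs_total = sum(observed)
    exp_total = sum(expected)
    return [
        (o / obs_total) / (e / exp_total)
        for o, e in zip(observed, expected)
    ]


# Hypothetical toy counts for a sample of n = 6 (assumed for illustration only).
toy_counts = [50, 10, 5, 4, 20]
ratios = observed_over_expected(toy_counts, 6)
```

In this toy example the first and last ratios exceed 1 while the middle classes fall below 1, which is the qualitative signature described as U-shaped. Normalizing both spectra to proportions removes the dependence on θ, so only the shape is compared.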