122 research outputs found

    A Genealogical Interpretation of Principal Components Analysis

    Principal components analysis (PCA) is a statistical method commonly used in population genetics to identify structure in the distribution of genetic variation across geographical location and ethnic background. However, while the method is often used to inform about historical demographic processes, little is known about the relationship between fundamental demographic parameters and the projection of samples onto the primary axes. Here I show that for SNP data the projection of samples onto the principal components can be obtained directly from considering the average coalescent times between pairs of haploid genomes. The result provides a framework for interpreting PCA projections in terms of underlying processes, including migration, geographical isolation, and admixture. I also demonstrate a link between PCA and Wright's FST and show that SNP ascertainment has a largely simple and predictable effect on the projection of samples. Using examples from human genetics, I discuss the application of these results to empirical data and the implications for inference.

    The rate of beneficial mutations surfing on the wave of a range expansion

    Many theoretical and experimental studies suggest that range expansions can have severe consequences for the gene pool of the expanding population. Due to strongly enhanced genetic drift at the advancing frontier, neutral and weakly deleterious mutations can reach large frequencies in the newly colonized regions, as if they were surfing the front of the range expansion. These findings raise the question of how frequently beneficial mutations successfully surf at shifting range margins, thereby promoting adaptation towards a range-expansion phenotype. Here, we use individual-based simulations to study the surfing statistics of recurrent beneficial mutations on wave-like range expansions in linear habitats. We show that the rate of surfing depends on two strongly antagonistic factors: the probability of surfing given the spatial location of a novel mutation, and the rate of occurrence of mutations at that location. The surfing probability strongly increases towards the tip of the wave. Novel mutations are unlikely to surf unless they enjoy a spatial head start compared to the bulk of the population. The needed head start is shown to be proportional to the inverse fitness of the mutant type, and only weakly dependent on the carrying capacity. The second factor is the rate of mutation occurrence, which strongly decreases towards the tip of the wave. Thus, most successful mutations arise at an intermediate position in the front of the wave. We present an analytic theory for the tradeoff between these factors that allows us to predict how frequently substitutions by beneficial mutations occur at invasion fronts. We find that small amounts of genetic drift increase the fixation rate of beneficial mutations at the advancing front, and thus could be important for adaptation during species invasions.
    Comment: 21 pages, 7 figures; to appear in PLoS Computational Biology

    Complex genetic patterns in human arise from a simple range-expansion model over continental landmasses

    © 2018 Kanitz et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Although it is generally accepted that geography is a major factor shaping human genetic differentiation, it is still disputed how much of this differentiation is a result of a simple process of isolation by distance, and whether there are factors generating distinct clusters of genetic similarity. We address this question using a geographically explicit simulation framework coupled with an Approximate Bayesian Computation approach. Based on only six simple summary statistics, we estimated the most probable demographic parameters that shaped modern human evolution under an isolation-by-distance scenario, and found these were the following: an initial population in East Africa spread and grew from 4000 individuals to 5.7 million in about 132 000 years. Subsequent simulations with these estimates followed by cluster analyses produced results nearly identical to those obtained in real data. Thus, a simple diffusion model from East Africa explains a large portion of the genetic diversity patterns observed in modern humans. We argue that a model of isolation by distance along the continental landmasses might be the relevant null model to use when investigating selective effects in humans and probably many other species.

    World-wide distributions of lactase persistence alleles and the complex effects of recombination and selection

    The genetic trait of lactase persistence (LP) is associated with at least five independent functional single nucleotide variants in a regulatory region about 14 kb upstream of the lactase gene [-13910*T (rs4988235), -13907*G (rs41525747), -13915*G (rs41380347), -14009*G (rs869051967) and -14010*C (rs145946881)]. These alleles have been inferred to have spread recently, and present-day frequencies have been attributed to positive selection for the ability of adult humans to digest lactose without risk of symptoms of lactose intolerance. One of the inferential approaches used to estimate the level of past selection has been to determine the extent of haplotype homozygosity (EHH) of the sequence surrounding the SNP of interest. We report here new data on the frequencies of the known LP alleles in the 'Old World' and their haplotype lineages. We examine and confirm EHH of each of the LP alleles in relation to their distinct lineages, but also show marked EHH for one of the older haplotypes that does not carry any of the five LP alleles. The region of EHH of this (B) haplotype exactly coincides with a region of suppressed recombination that is detectable in families as well as in population data, and the results show how such suppression may have exaggerated haplotype-based measures of past selection.
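
    The EHH statistic named in the abstract above can be illustrated with a minimal sketch. It assumes the commonly used definition, in which EHH is the probability that two randomly chosen chromosomes carrying the core allele are identical at every site between the core SNP and a chosen distance. The function name `ehh`, its parameters, and the toy haplotypes are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import comb


def ehh(haplotypes, core_index, core_allele, extent):
    """Extended haplotype homozygosity for carriers of a core allele.

    haplotypes  : equal-length sequences of alleles (strings or lists)
    core_index  : position of the focal SNP
    core_allele : allele defining the carrier chromosomes
    extent      : number of sites to extend from the core
                  (positive = rightwards, negative = leftwards)
    """
    # Keep only chromosomes that carry the core allele.
    carriers = [h for h in haplotypes if h[core_index] == core_allele]
    n = len(carriers)
    if n < 2:
        raise ValueError("need at least two carrier chromosomes")

    # Inclusive window from the core SNP out to the requested extent.
    lo, hi = sorted((core_index, core_index + extent))
    if lo < 0 or hi >= len(haplotypes[0]):
        raise ValueError("window extends beyond the haplotypes")

    # Group carriers by their haplotype within the window.
    counts = Counter(tuple(h[lo:hi + 1]) for h in carriers)

    # Fraction of carrier pairs that are identical across the window.
    return sum(comb(c, 2) for c in counts.values()) / comb(n, 2)


# Toy data (illustrative only): the core SNP is at index 1, core allele "1".
toy = ["0100", "0101", "0101", "0110", "1000"]
for d in (0, 1, 2):
    print(d, ehh(toy, 1, "1", d))
```

    EHH starts at 1 at the core SNP and decays with distance as recombination and mutation break up the shared haplotype. A region of suppressed recombination slows that decay, which is the effect the abstract warns can mimic a signal of selection.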

    Emergence of Spatial Structure in Cell Groups and the Evolution of Cooperation

    On its own, a single cell cannot exert more than a microscopic influence on its immediate surroundings. However, via strength in numbers and the expression of cooperative phenotypes, such cells can enormously impact their environments. Simple cooperative phenotypes appear to abound in the microbial world, but explaining their evolution is challenging because they are often subject to exploitation by rapidly growing, non-cooperative cell lines. Population spatial structure may be critical for this problem because it influences the extent of interaction between cooperative and non-cooperative individuals. It is difficult for cooperative cells to succeed in competition if they become mixed with non-cooperative cells, which can exploit the public good without themselves paying a cost. However, if cooperative cells are segregated in space and preferentially interact with each other, they may prevail. Here we use a multi-agent computational model to study the origin of spatial structure within growing cell groups. Our simulations reveal that the spatial distribution of genetic lineages within these groups is linked to a small number of physical and biological parameters, including cell growth rate, nutrient availability, and nutrient diffusivity. Realistic changes in these parameters qualitatively alter the emergent structure of cell groups, and thereby determine whether cells with cooperative phenotypes can locally and globally outcompete exploitative cells. We argue that cooperative and exploitative cell lineages will spontaneously segregate in space under a wide range of conditions and, therefore, that cellular cooperation may evolve more readily than naively expected.

    The Role of Geography in Human Adaptation

    Various observations argue for a role of adaptation in recent human evolution, including results from genome-wide studies and analyses of selection signals at candidate genes. Here, we use genome-wide SNP data from the HapMap and CEPH-Human Genome Diversity Panel samples to study the geographic distributions of putatively selected alleles at a range of geographic scales. We find that the average allele frequency divergence is highly predictive of the most extreme FST values across the whole genome. On a broad scale, the geographic distribution of putatively selected alleles almost invariably conforms to population clusters identified using randomly chosen genetic markers. Given this structure, there are surprisingly few fixed or nearly fixed differences between human populations. Among the nearly fixed differences that do exist, nearly all are due to fixation events that occurred outside of Africa, and most appear in East Asia. These patterns suggest that selection is often weak enough that neutral processes, especially population history, migration, and drift, exert powerful influences over the fate and geographic distribution of selected alleles.
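
    As a rough illustration of the FST values discussed in the abstract above, the sketch below computes Wright's FST at a single biallelic locus as (HT - HS) / HT, where HT is the expected heterozygosity of the pooled population and HS is the mean within-population expected heterozygosity. It assumes equally weighted populations and known allele frequencies, with no sample-size correction, so it is not necessarily the estimator used in the paper. The function name `wright_fst` and the example frequencies are hypothetical.

```python
def wright_fst(freqs):
    """Wright's FST at one biallelic locus.

    freqs : allele frequency of the same allele in each population.
    Assumption: populations are weighted equally and frequencies are
    treated as known (no correction for finite sample size).
    """
    if len(freqs) < 2:
        raise ValueError("need at least two populations")
    if any(not 0.0 <= p <= 1.0 for p in freqs):
        raise ValueError("allele frequencies must lie in [0, 1]")

    k = len(freqs)
    p_bar = sum(freqs) / k                      # pooled allele frequency
    h_t = 2.0 * p_bar * (1.0 - p_bar)           # total expected heterozygosity
    if h_t == 0.0:
        raise ValueError("locus is monomorphic in the pooled population")
    h_s = sum(2.0 * p * (1.0 - p) for p in freqs) / k  # mean within-population value

    return (h_t - h_s) / h_t


# Illustrative frequencies only: identical, moderately diverged, fixed difference.
for example in ([0.5, 0.5], [0.2, 0.8], [0.0, 1.0]):
    print(example, wright_fst(example))
```

    Under these assumptions FST is 0 when populations share the same frequency and 1 for a fixed difference, which is why larger allele frequency divergence goes together with more extreme FST values.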

    Fine-Scale Genetic Structure Arises during Range Expansion of an Invasive Gecko

    Processes of range expansion are increasingly important in light of current concerns about invasive species and range shifts due to climate change. Theoretical studies suggest that genetic structuring may occur during range expansion. Ephemeral genetic structure can have important evolutionary implications, such as propagating genetic changes along the wave front of expansion, yet few studies have shown evidence of such structure. We tested the hypothesis that genetic structure arises during range expansion in Hemidactylus mabouia, a nocturnal African gecko recently introduced to Florida, USA. Twelve highly variable microsatellite loci were used to screen 418 individuals collected from 43 locations from four sampling sites across Florida, representing a gradient from earlier (∼1990s) to very recent colonization. We found earlier colonized locations had little detectable genetic structure and higher allelic richness than more recently colonized locations. Genetic structuring was pronounced among locations at spatial scales of tens to hundreds of meters near the leading edge of range expansion. Despite the rapid pace of range expansion in this introduced gecko, dispersal is limited among many suitable habitat patches. Fine-scale genetic structure is likely the result of founder effects during colonization of suitable habitat patches. It may be obscured over time and by scale-dependent modes of dispersal. Further studies are needed to determine if such genetic structure affects adaptation and trait evolution in range expansions and range shifts.

    The Development of Three Long Universal Nuclear Protein-Coding Locus Markers and Their Application to Osteichthyan Phylogenetics with Nested PCR

    BACKGROUND: Universal nuclear protein-coding locus (NPCL) markers that are applicable across diverse taxa and show good phylogenetic discrimination have broad applications in molecular phylogenetic studies. For example, RAG1, a representative NPCL marker, has been successfully used to make phylogenetic inferences within all major osteichthyan groups. However, such markers with broad working range and high phylogenetic performance are still scarce. It is necessary to develop more universal NPCL markers comparable to RAG1 for osteichthyan phylogenetics. METHODOLOGY/PRINCIPAL FINDINGS: We developed three long universal NPCL markers (>1.6 kb each) based on single-copy nuclear genes (KIAA1239, SACS and TTN) that possess large exons and exhibit the appropriate evolutionary rates. We then compared their phylogenetic utility with that of the reference marker RAG1 in 47 jawed vertebrate species. In comparison with RAG1, each of the three long universal markers yielded similar topologies and branch supports, all in congruence with the currently accepted osteichthyan phylogeny. To compare their phylogenetic performance visually, we also estimated the phylogenetic informativeness (PI) profile for each of the four long universal NPCL markers. The PI curves indicated that SACS performed best over the whole timescale, while RAG1, KIAA1239 and TTN exhibited similar phylogenetic performances. In addition, we compared the success of nested PCR and standard PCR when amplifying NPCL marker fragments. The amplification success rate and efficiency of the nested PCR were overwhelmingly higher than those of standard PCR. CONCLUSIONS/SIGNIFICANCE: Our work clearly demonstrates the superiority of nested PCR over the conventional PCR in phylogenetic studies and develops three long universal NPCL markers (KIAA1239, SACS and TTN) with the nested PCR strategy. The three markers exhibit high phylogenetic utility in osteichthyan phylogenetics and can be widely used as pilot genes for phylogenetic questions of osteichthyans at different taxonomic levels.