32 research outputs found

    Intergenic DNA sequences from the human X chromosome reveal high rates of global gene flow

    Background: Despite intensive efforts devoted to collecting human polymorphism data, little is known about the role of gene flow in the ancestry of human populations. This is partly because most analyses have applied one of two simple models of population structure, the island model or the splitting model, which make unrealistic biological assumptions. Results: Here, we analyze 98 kb of DNA sequence from 20 independently evolving intergenic regions on the X chromosome in a sample of 90 humans from six globally diverse populations. We employ an isolation-with-migration (IM) model, which assumes that populations split and subsequently exchange migrants, to independently estimate effective population sizes and migration rates. While the maximum effective size of modern humans is estimated at ~10,000, individual populations vary substantially in size, with African populations tending to be larger (2,300–9,000) than non-African populations (300–3,300). We estimate mean rates of bidirectional gene flow at 4.8 × 10^-4 per generation. Bidirectional migration rates are ~5-fold higher among non-African populations (1.5 × 10^-3) than among African populations (2.7 × 10^-4). Interestingly, because effective sizes and migration rates are inversely related in African and non-African populations, population migration rates are similar within Africa and Eurasia (e.g., global mean Nm = 2.4). Conclusion: We conclude that gene flow has played an important role in structuring global human populations and that migration rates should be incorporated as critical parameters in models of human demography.

    Full-gene haplotypes refine CYP2D6 metabolizer phenotype inferences

    CYP2D6 is a critical pharmacogenetic target, and polymorphisms in the gene region are commonly used to infer enzyme activity score and predict the resulting metabolizer phenotype (poor, intermediate, extensive/normal, or ultrarapid), which can be useful in determining cause and/or manner of death in some autopsies. Current genotyping approaches are incapable of identifying novel and/or rare variants, so CYP2D6 star allele definitions are limited to polymorphisms known a priori. While useful for most predictions, recent studies using massively parallel sequencing data have identified additional polymorphisms in CYP2D6 that are predicted to alter enzyme function but are not considered in current star allele nomenclature. The 1000 Genomes Project data were used to produce full-gene haplotypes, describe their distribution in super-populations, and predict enzyme activity scores. Full-gene haplotypes generated lower activity scores than current approaches due to inclusion of additional damaging polymorphisms in the star allele. These findings are critical for clinical implementation of metabolizer phenotype prediction because a fraction of the population may be incorrectly considered normal metabolizers but actually may be poor or intermediate metabolizers.

    Sex-Biased Evolutionary Forces Shape Genomic Patterns of Human Diversity

    Comparisons of levels of variability on the autosomes and X chromosome can be used to test hypotheses about factors influencing patterns of genomic variation. While a tremendous amount of nucleotide sequence data from across the genome is now available for multiple human populations, there has been no systematic effort to examine relative levels of neutral polymorphism on the X chromosome versus autosomes. We analyzed ∼210 kb of DNA sequencing data representing 40 independent noncoding regions on the autosomes and X chromosome from each of 90 humans from six geographically diverse populations. We correct for differences in mutation rates between males and females by considering the ratio of within-human diversity to human-orangutan divergence. We find that relative levels of genetic variation are higher than expected on the X chromosome in all six human populations. We test a number of alternative hypotheses to explain the excess polymorphism on the X chromosome, including models of background selection, changes in population size, and sex-specific migration in a structured population. While each of these processes may have a small effect on the relative ratio of X-linked to autosomal diversity, our results point to a systematic difference between the sexes in the variance in reproductive success; namely, the widespread effects of polygyny in human populations. We conclude that factors leading to a lower male versus female effective population size must be considered as important demographic variables in efforts to construct models of human demographic history and for understanding the forces shaping patterns of human genomic variability.
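The normalization described in this abstract (within-human diversity divided by human-orangutan divergence, compared between the X chromosome and the autosomes) can be sketched as below. This is only an illustrative sketch: the function names and all input values are hypothetical and are not taken from the study. The 0.75 baseline is the standard neutral expectation when male and female effective population sizes are equal.

```python
# Illustrative sketch only: names and numbers are hypothetical, not the study's data.

# Neutral expectation for X/autosome diversity when male and female
# effective population sizes are equal (3 X copies per 4 autosomal copies).
EXPECTED_X_TO_A = 0.75


def normalized_diversity(diversity: float, divergence: float) -> float:
    """Scale within-species diversity by between-species divergence.

    Dividing by divergence controls for regional and sex-linked
    differences in mutation rate.
    """
    if divergence <= 0:
        raise ValueError("divergence must be positive")
    return diversity / divergence


def x_to_autosome_ratio(pi_x: float, div_x: float,
                        pi_a: float, div_a: float) -> float:
    """Ratio of divergence-normalized diversity on the X versus the autosomes."""
    return normalized_diversity(pi_x, div_x) / normalized_diversity(pi_a, div_a)


if __name__ == "__main__":
    # Hypothetical per-site values chosen purely for demonstration.
    ratio = x_to_autosome_ratio(pi_x=0.0008, div_x=0.025,
                                pi_a=0.0010, div_a=0.030)
    print(f"X/A ratio = {ratio:.2f}, neutral expectation = {EXPECTED_X_TO_A}")
```

A ratio above 0.75, as in this made-up input, is the kind of excess X-linked diversity the abstract attributes to a lower male than female effective population size.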

    Autosomal Resequence Data Reveal Late Stone Age Signals of Population Expansion in Sub-Saharan African Foraging and Farming Populations

    BACKGROUND: A major unanswered question in the evolution of Homo sapiens is when anatomically modern human populations began to expand: was demographic growth associated with the invention of particular technologies or behavioral innovations by hunter-gatherers in the Late Pleistocene, or with the acquisition of farming in the Neolithic? METHODOLOGY/PRINCIPAL FINDINGS: We investigate the timing of human population expansion by performing a multilocus analysis of ≥ 20 unlinked autosomal noncoding regions, each consisting of approximately 6 kilobases, resequenced in approximately 184 individuals from 7 human populations. We test the hypothesis that the autosomal polymorphism data fit a simple two-phase growth model, and when the hypothesis is not rejected, we fit parameters of this model to our data using approximate Bayesian computation. CONCLUSIONS/SIGNIFICANCE: The data from the three surveyed non-African populations (French Basque, Chinese Han, and Melanesians) are inconsistent with the simple growth model, presumably because they reflect more complex demographic histories. In contrast, data from all four sub-Saharan African populations fit the two-phase growth model, and a range of onset times and growth rates is inferred for each population. Interestingly, both hunter-gatherers (San and Biaka) and food-producers (Mandenka and Yorubans) best fit models with population growth beginning in the Late Pleistocene. Moreover, our hunter-gatherer populations show a tendency towards slightly older and stronger growth (approximately 41 thousand years ago, approximately 13-fold) than our food-producing populations (approximately 31 thousand years ago, approximately 7-fold). These dates are concurrent with the appearance of the Late Stone Age in Africa, supporting the hypothesis that population growth played a significant role in the evolution of Late Pleistocene human cultures.

    The time scale of recombination rate evolution in great apes

    We present three linkage-disequilibrium (LD)-based recombination maps generated using whole-genome sequence data from 10 Nigerian chimpanzees, 13 bonobos, and 15 western gorillas, collected as part of the Great Ape Genome Project (Prado-Martinez J, et al. 2013. Great ape genetic diversity and population history. Nature 499:471-475). We also identified species-specific recombination hotspots in each group using a modified LDhot framework, which greatly improves statistical power to detect hotspots at varying strengths. We show that fewer hotspots are shared among chimpanzee subspecies than within human populations, further narrowing the time scale of complete hotspot turnover. Further, using species-specific PRDM9 sequences to predict potential binding sites (PBS), we show higher predicted PRDM9 binding in recombination hotspots as compared to matched cold spot regions in multiple great ape species, including at least one chimpanzee subspecies. We found that correlations between broad-scale recombination rates decline more rapidly than nucleotide divergence between species. We also compared the skew of recombination rates at centromeres and telomeres between species and show a skew from chromosome means extending as far as 10-15 Mb from chromosome ends. Further, we examined broad-scale recombination rate changes near a translocation in gorillas and found minimal differences as compared to other great ape species, perhaps because the coordinates relative to the chromosome ends were unaffected. Finally, on the basis of multiple linear regression analysis, we found that various correlates of recombination rate persist throughout the African great apes, including repeats, diversity, and divergence. Our study is the first to analyze within- and between-species genome-wide recombination rate variation in several close relatives.

    Flanking Variation Influences Rates of Stutter in Simple Repeats

    It has been posited that the longest uninterrupted stretch (LUS) of tandem repeats, as defined by the number of exactly matching repeating motif units, is a better predictor of rates of stutter than the parental allele length (PAL). While there are cases where this hypothesis is likely correct, such as the 9.3 allele in the TH01 locus, there can be situations where it may not apply as well. For example, the PAL may capture flanking indel variations while remaining insensitive to polymorphisms in the repeat, and these haplotypic changes may impact the stutter rate. To address this, rates of stutter were contrasted against the LUS as well as the PAL on different flanking haplotypic backgrounds. This study shows that rates of stutter can vary substantially depending on the flanking haplotype, and while there are cases where the LUS is a better predictor of stutter than the PAL, examples to the contrary are apparent in commonly assayed forensic markers. Further, flanking variation that is 7 bp from the repeat region can impact rates of stutter. These findings suggest that non-proximal effects, such as DNA secondary structure, may be impacting the rates of stutter in common forensic short tandem repeat markers.
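The LUS definition given in this abstract (the number of exactly matching, back-to-back motif units) can be sketched as a short function. This is a minimal sketch, not the authors' implementation: the function name is hypothetical, and the example sequence is an assumption modeled on the commonly cited structure of the TH01 9.3 allele, [AATG]6 ATG [AATG]3.

```python
def longest_uninterrupted_stretch(sequence: str, motif: str) -> int:
    """Return the largest number of consecutive exact copies of `motif`.

    Hypothetical helper for illustration. Every start position is tried,
    so the repeat may begin in any frame of the sequence.
    """
    if not motif:
        raise ValueError("motif must be non-empty")
    best = 0
    step = len(motif)
    for start in range(len(sequence)):
        count = 0
        pos = start
        # Count back-to-back exact copies of the motif beginning at `start`.
        while sequence.startswith(motif, pos):
            count += 1
            pos += step
        best = max(best, count)
    return best


if __name__ == "__main__":
    # Assumed repeat structure: six AATG units, an incomplete ATG unit,
    # then three more AATG units.
    allele = "AATG" * 6 + "ATG" + "AATG" * 3
    print(longest_uninterrupted_stretch(allele, "AATG"))
```

Under that assumed structure the allele is designated 9.3 by length, but its LUS is 6 because the incomplete ATG unit interrupts the run. This is the kind of case where the LUS and the PAL diverge.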

    Graph Algorithms for Mixture Interpretation

    The scale of genetic methods is presently being expanded: forensic genetic assays previously were limited to tens of loci, but now technologies allow for a transition to forensic genomic approaches that assess thousands to millions of loci. However, there are subtle distinctions between genetic assays and their genomic counterparts (especially in the context of forensics). For instance, forensic genetic approaches tend to describe a locus as a haplotype, be it a microhaplotype or a short tandem repeat with its accompanying flanking information. In contrast, genomic assays tend to provide not haplotypes but sequence variants or differences, variants which in turn describe how the alleles apparently differ from the reference sequence. By the given construction, mitochondrial genetic assays can be thought of as genomic as they often describe genetic differences in a similar way. The mitochondrial genetics literature makes clear that sequence differences, unlike the haplotypes they encode, are not comparable to each other. Different alignment algorithms and different variant calling conventions may cause the same haplotype to be encoded in multiple ways. This ambiguity can affect evidence and reference profile comparisons as well as how "match" statistics are computed. In this study, a graph algorithm is described (and implemented in the MMDIT (Mitochondrial Mixture Database and Interpretation Tool) R package) that permits the assessment of forensic match statistics on mitochondrial DNA mixtures in a way that is invariant to both the variant calling conventions followed and the alignment parameters considered. The algorithm described, given a few modest constraints, can be used to compute the "random man not excluded" statistic or the likelihood ratio. The performance of the approach is assessed in in silico mitochondrial DNA mixtures.