185 research outputs found

    Localization of adaptive variants in human genomes using averaged one-dependence estimation.

    Statistical methods for identifying adaptive mutations from population genetic data face several obstacles: assessing the significance of genomic outliers, integrating correlated measures of selection into one analytic framework, and distinguishing adaptive variants from hitchhiking neutral variants. Here, we introduce SWIF(r), a probabilistic method that detects selective sweeps by learning the distributions of multiple selection statistics under different evolutionary scenarios and calculating the posterior probability of a sweep at each genomic site. SWIF(r) is trained using simulations from a user-specified demographic model and explicitly models the joint distributions of selection statistics, thereby increasing its power to both identify regions undergoing sweeps and localize adaptive mutations. Using array and exome data from 45 ǂKhomani San hunter-gatherers of southern Africa, we identify an enrichment of adaptive signals in genes associated with metabolism and obesity. SWIF(r) provides a transparent probabilistic framework for localizing beneficial mutations that is extensible to a variety of evolutionary scenarios.
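    The title of this entry names averaged one-dependence estimation (AODE), and the abstract describes computing a posterior probability of a sweep from several correlated statistics. The block below is a minimal sketch of generic AODE on discretized feature values, not SWIF(r)'s own implementation. The function names (`train_aode`, `posterior`), the Laplace smoothing constant `alpha`, and the toy data are all assumptions made for illustration.

```python
from collections import Counter


def train_aode(rows, labels, alpha=1.0):
    """Collect the counts AODE needs from discretized training rows.

    rows   : list of equal-length tuples of discrete feature values
    labels : list of class labels (e.g. "sweep" / "neutral"), one per row
    alpha  : Laplace smoothing constant (an assumption of this sketch)
    """
    n_feat = len(rows[0])
    classes = sorted(set(labels))
    # Distinct values observed for each feature, used for smoothing.
    values = [sorted({r[j] for r in rows}) for j in range(n_feat)]
    parent = Counter()  # (class, i, x_i) -> count
    pair = Counter()    # (class, i, x_i, j, x_j) -> count
    for row, y in zip(rows, labels):
        for i, xi in enumerate(row):
            parent[(y, i, xi)] += 1
            for j, xj in enumerate(row):
                if j != i:
                    pair[(y, i, xi, j, xj)] += 1
    return {"n": len(rows), "classes": classes, "values": values,
            "parent": parent, "pair": pair, "alpha": alpha}


def posterior(model, row):
    """Return P(class | row), averaging over one-dependence estimators.

    Each feature i takes a turn as the "super-parent": the estimator is
    P(y, x_i) * prod_{j != i} P(x_j | y, x_i), and the estimators are
    averaged before normalizing over classes.
    """
    a = model["alpha"]
    classes = model["classes"]
    values = model["values"]
    scores = {}
    for y in classes:
        total = 0.0
        for i, xi in enumerate(row):
            n_parent = model["parent"][(y, i, xi)]
            # Smoothed joint probability P(y, x_i).
            est = (n_parent + a) / (model["n"] + a * len(classes) * len(values[i]))
            for j, xj in enumerate(row):
                if j != i:
                    n_pair = model["pair"][(y, i, xi, j, xj)]
                    # Smoothed conditional P(x_j | y, x_i).
                    est *= (n_pair + a) / (n_parent + a * len(values[j]))
            total += est
        scores[y] = total / len(row)
    z = sum(scores.values())
    return {y: s / z for y, s in scores.items()}


# Toy example: three binarized "selection statistics" per site.
rows = [(1, 1, 1), (1, 1, 0), (1, 0, 1), (0, 0, 0), (0, 0, 1), (0, 1, 0)]
labels = ["sweep"] * 3 + ["neutral"] * 3
model = train_aode(rows, labels)
print(posterior(model, (1, 1, 1)))
```

    Unlike naive Bayes, each estimator conditions every statistic on the class and on one other statistic, which is one simple way to account for pairwise correlation between measures of selection.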

    A Tale of Two Haplotypes: The EDA2R/AR Intergenic Region is the most Divergent Genomic Segment between Africans and East Asians in the Human Genome

    Single nucleotide polymorphisms (SNPs) with large allele frequency differences between human populations are relatively rare. The longest run of SNPs with an allele frequency difference of one between the Yoruba of Nigeria and the Han Chinese is found on the long arm of the X chromosome in the intergenic region separating the EDA2R and AR genes. It has been proposed that the unusual allele frequency distributions of these SNPs are the result of a selective sweep affecting African populations that occurred after the Out-of-Africa migration. To investigate the evolutionary history of the EDA2R/AR intergenic region, we characterized the haplotype structure of 52 of its highly differentiated SNPs. Using a publicly available dataset of 3,000 X chromosomes from 65 human populations, we found that nearly all human X chromosomes carry one of two modal haplotypes for these 52 SNPs. The predominance of two highly divergent haplotypes at this locus was confirmed using a subset of individuals sequenced to high coverage. The first of these haplotypes, the α haplotype, is at high frequencies in most of the African populations surveyed and likely arose prior to the separation of African populations into distinct genetic entities. The second, the β haplotype, is frequent or fixed in all non-African populations and likely arose in East Africa prior to the Out-of-Africa migration. We also observed a small group of rare haplotypes with no clear relationship to the α and β haplotypes. These haplotypes occur at relatively high frequencies in African hunter-gatherer populations, like the San and Mbuti Pygmies. Our analysis indicates that these haplotypes are part of a pool of diverse, ancestral haplotypes that have now been almost entirely replaced by the α and β haplotypes. We suggest that the rise of the α and β haplotypes was the result of the demographic forces that human populations experienced during the formation of modern African populations and the Out-of-Africa migration. However, we also present evidence that this region is the target of selection in the form of positive selection on the α and β haplotypes and of purifying selection against α/β recombinants.

    Determining ancestry proportions in complex admixture scenarios in South Africa using a novel proxy ancestry selection method

    Admixed populations can make an important contribution to the discovery of disease susceptibility genes if the parental populations exhibit substantial variation in susceptibility. Admixture mapping has been used successfully, but is not designed to cope with populations that have more than two or three ancestral populations. The inference of admixture proportions and local ancestry and the imputation of missing genotypes in admixed populations are crucial in both understanding variation in disease and identifying novel disease loci. These inferences make use of reference populations, and accuracy depends on the choice of ancestral populations. Using an insufficient or inaccurate ancestral panel can result in erroneously inferred ancestry and affect the detection power of GWAS and meta-analysis when using imputation. Current algorithms are inadequate for multi-way admixed populations. To address these challenges we developed PROXYANC, an approach to select the best proxy ancestral populations. From the simulation of a multi-way admixed population we demonstrate the capability and accuracy of PROXYANC and illustrate the importance of the choice of ancestry in both estimating admixture proportions and imputing missing genotypes.

    A Panel of Ancestry Informative Markers for the Complex Five-Way Admixed South African Coloured Population

    Admixture is a well-known confounder in genetic association studies. If genome-wide data is not available, as would be the case for candidate gene studies, ancestry informative markers (AIMs) are required in order to adjust for admixture. The predominant population group in the Western Cape, South Africa, is the admixed group known as the South African Coloured (SAC). A small set of AIMs that is optimized to distinguish between the five source populations of this population (African San, African non-San, European, South Asian, and East Asian) will enable researchers to cost-effectively reduce false-positive findings resulting from ignoring admixture in genetic association studies of the population. Using genome-wide data to find SNPs with large allele frequency differences between the source populations of the SAC, as quantified by Rosenberg et al.'s In-statistic, we developed a panel of AIMs by experimenting with various selection strategies. Subsets of different sizes were evaluated by measuring the correlation between ancestry proportions estimated by each AIM subset with ancestry proportions estimated using genome-wide data. We show that a panel of 96 AIMs can be used to assess ancestry proportions and to adjust for the confounding effect of the complex five-way admixture that occurred in the South African Coloured population. Department of HE and Training approved list.
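    The abstract above ranks SNPs by how strongly their allele frequencies differ between source populations, using a statistic from Rosenberg et al. A commonly used measure of this kind is informativeness for assignment, usually written I_n. Assuming that is the statistic intended, the sketch below computes it for one locus and ranks a few markers. The function names and the example frequencies are hypothetical and do not come from the study.

```python
import math


def informativeness_for_assignment(freqs):
    """Informativeness for assignment (I_n) at one locus.

    freqs : list with one entry per source population; each entry is a
            list of allele frequencies at the locus (summing to 1).

    I_n = sum_j [ -p_j * log(p_j) + (1/K) * sum_i p_ij * log(p_ij) ],
    where p_ij is the frequency of allele j in population i, p_j is the
    mean of p_ij over the K populations, and 0 * log(0) is taken as 0.
    """
    k = len(freqs)
    n_alleles = len(freqs[0])
    total = 0.0
    for j in range(n_alleles):
        p_bar = sum(pop[j] for pop in freqs) / k
        if p_bar > 0:
            total -= p_bar * math.log(p_bar)
        for pop in freqs:
            if pop[j] > 0:
                total += pop[j] * math.log(pop[j]) / k
    return total


def rank_markers(marker_freqs):
    """Return marker names sorted from most to least informative."""
    scored = {name: informativeness_for_assignment(f)
              for name, f in marker_freqs.items()}
    return sorted(scored, key=scored.get, reverse=True)


# Hypothetical biallelic markers in two source populations.
markers = {
    "snp_fixed_difference": [[1.0, 0.0], [0.0, 1.0]],
    "snp_moderate":         [[0.8, 0.2], [0.3, 0.7]],
    "snp_uninformative":    [[0.3, 0.7], [0.3, 0.7]],
}
print(rank_markers(markers))
```

    I_n is zero when every population has the same allele frequencies and reaches log K (natural log here) when alleles are private to single populations, so larger values mark better candidate AIMs.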

    Determining ancestry proportions in complex admixture scenarios in South Africa using a novel proxy ancestry selection method

    Publication of this article was funded by the Stellenbosch University Open Access Fund. The original publication is available at http://www.plosone.org/ Admixed populations can make an important contribution to the discovery of disease susceptibility genes if the parental populations exhibit substantial variation in susceptibility. Admixture mapping has been used successfully, but is not designed to cope with populations that have more than two or three ancestral populations. The inference of admixture proportions and local ancestry and the imputation of missing genotypes in admixed populations are crucial in both understanding variation in disease and identifying novel disease loci. These inferences make use of reference populations, and accuracy depends on the choice of ancestral populations. Using an insufficient or inaccurate ancestral panel can result in erroneously inferred ancestry and affect the detection power of GWAS and meta-analysis when using imputation. Current algorithms are inadequate for multi-way admixed populations. To address these challenges we developed PROXYANC, an approach to select the best proxy ancestral populations. From the simulation of a multi-way admixed population we demonstrate the capability and accuracy of PROXYANC and illustrate the importance of the choice of ancestry in both estimating admixture proportions and imputing missing genotypes. We applied this approach to a complex, uniquely admixed South African population. Using genome-wide SNP data from over 764 individuals, we accurately estimate the genetic contributions from the best ancestral populations: isiXhosa (33%±0.226), ǂKhomani San (31%±0.195), European (16%±0.118), Indian (13%±0.094), and Chinese (7%±0.0488). We also demonstrate that the ancestral allele frequency differences correlate with increased linkage disequilibrium in the South African population, which originates from admixture events rather than population bottlenecks. Stellenbosch University. MRC Centre for Molecular and Cellular Biology and the DST/NRF Centre of Excellence for Biomedical TB Research. Carnegie Corporation Grant and by the Department of Clinical Laboratory Sciences, University of Cape Town. Publishers' version.