14 research outputs found

    Sensitive Detection of Chromosomal Segments of Distinct Ancestry in Admixed Populations

    Identifying the ancestry of chromosomal segments of distinct ancestry has a wide range of applications from disease mapping to learning about history. Most methods require the use of unlinked markers; but, using all markers from genome-wide scanning arrays, it should in principle be possible to infer the ancestry of even very small segments with exquisite accuracy. We describe a method, HAPMIX, which employs an explicit population genetic model to perform such local ancestry inference based on fine-scale variation data. We show that HAPMIX outperforms other methods, and we explore its utility for inferring ancestry, learning about ancestral populations, and inferring dates of admixture. We validate the method empirically by applying it to populations that have experienced recent and ancient admixture: 935 African Americans from the United States and 29 Mozabites from North Africa. HAPMIX will be of particular utility for mapping disease genes in recently admixed populations, as its accurate estimates of local ancestry permit admixture and case-control association signals to be combined, enabling more powerful tests of association than with either signal alone
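As a rough illustration of the local ancestry inference described in this abstract, the sketch below labels each marker of a single haplotype with one of two ancestries using a small hidden Markov model and the Viterbi algorithm. It is not HAPMIX: HAPMIX models haplotype structure with an explicit population genetic model, whereas this stand-in assumes independent per-ancestry allele frequencies. The function name, the `switch_prob` parameter, and the toy data are hypothetical.

```python
import math


def viterbi_local_ancestry(haplotype, freqs, switch_prob=0.01, prior=(0.5, 0.5)):
    """Return the most likely ancestry label (0 or 1) at each marker.

    haplotype: list of 0/1 alleles along one chromosome copy.
    freqs: list of (p0, p1) pairs, the frequency of allele 1 in ancestry 0
        and ancestry 1; values must lie strictly between 0 and 1.
    switch_prob: assumed probability of an ancestry switch between
        adjacent markers (hypothetical parameter).
    """
    def emit(state, allele, marker):
        p = freqs[marker][state]
        return math.log(p if allele == 1 else 1.0 - p)

    stay, switch = math.log(1.0 - switch_prob), math.log(switch_prob)
    score = [math.log(prior[s]) + emit(s, haplotype[0], 0) for s in (0, 1)]
    back = []  # back[m - 1][s] is the best predecessor of state s at marker m
    for m in range(1, len(haplotype)):
        new_score, pointers = [], []
        for s in (0, 1):
            cands = [score[t] + (stay if t == s else switch) for t in (0, 1)]
            best = 0 if cands[0] >= cands[1] else 1
            pointers.append(best)
            new_score.append(cands[best] + emit(s, haplotype[m], m))
        score = new_score
        back.append(pointers)

    # Trace the best path backwards from the best final state.
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Toy data: allele 1 is common in ancestry 0 and rare in ancestry 1.
toy_freqs = [(0.9, 0.1)] * 10
toy_haplotype = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
toy_path = viterbi_local_ancestry(toy_haplotype, toy_freqs)
```

On the toy data, the evidence for a switch in the second half outweighs the switch penalty, so the path changes ancestry once at the midpoint.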

    Population Genetics in the Genomic Era


    Identification of breed contributions in crossbred dogs

    There has recently been strong public interest in interrogating canine ancestry with direct-to-consumer (DTC) genetic ancestry inference tools. Our goal is to improve the accuracy of these computational tools by developing better algorithms for identifying the breed composition of mixed-breed dogs. Genetic test data, based on SNP markers, were provided by Mars Veterinary. We approach this ancestry inference problem from two main directions. The first approach is optimized for datasets composed of a small number of ancestry informative markers (AIMs). We first compute haplotype frequencies from purebred ancestral panels; these characterize genetic variation within breeds and are used to predict breed compositions. Because the number of possible breed combinations in admixed dogs is large, we approximately sample this search space with a Metropolis-Hastings algorithm. As the proposal density, we either sample new breeds for the lineage uniformly or bias the Markov chain so that breeds in the lineage are more likely to be replaced by similar breeds. The second direction is dominated by HMM approaches, which view genotypes as realizations of latent variable sequences corresponding to breeds. In this approach, an admixed canine sample is viewed as a linear combination of segments from dogs in the ancestral panel. Results were evaluated using two performance measures. First, we looked at a generalization of binary ROC curves to multi-class classification problems. Second, to judge breed contribution approximations more accurately, we computed the difference between expected and predicted breed contributions. Experimental results on a synthetic admixed test dataset using AIMs showed that the MCMC approach successfully predicts breed proportions for a variety of lineage complexities. Furthermore, due to exploration in the MCMC algorithm, true breed contributions are underestimated. The HMM approach performed less well, presumably because it uses less of the information in the dataset.
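The Metropolis-Hastings search over breed lineages mentioned in this abstract can be sketched as follows. This is a simplified illustration, not the authors' implementation. It assumes per-breed allele frequencies at independent biallelic markers instead of the haplotype frequencies the abstract describes, a lineage of four equally weighted ancestor slots, a uniform prior over lineages, and the uniform proposal only. All function and parameter names are hypothetical.

```python
import math
import random
from collections import Counter


def log_likelihood(lineage, genotypes, breed_freqs):
    """Log-probability of 0/1/2 genotypes given the breeds in the lineage."""
    total = 0.0
    for m, g in enumerate(genotypes):
        # Expected allele frequency: average over the ancestor slots.
        p = sum(breed_freqs[b][m] for b in lineage) / len(lineage)
        total += math.log(math.comb(2, g) * p**g * (1.0 - p) ** (2 - g))
    return total


def sample_breed_proportions(genotypes, breed_freqs, slots=4,
                             iters=2000, burn_in=500, seed=0):
    """Estimate breed proportions as posterior means over sampled lineages."""
    rng = random.Random(seed)
    breeds = sorted(breed_freqs)
    lineage = [rng.choice(breeds) for _ in range(slots)]
    current = log_likelihood(lineage, genotypes, breed_freqs)
    counts = Counter()
    for step in range(iters):
        # Symmetric proposal: replace one slot with a uniformly drawn breed.
        proposal = list(lineage)
        proposal[rng.randrange(slots)] = rng.choice(breeds)
        cand = log_likelihood(proposal, genotypes, breed_freqs)
        # Accept with probability min(1, likelihood ratio).
        if cand >= current or rng.random() < math.exp(cand - current):
            lineage, current = proposal, cand
        if step >= burn_in:
            counts.update(lineage)
    kept = (iters - burn_in) * slots
    return {b: counts[b] / kept for b in breeds}


# Toy panel: frequencies must lie strictly between 0 and 1.
toy_panel = {"A": [0.95] * 200, "B": [0.05] * 200, "C": [0.5] * 200}
toy_props = sample_breed_proportions([2] * 200, toy_panel)
```

The biased proposal from the abstract, which favors replacing a breed with a similar one, would make the proposal asymmetric, so the acceptance ratio would then also need the proposal-density correction term.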

    European admixture on the Micronesian island of Kosrae: lessons from complete genetic information

    The architecture of natural variation present in a contemporary population is a result of multiple population genetic forces, including population bottleneck and expansion, selection, drift, and admixture. We seek to untangle the contribution of admixture to genetic diversity on the Micronesian island of Kosrae. Toward this goal, we used a complete genetic approach by combining a dense genome-wide map of 100 000 single-nucleotide polymorphisms (SNPs) with data from uniparental markers from the mitochondrial genome and the nonrecombining portion of the Y chromosome. These markers were typed in ∼3200 individuals from Kosrae, representing 80% of the adult population of the island. We developed novel software that uses SNP data to delineate ancestry for individual segments of the genome. Through this analysis, we determined that 39% of Kosraens have some European ancestry. However, the vast majority of admixed individuals (77%) have European alleles spanning less than 10% of their genomes. Data from uniparental markers show most of this admixture to be male, introduced in the late nineteenth century. Furthermore, pedigree analysis shows that the majority of European admixture on Kosrae is because of the contribution of one individual. This approach shows the benefit of combining information from autosomal and uniparental polymorphisms and provides new methodology for determining ancestry in a population

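    Induse jõe oru inimeste, parside, India juutide ja Tharu hõimu geneetilise põlvnemise piiritlemine [Delineating the genetic ancestry of the people of the Indus Valley, Parsis, Indian Jews, and the Tharu tribe]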

    The electronic version of the thesis does not include the publications. This is the fifth in a series of PhD theses prepared at the University of Tartu on the genetic history of the peoples of South Asia. Inhabited by modern humans considerably before the Last Glacial Maximum, the region is now home to about 1.8 billion people, almost a quarter of the global population. Understanding present-day human genetic variation, in particular outside sub-Saharan Africa, is therefore not possible without deeper knowledge of the genetics of South Asian populations. This thesis is based on four published papers. The first focuses on selected populations inhabiting the northeastern Indus Valley, with particular attention to the ancient Indus Valley civilization and the Vedic period that followed it. The second and third papers address historically somewhat better known migrations that brought the religiously distinct Parsi and Jewish peoples to India. The fourth paper analyses the genetic variation of the populous Tharu tribe, which lives predominantly in Nepal but also in the northern provinces of India. Perhaps the most interesting finding of the first paper is that the Ror population, presumably already identified in Vedic texts, exhibits significant genetic affinity with northern Steppe and West European peoples, testifying to prehistoric north-to-south migration(s). The arrival of the Parsis in South Asia is associated with the Islamization of Iran in the 7th century. Comparing Parsi genomes in their historical context, we observed extensive admixture with South Asians that was asymmetric between paternal and maternal lineages. Nearly the same can be said of the different Indian communities that preserved Judaic traditions: their genomes show affinities to peoples living in the Near and Middle East. As for the genetically highly diverse Tharu tribe, a clearly distinct East Asian contribution can be seen, admixed with South Asian genetic heritage. It seems justified to regard the Tharu as a cultural rather than a demic phenomenon. https://www.ester.ee/record=b542949

    Genome-Wide Local Ancestry Approach Identifies Genes and Variants Associated with Chemotherapeutic Susceptibility in African Americans

    Chemotherapeutic agents are used in the treatment of many cancers, yet variable resistance and toxicities among individuals limit successful outcomes. Several studies have indicated outcome differences associated with ancestry among patients with various cancer types. Using both traditional SNP-based and newly developed gene-based genome-wide approaches, we investigated the genetics of chemotherapeutic susceptibility in lymphoblastoid cell lines derived from 83 African Americans, a population for which there is a disparity in the number of genome-wide studies performed. To account for population structure in this admixed population, we incorporated local ancestry information into our association model. We tested over 2 million SNPs and identified 325, 176, 240, and 190 SNPs that were suggestively associated with cytarabine-, 5′-deoxyfluorouridine (5′-DFUR)-, carboplatin-, and cisplatin-induced cytotoxicity, respectively (p ≤ 10⁻⁴). Importantly, some of these variants are found only in populations of African descent. We also show that cisplatin-susceptibility SNPs are enriched for carboplatin-susceptibility SNPs. Using a gene-based genome-wide association approach, we identified 26, 11, 20, and 41 suggestive candidate genes for association with cytarabine-, 5′-DFUR-, carboplatin-, and cisplatin-induced cytotoxicity, respectively (p ≤ 10⁻³). Fourteen of these genes showed evidence of association with their respective chemotherapeutic phenotypes in the Yoruba from Ibadan, Nigeria (p < 0.05), including TP53I11, COPS5 and GAS8, which are known to be involved in tumorigenesis. Although our results require further study, we have identified variants and genes associated with chemotherapeutic susceptibility in African Americans by using an approach that incorporates local ancestry information.

    Mapping genes underlying ethnic differences in tuberculosis risk by linkage disequilibrium in the South African coloured population of the Western Cape

    Includes bibliographical references. The South African Coloured (SAC) population of the Western Cape is the result of unions between Europeans, Africans (Bantu and Khoisan), and various other populations (of Malaysian or Indonesian descent). The worldwide burden of tuberculosis remains an enormous problem and is particularly severe in this population. In general, admixed populations that have arisen in historical times can make an important contribution to the discovery of disease susceptibility genes if the parental populations exhibit substantial variation in susceptibility. Despite numerous successful genome-wide association studies, detecting variants that confer low disease risk still poses a challenge. Furthermore, admixture association studies in multi-way admixed populations pose persistent challenges, including the choice of an accurate ancestral panel for inferring ancestry and for imputing missing genotypes to identify possible genetic variants causing susceptibility to disease. This thesis addresses some of these challenges. We first developed PROXYANC, an approach to select the best proxy ancestral populations for admixed populations. By simulating a multi-way admixed population, we demonstrated the ability and accuracy of PROXYANC in selecting the best proxy ancestry and illustrated the importance of the choice of ancestries both in estimating admixture proportions and in imputing missing genotypes. We applied this approach to the South African Coloured population to refine both the choice of ancestral populations and their genetic contributions. We also demonstrated that ancestral allele frequency differences correlate with increased linkage disequilibrium (LD) in the SAC, and that the increased LD originates from admixture events rather than population bottlenecks. Second, we conducted a study to determine whether ancestry-specific genetic contributions affect tuberculosis risk. We additionally conducted imputation-based genome-wide association studies and a meta-analysis incorporating previous genome-wide association studies of tuberculosis.

    Imputation-based Genetic Association Analysis of Complex Traits in Admixed Populations

    Genetic association studies in admixed populations have drawn increasing attention from the genetics community, as performing association analysis in diverse populations allows us to gain a deeper understanding of the genetic architecture of human diseases and traits. However, population stratification due to admixture poses special challenges. To address these challenges, I conducted the following studies, aimed at enhancing genotype imputation quality and at treating local ancestry properly in the association analysis. First, I provided a new resource of marker imputability information with commonly used reference panels to guide the choice of reference and genotyping platforms. Specifically, I systematically evaluated marker imputation quality using sequencing-based reference panels from the 1000 Genomes Project and released the information through a user-friendly and publicly available data portal. This is the first resource providing variant imputability information specific to each continental group and to each genotyping platform. Second, I established a paradigm for better imputation in African Americans using study-specific sequencing-based reference panels. I built an internal reference panel consisting of variants derived from the NHLBI Exome Sequencing Project for African American subjects, which significantly increased the effective sample size compared with that from the 1000 Genomes Project. No loss of imputation quality was observed using a panel built from phenotypic extremes. In addition, I recommended using haplotypes from the Exome Sequencing Project alone, or a concatenation of the two panels, over quality score-based post-imputation selection or IMPUTE2's two-panel combination. Finally, I proposed a robust and powerful two-step testing procedure for association analysis in admixed populations. Through extensive numerical simulations, I demonstrated that this testing procedure robustly captures and pinpoints associations due to allele effect, ancestry effect, or effect heterogeneity between the two ancestral populations. In particular, the procedure is more powerful in identifying the presence of effect heterogeneity than the traditional cross-product interaction model. I further illustrated its usefulness by applying the two-step testing procedure to test for association between genetic variants and a hemoglobin trait in African American participants from CARe. Taken together, these studies guide genotype imputation practice and substantially improve the power of imputation-based genetic association studies in admixed populations, leading to more accurate discovery of disease-associated variants and ultimately better therapeutic strategies in admixed populations. Doctor of Philosophy