7 research outputs found

    Efficient ancestry and mutation simulation with msprime 1.0

    Stochastic simulation is a key tool in population genetics, since the models involved are often analytically intractable and simulation is usually the only way of obtaining ground-truth data to evaluate inferences. Because of this, a large number of specialized simulation programs have been developed, each filling a particular niche, but with largely overlapping functionality and a substantial duplication of effort. Here, we introduce msprime version 1.0, which efficiently implements ancestry and mutation simulations based on the succinct tree sequence data structure and the tskit library. We summarize msprime’s many features, and show that its performance is excellent, often many times faster and more memory efficient than specialized alternatives. These high-performance features have been thoroughly tested and validated, and built using a collaborative, open source development model, which reduces duplication of effort and promotes software quality via community engagement.

    Genetic Characterization of Indigenous Peoples from Oaxaca, Mexico, and Its Relation to Linguistic and Geographic Isolation

    We used 15 short tandem repeat (STR) loci (D8S1179, D21S11, D7S820, CSF1PO, D3S1358, TH01, D13S317, D16S539, D2S1338, D19S433, VWA, TPOX, D18S51, D5S818, and FGA) to genetically characterize 361 individuals from 11 indigenous populations (Amuzgo, Chinanteco, Chontal, Huave, Mazateco, Mixe, Mixteco, Triqui, Zapoteco del Istmo, Zapoteco del Valle, and Zoque) from Oaxaca, Mexico. We also used previously published data from other Mexican peoples (Maya, Chol, Tepehua, Otomí, and Mestizos from northern and central Mexico) to delineate genetic relations, for a total of 541 individuals. Average heterozygosity (H) was lower in most populations from Oaxaca (range 0.687 in Zoque to 0.756 in Chontal) than values observed in Mestizo populations from Mexico (0.758 and 0.793 in central and northern Mestizo, respectively) but higher than values observed in other Amerindian populations from South America; the same relation was true for the number of alleles (na). We tested (using the software Structure) whether major geographic or linguistic barriers to gene flow existed among the populations of Oaxaca and found that the populations appeared to constitute one or two genetic groups, suggesting that neither geographic location nor linguistics had an effect on the genetic structure of these culturally and linguistically highly diverse indigenous peoples. Moreover, we found a low but statistically significant between-population differentiation. In addition, the genetic structure of Oaxacan populations did not fit an isolation-by-distance model. Finally, using AMOVA and a Bayesian clustering approach, we did not detect significant geographic or linguistic barriers to gene flow within Oaxaca. These results suggest that the indigenous communities of Oaxaca, although culturally isolated, can be genetically defined as a large, nearly panmictic population in which migration could be a more important population mechanism than genetic drift. 
Finally, compared with outgroups in Mexico (both indigenous peoples and Mestizos), three groups were apparent. Among them, only the Otomí population from Hidalgo has a different culture and language.

    Imputation performance in Latin American populations: improving rare variants representation with the inclusion of native American genomes

    Current Genome-Wide Association Studies (GWAS) rely on genotype imputation to increase statistical power, improve fine-mapping of association signals, and facilitate meta-analyses. Due to the complex demographic history of Latin America and the lack of balanced representation of Native American genomes in current imputation panels, the discovery of locally relevant disease variants is likely to be missed, limiting the scope and impact of biomedical research in these populations. Therefore, the necessity of better diversity representation in genomic databases is a scientific imperative. Here, we expand the 1,000 Genomes reference panel (1KGP) with 134 Native American genomes (1KGP + NAT) to assess imputation performance in Latin American individuals of mixed ancestry. Our panel increased the number of SNPs above the GWAS quality threshold, thus improving statistical power for association studies in the region. It also increased imputation accuracy, particularly in low-frequency variants segregating in Native American ancestry tracts. The improvement is subtle but consistent across countries and proportional to the number of genomes added from local source populations. To project the potential improvement with a higher number of reference genomes, we performed simulations and found that at least 3,000 Native American genomes are needed to equal the imputation performance of variants in European ancestry tracts. 
This reflects the concerning imbalance of diversity in current references and highlights the contribution of our work to reducing it while complementing efforts to improve global equity in genomic research.
This work was supported by “The Mexican Biobank Project: Building Capacity for Big Data Science in Medical Genomics in Admixed Populations”, a binational initiative between Mexico and the UK co-funded by CONACYT (Grant number FONCICYT/50/2016), and The Newton Fund through The Medical Research Council (Grant number MR/N028937/1) awarded to AME and AVSH. It was also supported by the International Center for Genetic Engineering and Biotechnology (ICGEB, Italy) grant number CRP/MEX20-01. MS was partially supported by the Chicago Fellows program of the University of Chicago. DODV is supported by the UC MEXUS CONACYT collaborative program (Grant number CN-19-29), and the UNAM PAPIIT funding program (Grant number IA200620).

    A second update on mapping the human genetic architecture of COVID-19
