72 research outputs found

    NgsRelate: a software tool for estimating pairwise relatedness from next-generation sequencing data

    Motivation: Pairwise relatedness estimation is important in many contexts such as disease mapping and population genetics. However, all existing estimation methods are based on called genotypes, which is not ideal for next-generation sequencing (NGS) data of low depth from which genotypes cannot be called with high certainty. Results: We present a software tool, NgsRelate, for estimating pairwise relatedness from NGS data. It provides maximum likelihood estimates that are based on genotype likelihoods instead of genotypes and thereby takes the inherent uncertainty of the genotypes into account. Using both simulated and real data, we show that NgsRelate provides markedly better estimates for low-depth NGS data than two state-of-the-art genotype-based methods. Availability: NgsRelate is implemented in C++ and is available under the GNU license at www.popgen.dk/software. Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online

    ANGSD: analysis of next generation sequencing data

    BACKGROUND: High-throughput DNA sequencing technologies are generating vast amounts of data. Fast, flexible and memory-efficient implementations are needed in order to facilitate analyses of thousands of samples simultaneously. RESULTS: We present a multithreaded program suite called ANGSD. This program can calculate various summary statistics, and perform association mapping and population genetic analyses utilizing the full information in next generation sequencing data by working directly on the raw sequencing data or by using genotype likelihoods. CONCLUSIONS: The open source C/C++ program ANGSD is available at http://www.popgen.dk/angsd. The program is tested and validated on GNU/Linux systems. The program supports multiple input formats including BAM and imputed beagle genotype probability files. The program allows the user to choose between combinations of existing methods and can perform analyses that are not implemented elsewhere. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-014-0356-4) contains supplementary material, which is available to authorized users

    Identifying a living great-grandson of the Lakota Sioux leader Tatanka Iyotake (Sitting Bull).

    A great-grandson of the legendary Lakota Sioux leader Sitting Bull (Tatanka Iyotake), Ernie LaPointe, wished to have their familial relationship confirmed via genetic analysis, in part to help settle concerns over Sitting Bull’s final resting place. To address Ernie LaPointe’s claim of family relationship, we obtained minor amounts of genomic data from a small piece of hair from Sitting Bull’s scalp lock, which was repatriated in 2007. We then compared these data to genome-wide data from LaPointe and other Lakota Sioux using a new probabilistic approach and concluded that Ernie LaPointe is Sitting Bull’s great-grandson. To our knowledge, this is the first published example of a familial relationship between a contemporary and a historical individual that has been confirmed using such limited amounts of ancient DNA across such distant relatives. Hence, this study opens the possibility for broadening genealogical research, even when only minor amounts of ancient genetic material are accessible

    Uncovering the genetic history of the present-day Greenlandic population

    Because of past limitations in samples and genotyping technologies, important questions about the history of the present-day Greenlandic population remain unanswered. In an effort to answer these questions and in general investigate the genetic history of the Greenlandic population, we analyzed ∼200,000 SNPs from more than 10% of the adult Greenlandic population (n = 4,674). We found that recent gene flow from Europe has had a substantial impact on the population: more than 80% of the Greenlanders have some European ancestry (on average ∼25% of their genome). However, we also found that the amount of recent European gene flow varies across Greenland and is far smaller in the more historically isolated areas in the north and east and in the small villages in the south. Furthermore, we found that there is substantial population structure in the Inuit genetic component of the Greenlanders and that individuals from the east, west, and north can be distinguished from each other. Moreover, the genetic differences in the Inuit ancestry are consistent with a single colonization wave of the island from north to west to south to east. Although it has been speculated that there has been historical admixture between the Norse Vikings who lived in Greenland for a limited period ∼600–1,000 years ago and the Inuit, we found no evidence supporting this hypothesis. Similarly, we found no evidence supporting a previously hypothesized admixture event between the Inuit in East Greenland and the Dorset people, who lived in Greenland before the Inuit

    The ancestry and affiliations of Kennewick Man

    Kennewick Man, referred to as the Ancient One by Native Americans, is a male human skeleton discovered in Washington state (USA) in 1996 and initially radiocarbon dated to 8,340-9,200 calibrated years before present (BP). His population affinities have been the subject of scientific debate and legal controversy. Based on an initial study of cranial morphology it was asserted that Kennewick Man was neither Native American nor closely related to the claimant Plateau tribes of the Pacific Northwest, who claimed ancestral relationship and requested repatriation under the Native American Graves Protection and Repatriation Act (NAGPRA). The morphological analysis was important to judicial decisions that Kennewick Man was not Native American and that therefore NAGPRA did not apply. Instead of repatriation, additional studies of the remains were permitted. Subsequent craniometric analysis affirmed Kennewick Man to be more closely related to circumpacific groups such as the Ainu and Polynesians than he is to modern Native Americans. In order to resolve Kennewick Man's ancestry and affiliations, we have sequenced his genome to ∼1× coverage and compared it to worldwide genomic data including for the Ainu and Polynesians. We find that Kennewick Man is closer to modern Native Americans than to any other population worldwide. Among the Native American groups for whom genome-wide data are available for comparison, several seem to be descended from a population closely related to that of Kennewick Man, including the Confederated Tribes of the Colville Reservation (Colville), one of the five tribes claiming Kennewick Man. We revisit the cranial analyses and find that, as opposed to genome-wide comparisons, it is not possible on that basis to affiliate Kennewick Man to specific contemporary groups. We therefore conclude based on genetic comparisons that Kennewick Man shows continuity with Native North Americans over at least the last eight millennia

    Calculation of Tajima’s D and other neutrality test statistics from low depth next-generation sequencing data

    BACKGROUND: A number of different statistics are used for detecting natural selection using DNA sequencing data, including statistics that are summaries of the frequency spectrum, such as Tajima’s D. These statistics are now often being applied in the analysis of Next Generation Sequencing (NGS) data. However, estimates of frequency spectra from NGS data are strongly affected by low sequencing coverage; the inherent technology-dependent variation in sequencing depth causes systematic differences in the value of the statistic among genomic regions. RESULTS: We have developed an approach that accommodates the uncertainty of the data when calculating site-frequency-based neutrality test statistics. A salient feature of this approach is that it implicitly solves the problems of varying sequencing depth and missing data, and it avoids the need to infer variable sites for the analysis, thereby avoiding ascertainment problems introduced by a SNP discovery process. CONCLUSION: Using an empirical Bayes approach for fast computations, we show that this method produces results for low-coverage NGS data comparable to those achieved when the genotypes are known without uncertainty. We also validate the method in an analysis of data from the 1000 Genomes Project. The method is implemented in a fast framework which enables researchers to perform these neutrality tests on a genome-wide scale
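
    For readers unfamiliar with the statistic, the following is a minimal sketch of the classical Tajima's D computed from an unfolded site frequency spectrum. It is not the empirical Bayes method described in the abstract: it assumes the spectrum is already known (or has been estimated elsewhere), and the function name `tajimas_d` and its input layout are hypothetical choices made for illustration.

```python
from math import sqrt

def tajimas_d(sfs):
    """Classical Tajima's D from an unfolded site frequency spectrum.

    Assumption (illustrative layout): sfs[i - 1] is the number of sites at
    which the derived allele is carried by i of the n sampled chromosomes,
    for i = 1 .. n-1. Entries may be non-integer (e.g. expected counts).
    Returns None when the statistic is undefined.
    """
    n = len(sfs) + 1          # number of sampled chromosomes
    S = sum(sfs)              # number of segregating sites
    if n < 4 or S <= 0:       # the variance term is zero for n < 4
        return None

    # Average number of pairwise differences (pi) from the spectrum.
    pairs = n * (n - 1) / 2
    pi = sum(x * i * (n - i) for i, x in enumerate(sfs, start=1)) / pairs

    # Constants from Tajima (1989).
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    # Difference between the two estimators of theta, scaled by its
    # estimated standard deviation.
    return (pi - S / a1) / sqrt(e1 * S + e2 * S * (S - 1))
```

    With a spectrum proportional to 1/i (the neutral expectation), for example `tajimas_d([12, 6, 4, 3])`, the two estimators of theta coincide and D is zero up to floating-point error; an excess of singletons, as in `tajimas_d([10, 0, 0, 0])`, gives a negative value.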