236 research outputs found

    How Many Subpopulations is Too Many? Exponential Lower Bounds for Inferring Population Histories

    Reconstruction of population histories is a central problem in population genetics. Existing coalescent-based methods, like the seminal work of Li and Durbin (Nature, 2011), attempt to solve this problem using sequence data but have no rigorous guarantees. Determining the amount of data needed to correctly reconstruct population histories is a major challenge. Using a variety of tools from information theory, the theory of extremal polynomials, and approximation theory, we prove new sharp information-theoretic lower bounds on the problem of reconstructing population structure -- the history of multiple subpopulations that merge, split and change sizes over time. Our lower bounds are exponential in the number of subpopulations, even when reconstructing recent histories. We demonstrate the sharpness of our lower bounds by providing algorithms for distinguishing and learning population histories with matching dependence on the number of subpopulations. Along the way and of independent interest, we essentially determine the optimal number of samples needed to learn an exponential mixture distribution information-theoretically, proving the upper bound by analyzing natural (and efficient) algorithms for this problem. Comment: 38 pages, Appeared in RECOMB 201

    Genomic islands of speciation separate cichlid ecomorphs in an East African crater lake

    The genomic causes and effects of divergent ecological selection during speciation are still poorly understood. Here, we report the discovery and detailed characterization of early-stage adaptive divergence of two cichlid fish ecomorphs in a small (700 m diameter) isolated crater lake in Tanzania. The ecomorphs differ in depth preference, male breeding color, body shape, diet and trophic morphology. With whole genome sequences of 146 fish, we identify 98 clearly demarcated genomic ‘islands’ of high differentiation and demonstrate association of genotypes across these islands to divergent mate preferences. The islands contain candidate adaptive genes enriched for functions in sensory perception (including rhodopsin and other twilight vision associated genes), hormone signaling and morphogenesis. Our study suggests mechanisms and genomic regions that may play a role in the closely related mega-radiation of Lake Malawi.
    The work was funded by Royal Society-Leverhulme Trust Africa Awards AA100023 and AA130107 (M.J.G., B.P.N. and G.F.T.), a Wellcome Trust PhD studentship grant 097677/Z/11/Z (M.M.), Wellcome Trust grant WT098051 (S.S. and R.D.), Wellcome Trust and Cancer Research UK core support and a Wellcome Trust Senior Investigator Award (E.A.M.), a Leverhulme Trust Research Fellowship RF-2014-686 (M.J.G.), a University of Bristol Research Committee award (M.G.), a Bangor University Anniversary PhD studentship (to A.M.T.) and a Fisheries Society of the British Isles award (G.F.T.). Raw sequencing reads are in the SRA nucleotide archive: RAD sequencing (BioProject: PRJNA286304; accessions SAMN03768857 to SAMN03768912) and whole genome sequencing (BioProject PRJEB1254: sample accessions listed in Table S16). The RAD based phylogeny and alignments have been deposited in TreeBase (TB2:S18241). Whole genome variant calls in the VCF format, phylogenetic trees, and primer sequences for Sequenom genotyping are available from the Dryad Digital Repository (http://dx.doi.org/10.5061/dryad.770mc). RD declares his interests as a founder and non-executive director of Congenica Ltd., that he owns stock in Illumina from previous consulting, and is a scientific advisory board member of Dovetail Inc. We thank R. Schley for generating pharyngeal jaw data; S. Mzighani, J. Kihedu and staff of the Tanzanian Fisheries Research Institute for logistical support; A. Smith, H. Sungani, A. Shechonge, P. Parsons, J. Swanstrom, G. Cooke and J. Bridle for contributions to sampling and aquarium maintenance, the Sanger Institute sequencing core for DNA sequencing and Dr. H. Imai (Kyoto University) for the use of spectrometer in his laboratory.
    This is the author accepted manuscript. The final version is available from AAAS via http://dx.doi.org/10.1126/science.aac992

    Demes: A standard format for demographic models

    Understanding the demographic history of populations is a key goal in population genetics, and with improving methods and data, ever more complex models are being proposed and tested. Demographic models of current interest typically consist of a set of discrete populations, their sizes and growth rates, and continuous and pulse migrations between those populations over a number of epochs, which can require dozens of parameters to fully describe. There is currently no standard format to define such models, significantly hampering progress in the field. In particular, the important task of translating the model descriptions in published work into input suitable for population genetic simulators is labor intensive and error prone. We propose the Demes data model and file format, built on widely used technologies, to alleviate these issues. Demes provides a well-defined and unambiguous model of populations and their properties that is straightforward to implement in software, and a text file format that is designed for simplicity and clarity. We provide thoroughly tested implementations of Demes parsers in multiple languages including Python and C, and showcase initial support in several simulators and inference methods. An introduction to the file format and a detailed specification are available at https://popsim-consortium.github.io/demes-spec-docs/

    High-coverage genome of the Tyrolean Iceman reveals unusually high Anatolian farmer ancestry

    The Tyrolean Iceman is known as one of the oldest human glacier mummies, directly dated to 3350-3120 calibrated BCE. A previously published low-coverage genome provided novel insights into European prehistory, despite high present-day DNA contamination. Here, we generate a high-coverage (15.3×) genome with low contamination to gain further insights into the genetic history and phenotype of this individual. Contrary to previous studies, we found no detectable Steppe-related ancestry in the Iceman. Instead, he retained the highest Anatolian-farmer-related ancestry among contemporaneous European populations, indicating a rather isolated Alpine population with limited gene flow from hunter-gatherer-ancestry-related populations. Phenotypic analysis revealed that the Iceman likely had darker skin than present-day Europeans and carried risk alleles associated with male-pattern baldness, type 2 diabetes, and obesity-related metabolic syndrome. These results corroborate phenotypic observations of the preserved mummified body, such as high pigmentation of his skin and the absence of hair on his head.

    Iron Age and Anglo-Saxon genomes from East England reveal British migration history

    British population history has been shaped by a series of immigrations, including the early Anglo-Saxon migrations after 400 CE. It remains an open question how these events affected the genetic composition of the current British population. Here, we present whole-genome sequences from 10 individuals excavated close to Cambridge in the East of England, ranging from the late Iron Age to the middle Anglo-Saxon period. By analysing shared rare variants with hundreds of modern samples from Britain and Europe, we estimate that on average the contemporary East English population derives 38% of its ancestry from Anglo-Saxon migrations. We gain further insight with a new method, rarecoal, which infers population history and identifies fine-scale genetic ancestry from rare variants. Using rarecoal we find that the Anglo-Saxon samples are closely related to modern Dutch and Danish populations, while the Iron Age samples share ancestors with multiple Northern European populations including Britain.

    Ancient genomes reveal social and genetic structure of Late Neolithic Switzerland

    Genetic studies of Neolithic and Bronze Age skeletons from Europe have provided evidence for strong population genetic changes at the beginning and the end of the Neolithic period. To further understand the implications of these in Southern Central Europe, we analyze 96 ancient genomes from Switzerland, Southern Germany, and the Alsace region in France, covering the Middle/Late Neolithic to Early Bronze Age. Similar to previously described genetic changes in other parts of Europe from the early 3rd millennium BCE, we detect an arrival of ancestry related to Late Neolithic pastoralists from the Pontic-Caspian steppe in Switzerland as early as 2860-2460 calBCE. Our analyses suggest that this genetic turnover was a complex process lasting almost 1000 years and involved highly genetically structured populations in this region.

    Genomic and dietary transitions during the Mesolithic and Early Neolithic in Sicily

    Southern Italy is a key region for understanding the agricultural transition in the Mediterranean due to its central position. We present a genomic transect for 19 prehistoric Sicilians that covers the Early Mesolithic to Early Neolithic period. We find that the Early Mesolithic hunter-gatherers (HGs) are a highly drifted sister lineage to Early Holocene western European HGs, whereas a quarter of the Late Mesolithic HGs' ancestry is related to HGs from eastern Europe and the Near East. This indicates substantial gene flow from (south-)eastern Europe between the Early and Late Mesolithic. The Early Neolithic farmers are genetically most similar to those from the Balkans and Greece, and carry only a maximum of ~7% ancestry from Sicilian Mesolithic HGs. Ancestry changes match changes in dietary profile and material culture, except for two individuals who may provide tentative initial evidence that HGs adopted elements of farming in Sicily.

    Genome-wide data from medieval German Jews show that the Ashkenazi founder event pre-dated the 14th century

    We report genome-wide data for 33 Ashkenazi Jews (AJ), dated to the 14th century, following a salvage excavation at the medieval Jewish cemetery of Erfurt, Germany. The Erfurt individuals are genetically similar to modern AJ and have substantial Southern European ancestry, but they show more variability in Eastern European-related ancestry than modern AJ. A third of the Erfurt individuals carried the same nearly-AJ-specific mitochondrial haplogroup and eight carried pathogenic variants known to affect AJ today. These observations, together with high levels of runs of homozygosity, suggest that the Erfurt community had already experienced the major reduction in size that affected modern AJ. However, the Erfurt bottleneck was more severe, implying substructure in medieval AJ. Together, our results suggest that the AJ founder event and the acquisition of the main sources of ancestry pre-dated the 14th century and highlight late medieval genetic heterogeneity no longer present in modern AJ.

    Dynamic changes in genomic and social structures in third millennium BCE central Europe

    Europe’s prehistory oversaw dynamic and complex interactions of diverse societies, hitherto unexplored at detailed regional scales. Studying 271 human genomes dated ~4900 to 1600 BCE from the European heartland, Bohemia, we reveal unprecedented genetic changes and social processes. Major migrations preceded the arrival of “steppe” ancestry, and at ~2800 BCE, three genetically and culturally differentiated groups coexisted. Corded Ware appeared by 2900 BCE, were initially genetically diverse, did not derive all steppe ancestry from known Yamnaya, and assimilated females of diverse backgrounds. Both Corded Ware and Bell Beaker groups underwent dynamic changes, involving sharp reductions and complete replacements of Y-chromosomal diversity at ~2600 and ~2400 BCE, respectively, the latter accompanied by increased Neolithic-like ancestry. The Bronze Age saw new social organization emerge amid a ≥40% population turnover.
    Contents: Introduction; Results (General sample overview; Bohemia before Corded Ware (pre-CW, before ~2800 BCE); Corded Ware; Bell Beaker; EBA (Únětice culture)); Discussion; Materials and methods (Processing sites for the newly reported individuals; Sampling; DNA extraction; DNA libraries and in-solution capture; Sequencing; Sex determination and authentication; Genotyping; Mitochondrial and Y chromosome haplogroups; Principal components analysis; Ancestry decomposition and admixture modeling; Y haplogroup frequency simulation)