Genomic analyses inform on migration events during the peopling of Eurasia.
High-coverage whole-genome sequence studies have so far focused on a limited number of geographically restricted populations, or been targeted at specific diseases, such as cancer. Nevertheless, the availability of high-resolution genomic data has led to the development of new methodologies for inferring population history and refuelled the debate on the mutation rate in humans. Here we present the Estonian Biocentre Human Genome Diversity Panel (EGDP), a dataset of 483 high-coverage human genomes from 148 populations worldwide, including 379 new genomes from 125 populations, which we group into diversity and selection sets. We analyse this dataset to refine estimates of continent-wide patterns of heterozygosity, long- and short-distance gene flow, archaic admixture, and changes in effective population size through time as well as for signals of positive or balancing selection. We find a genetic signature in present-day Papuans that suggests that at least 2% of their genome originates from an early and largely extinct expansion of anatomically modern humans (AMHs) out of Africa. Together with evidence from the western Asian fossil record, and admixture between AMHs and Neanderthals predating the main Eurasian expansion, our results contribute to the mounting evidence for the presence of AMHs out of Africa earlier than 75,000 years ago.
Support was provided by: Estonian Research Infrastructure Roadmap grant no 3.2.0304.11-0312; Australian Research Council Discovery grants (DP110102635 and DP140101405) (D.M.L., M.W. and E.W.); Danish National Research Foundation; the Lundbeck Foundation and KU2016 (E.W.); ERC Starting Investigator grant (FP7 - 261213) (T.K.); Estonian Research Council grant PUT766 (G.C. and M.K.); EU European Regional Development Fund through the Centre of Excellence in Genomics to Estonian Biocentre (R.V.; M.Me. and A.Me.), and Centre of Excellence for Genomics and Translational Medicine Project No. 2014-2020.4.01.15-0012 to EGC of UT (A.Me.) and EBC (M.Me.); Estonian Institutional Research grant IUT24-1 (L.S., M.J., A.K., B.Y., K.T., C.B.M., Le.S., H.Sa., S.L., D.M.B., E.M., R.V., G.H., M.K., G.C., T.K. and M.Me.) and IUT20-60 (A.Me.); French Ministry of Foreign and European Affairs and French ANR grant number ANR-14-CE31-0013-01 (F.-X.R.); Gates Cambridge Trust Funding (E.J.); ICG SB RAS (No. VI.58.1.1) (D.V.L.); Leverhulme Programme grant no. RP2011-R-045 (A.B.M., P.G. and M.G.T.); Ministry of Education and Science of Russia; Project 6.656.2014/K (S.A.F.); NEFREX grant funded by the European Union (People Marie Curie Actions; International Research Staff Exchange Scheme; call FP7-PEOPLE-2012-IRSES-number 318979) (M.Me., G.H. and M.K.); NIH grants 5DP1ES022577 05, 1R01DK104339-01, and 1R01GM113657-01 (S.Tis.); Russian Foundation for Basic Research (grant N 14-06-00180a) (M.G.); Russian Foundation for Basic Research; grant 16-04-00890 (O.B. and E.B.); Russian Science Foundation grant 14-14-00827 (O.B.); The Russian Foundation for Basic Research (14-04-00725-a), The Russian Humanitarian Scientific Foundation (13-11-02014) and the Program of the Basic Research of the RAS Presidium “Biological diversity” (E.K.K.); Wellcome Trust and Royal Society grant WT104125AIA & the Bristol Advanced Computing Research Centre (http://www.bris.ac.uk/acrc/) (D.J.L.); Wellcome Trust grant 098051 (Q.A.; C.T.-S. and Y.X.); Wellcome Trust Senior Research Fellowship grant 100719/Z/12/Z (M.G.T.); Young Explorers Grant from the National Geographic Society (8900-11) (C.A.E.); ERC Consolidator Grant 647787 ‘LocalAdaptatio’ (A.Ma.); Program of the RAS Presidium “Basic research for the development of the Russian Arctic” (B.M.); Russian Foundation for Basic Research grant 16-06-00303 (E.B.); a Rutherford Fellowship (RDF-10-MAU-001) from the Royal Society of New Zealand (M.P.C.)
Origins of East Caucasus Gene Pool: Contributions of Autochthonous Bronze Age Populations and Migrations from West Asia Estimated from Y-Chromosome Data
The gene pool of the East Caucasus, encompassing modern-day Azerbaijan and Dagestan populations, was studied alongside adjacent populations using 83 Y-chromosome SNP markers. The analysis of genetic distances among 18 populations (N = 2216) representing Nakh-Dagestani, Altaic, and Indo-European language families revealed the presence of three components (Steppe, Iranian, and Dagestani) that emerged in different historical periods. The Steppe component occurs only in Karanogais, indicating a recent medieval migration of Turkic-speaking nomads from the Eurasian steppe. The Iranian component is observed in Azerbaijanis, Dagestani Tabasarans, and all Iranian-speaking peoples of the Caucasus. The Dagestani component predominates in Dagestani-speaking populations, except for Tabasarans, and in Turkic-speaking Kumyks. Each component is associated with distinct Y-chromosome haplogroup complexes: the Steppe includes C-M217, N-LLY22g, R1b-M73, and R1a-M198; the Iranian includes J2-M172(×M67, M12) and R1b-M269; the Dagestani includes J1-Y3495 lineages. We propose that the most common lineage of haplogroup J1-Y3495 originated in an autochthonous ancestral population in central Dagestan and split ~6 kya into J1-ZS3114 (Dargins, Laks, Lezgi-speaking populations) and J1-CTS1460 (Avar-Andi-Tsez linguistic group). Based on archeological finds and DNA data, the analysis of J1-Y3495 phylogeography suggests that population growth in the territory of modern-day Dagestan started in the Bronze Age and was followed by further dispersal and the microevolution of the diverged populations.
Deep Phylogenetic Analysis of Haplogroup G1 Provides Estimates of SNP and STR Mutation Rates on the Human Y-Chromosome and Reveals Migrations of Iranic Speakers
Y-chromosomal haplogroup G1 is a minor component of the overall gene pool of South-West and Central Asia but reaches up to 80% frequency in some populations scattered within this area. We have genotyped the G1-defining marker M285 in 27 Eurasian populations (n = 5,346), analyzed 367 M285-positive samples using 17 Y-STRs, and sequenced ~11 Mb of the Y-chromosome in 20 of these samples to an average coverage of 67X. This allowed detailed phylogenetic reconstruction. We identified five branches, all with high geographical specificity: G1-L1323 in Kazakhs, the closely related G1-GG1 in Mongols, G1-GG265 in Armenians and its distant brother clade G1-GG162 in Bashkirs, and G1-GG362 in West Indians. The haplotype diversity, which decreased from West Iran to Central Asia, allows us to hypothesize that this rare haplogroup could have been carried by the expansion of Iranic speakers northwards to the Eurasian steppe and via founder effects became a predominant genetic component of some populations, including the Argyn tribe of the Kazakhs. The remarkable agreement between genetic and genealogical trees of Argyns allowed us to calibrate the molecular clock using a historical date (1405 AD) of the most recent common genealogical ancestor. The mutation rate for Y-chromosomal sequence data obtained was 0.78×10⁻⁹ per bp per year, falling within the range of published rates. The mutation rate for Y-chromosomal STRs was 0.0022 per locus per generation, very close to the so-called genealogical rate. The “clan-based” approach to estimating the mutation rate provides a third, middle way between direct father-to-son comparisons and using archeologically known migrations, whose dates are subject to revision and of uncertain relationship to genetic events.
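To give a feel for the scale implied by the rates reported above, the following Python sketch simply multiplies them out. The two rates, the ~11 Mb sequenced length and the 17 STR loci are taken from the abstract; the 600-year interval is a rounded, hypothetical figure for the time elapsed since the 1405 AD ancestor, and the function and constant names are illustrative and not taken from the study.

```python
# Values quoted in the abstract.
SNP_RATE_PER_BP_PER_YEAR = 0.78e-9          # sequence mutation rate
STR_RATE_PER_LOCUS_PER_GENERATION = 0.0022  # Y-STR mutation rate
SEQUENCED_BP = 11e6                         # ~11 Mb of Y-chromosome sequence
N_STR_LOCI = 17                             # Y-STR loci genotyped


def expected_snps(years, rate=SNP_RATE_PER_BP_PER_YEAR, sequenced_bp=SEQUENCED_BP):
    """Expected number of new SNPs on one lineage over `years` years."""
    return rate * sequenced_bp * years


def expected_str_mutations_per_generation(
    n_loci=N_STR_LOCI, rate=STR_RATE_PER_LOCUS_PER_GENERATION
):
    """Expected number of STR mutations per father-to-son transmission."""
    return n_loci * rate


# Hypothetical, rounded interval since the 1405 AD ancestor (an assumption).
YEARS_SINCE_ANCESTOR = 600

snps = expected_snps(YEARS_SINCE_ANCESTOR)
strs = expected_str_mutations_per_generation()

print(f"Expected SNPs per lineage over {YEARS_SINCE_ANCESTOR} years: {snps:.2f}")
print(f"Expected STR mutations per generation across {N_STR_LOCI} loci: {strs:.4f}")
```

Under these assumptions, each lineage is expected to carry about five new SNPs in the sequenced region, which illustrates why a genealogy only a few centuries deep can still inform a sequence-based clock.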
Parallel evolution of genes and languages in the Caucasus region
We analyzed 40 single nucleotide polymorphism and 19 short tandem repeat Y-chromosomal markers in a large sample of 1,525 indigenous individuals from 14 populations in the Caucasus and 254 additional individuals representing potential source populations. We also employed a lexicostatistical approach to reconstruct the history of the languages of the North Caucasian family spoken by the Caucasus populations. We found a different major haplogroup to be prevalent in each of four sets of populations that occupy distinct geographic regions and belong to different linguistic branches. The haplogroup frequencies correlated with geography and, even more strongly, with language. Within haplogroups, a number of haplotype clusters were shown to be specific to individual populations and languages. The data suggested a direct origin of Caucasus male lineages from the Near East, followed by high levels of isolation, differentiation, and genetic drift in situ. Comparison of genetic and linguistic reconstructions covering the last few millennia showed striking correspondences between the topology and dates of the respective gene and language trees and with documented historical events. Overall, in the Caucasus region, unmatched levels of gene–language coevolution occurred within geographically isolated populations, probably due to its mountainous terrain.
Oleg Balanovsky, Khadizhat Dibirova, Anna Dybo, Oleg Mudrak, Svetlana Frolova, Elvira Pocheshkhova, Marc Haber, Daniel Platt, Theodore Schurr, Wolfgang Haak, Marina Kuznetsova, Magomed Radzhabov, Olga Balaganskaya, Alexey Romanov, Tatiana Zakharova, David F. Soria Hernanz, Pierre Zalloua, Sergey Koshel, Merritt Ruhlen, Colin Renfrew, R. Spencer Wells, Chris Tyler-Smith, Elena Balanovska and The Genographic Consortium
Haplotype diversity of haplogroup G1-M285 in South-Western and Central Asian populations.
N: number of G1 samples genotyped by 17 Y-STRs;
N_HT: number of different Y-chromosomal STR haplotypes;
F_MAX: frequency of the most frequent haplotype;
HD: haplotype diversity; the populations were sorted according to the level of HD.
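The four quantities in this legend can be computed from a list of STR haplotypes as in the minimal Python sketch below. It assumes that HD is Nei's unbiased haplotype diversity, a common definition that the legend itself does not specify; the function name and the two-locus toy haplotypes are purely illustrative and are not data from the study.

```python
from collections import Counter


def haplotype_summary(haplotypes):
    """Return N, N_HT, F_MAX and HD for a list of hashable haplotypes.

    HD is computed as Nei's unbiased estimator,
    HD = n / (n - 1) * (1 - sum(p_i ** 2)),
    which is an assumed definition, not one stated in the legend.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two samples are needed to compute HD")
    counts = Counter(haplotypes)
    freqs = [count / n for count in counts.values()]
    hd = n / (n - 1) * (1.0 - sum(f * f for f in freqs))
    return {"N": n, "N_HT": len(counts), "F_MAX": max(freqs), "HD": hd}


# Toy haplotypes: tuples of repeat counts at two made-up STR loci.
toy_haplotypes = [(14, 22), (14, 22), (14, 23), (15, 22)]
summary = haplotype_summary(toy_haplotypes)
print(summary)
```

Real Y-STR haplotypes from the 17-locus panel would be represented as 17-element tuples in the same way.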
Genetic and genealogical reconstructions of the relationship between members of the Argyn tribe of the Kazakh.
A) Genetic tree reconstructed from Y-chromosome sequences of the Kazakh samples. B) Genealogical tree of the Argyn tribe of the Kazakh. Each sequenced Kazakh sample is attributed to the clan it originates from. The genealogical ancestor with the known historical date is marked in grey.
Network of Y-STR haplotypes within haplogroup G1.
Arrows mark samples chosen for Y-chromosomal sequencing.
Y-chromosome haplogroup G1 phylogeny.
The tree combines the high-coverage dataset reported in this study with data from the 1000 Genomes Project. Dotted lines indicate the approximate phylogenetic position of two previously reported G1 branches which were absent among our samples.
Ancient migrations of Iranic-speaking populations.
A) Area populated by Iranic speakers in the middle of the first millennium BC. States whose languages belonged to the Iranic and Armenian linguistic groups are shown in red (modified from [39]). B) Homeland and migration of Iranic speakers according to the major competing theories (modified from [34]).