79 research outputs found

    The geography of recent genetic ancestry across Europe

    The recent genealogical history of human populations is a complex mosaic formed by individual migration, large-scale population movements, and other demographic events. Population genomics datasets can provide a window into this recent history, as rare traces of recent shared genetic ancestry are detectable due to long segments of shared genomic material. We make use of genomic data for 2,257 Europeans (the POPRES dataset) to conduct one of the first surveys of recent genealogical ancestry over the past three thousand years at a continental scale. We detected 1.9 million shared genomic segments, and used the lengths of these to infer the distribution of shared ancestors across time and geography. We find that a pair of modern Europeans living in neighboring populations share around 10-50 genetic common ancestors from the last 1500 years, and upwards of 500 genetic ancestors from the previous 1000 years. These numbers drop off exponentially with geographic distance, but since genetic ancestry is rare, individuals from opposite ends of Europe are still expected to share millions of common genealogical ancestors over the last 1000 years. There is substantial regional variation in the number of shared genetic ancestors: especially high numbers of common ancestors between many eastern populations likely date to the Slavic and/or Hunnic expansions, while much lower levels of common ancestry in the Italian and Iberian peninsulas may indicate weaker demographic effects of Germanic expansions into these areas and/or more stably structured populations. Recent shared ancestry in modern Europeans is ubiquitous, and clearly shows the impact of both small-scale migration and large historical events. 
Population genomic datasets have considerable power to uncover recent demographic history, and will allow a much fuller picture of the close genealogical kinship of individuals across the world.
Comment: Full size figures available from http://www.eve.ucdavis.edu/~plralph/research.html; or html version at http://ralphlab.usc.edu/ibd/ibd-paper/ibd-writeup.xhtm

    Genomic Runs of Homozygosity Record Population History and Consanguinity

    The human genome is characterised by many runs of homozygous genotypes, where identical haplotypes were inherited from each parent. The length of each run is determined partly by the number of generations since the common ancestor: offspring of cousin marriages have long runs of homozygosity (ROH), while the numerous shorter tracts relate to shared ancestry tens and hundreds of generations ago. Human populations have experienced a wide range of demographic histories and hold diverse cultural attitudes to consanguinity. In a global population dataset, genome-wide analysis of long and shorter ROH allows categorisation of the mainly indigenous populations sampled here into four major groups in which the majority of the population are inferred to have: (a) recent parental relatedness (south and west Asians); (b) shared parental ancestry arising hundreds to thousands of years ago through long term isolation and restricted effective population size (N_e), but little recent inbreeding (Oceanians); (c) both ancient and recent parental relatedness (Native Americans); and (d) only the background level of shared ancestry relating to continental N_e (predominantly urban Europeans and East Asians; lowest of all in sub-Saharan African agriculturalists), and the occasional cryptically inbred individual. Moreover, individuals can be positioned along axes representing this demographic historic space. Long runs of homozygosity are therefore a globally widespread and under-appreciated characteristic of our genomes, which record past consanguinity and population isolation and provide a distinctive record of the demographic history of an individual's ancestors. Individual ROH measures will also allow quantification of the disease risk arising from polygenic recessive effects.

    The DCDC2 deletion is not a risk factor for dyslexia

    Dyslexia is a specific impairment in learning to read and has strong heritability. An intronic deletion within the DCDC2 gene, with ~8% frequency in European populations, is increasingly used as a marker for dyslexia in neuroimaging and behavioral studies. At a mechanistic level, this deletion has been proposed to influence sensory processing capacity, and in particular sensitivity to visual coherent motion. Our re-assessment of the literature, however, did not reveal strong support for a role of this specific deletion in dyslexia. We also analyzed data from five distinct cohorts, enriched for individuals with dyslexia, and did not identify any signal indicative of associations for the DCDC2 deletion with reading-related measures, including in a combined sample analysis (N=526). We believe we conducted the first replication analysis for a proposed deletion effect on visual motion perception and found no association (N=445 siblings). We also report that the DCDC2 deletion has a frequency of 37.6% in a cohort representative of the general population recruited in Hong Kong (N=220). This figure, together with a lack of association between the deletion and reading abilities in this cohort, indicates the low likelihood of a direct deletion effect on reading skills. Therefore, on the basis of multiple strands of evidence, we conclude that the DCDC2 deletion is not a strong risk factor for dyslexia. Our analyses and literature re-evaluation are important for interpreting current developments within multidisciplinary studies of dyslexia and, more generally, contribute to current discussions about the importance of reproducibility in science.

    Similarity in Recombination Rate Estimates Highly Correlates with Genetic Differentiation in Humans

    Recombination varies greatly among species, as illustrated by the poor conservation of the recombination landscape between humans and chimpanzees. Thus, shorter evolutionary time frames are needed to understand the evolution of recombination. Here, we analyze its recent evolution in humans. We calculated the recombination rates between adjacent pairs of 636,933 common single-nucleotide polymorphism loci in 28 worldwide human populations and analyzed them in relation to genetic distances between populations. We found a strong and highly significant correlation between similarity in the recombination rates corrected for effective population size and genetic differentiation between populations. This correlation is observed at the genome-wide level, but also for each chromosome and when genetic distances and recombination similarities are calculated independently from different parts of the genome. More importantly, this relationship is robustly maintained when only the presence or absence of recombination hotspots is considered. Simulations show that this correlation cannot be explained by biases in the inference of recombination rates caused by haplotype sharing among similar populations. This result indicates a rapid pace of evolution of recombination, within the time span of differentiation of modern humans.

    North African Influences and Potential Bias in Case-Control Association Studies in the Spanish Population

    BACKGROUND: Despite the limited genetic heterogeneity of Spanish populations, substantial evidence indicates that historical African influences have not affected them uniformly. Accounting for such population differences might be essential to reduce spurious results in association studies of genetic factors with disease. Using ancestry informative markers (AIMs), we aimed to measure the African influences in Spanish populations and to explore whether these might introduce statistical bias in population-based association studies. METHODOLOGY/PRINCIPAL FINDINGS: We genotyped 93 AIMs in Spanish individuals (from the Canary Islands and the Iberian Peninsula) and Northwest Africans, and conducted population and individual-based clustering analyses along with reference data from the HapMap, HGDP-CEPH, and other sources. We found significant differences in the Northwest African influence among Spanish populations, ranging from as low as ≈ 5% in Spaniards from the Iberian Peninsula to as much as ≈ 17% in Canary Islanders, whereas the sub-Saharan African influence was negligible. Strikingly, the Northwest African ancestry showed a wide inter-individual variation in Canary Islanders, ranging from 0% to 96%, reflecting the violent way the Islands were conquered and colonized by the Spanish in the 15th century. As a consequence, a comparison of allele frequencies between Spanish samples from the Iberian Peninsula and the Canary Islands revealed an excess of markers with significant differences. However, the inflation of p-values for the differences was adequately controlled by correcting for genetic ancestry estimates derived from a reduced number of AIMs.
CONCLUSIONS/SIGNIFICANCE: Although the estimated African influences might be biased due to marker ascertainment, these results confirm that Northwest African genetic footprints are still recognizable today in the Spanish populations, particularly in Canary Islanders, and that the uneven African influences existing in these populations might increase the risk of false positives in association studies. Adjusting for population stratification assessed with a few dozen AIMs would be sufficient to control this effect.

    Whole-genome sequencing for an enhanced understanding of genetic variation among South Africans

    The Southern African Human Genome Programme is a national initiative that aspires to unlock the unique genetic character of southern African populations for a better understanding of human genetic diversity. In this pilot study the Southern African Human Genome Programme characterizes the genomes of 24 individuals (8 Coloured and 16 black southeastern Bantu-speakers) using deep whole-genome sequencing. A total of ~16 million unique variants are identified. Despite the shallow time depth since divergence between the two main southeastern Bantu-speaking groups (Nguni and Sotho-Tswana), principal component analysis and structure analysis reveal significant (p < 10^-6) differentiation, and F_ST analysis identifies regions with high divergence. The Coloured individuals show evidence of varying proportions of admixture with Khoesan, Bantu-speakers, Europeans, and populations from the Indian sub-continent. Whole-genome sequencing data reveal extensive genomic diversity, increasing our understanding of the complex and region-specific history of African populations and highlighting its potential impact on biomedical research and genetic susceptibility to disease.

    Genetic Crossovers Are Predicted Accurately by the Computed Human Recombination Map

    Hotspots of meiotic recombination can change rapidly over time. This instability and the reported high level of inter-individual variation in meiotic recombination call into question the accuracy of the calculated hotspot map, which is based on the summation of past genetic crossovers. To estimate the accuracy of the computed recombination rate map, we have mapped genetic crossovers to a median resolution of 70 Kb in 10 CEPH pedigrees. We then compared the positions of crossovers with the hotspots computed from HapMap data and performed extensive computer simulations to compare the observed distributions of crossovers with the distributions expected from the calculated recombination rate maps. Here we show that a population-averaged hotspot map computed from linkage disequilibrium data predicts present-day genetic crossovers well. We find that computed hotspot maps accurately estimate both the strength and the position of meiotic hotspots. An in-depth examination of the crossovers that were not predicted shows that they are preferentially located in regions where hotspots are found in other populations. In summary, we find that by combining several computed population-specific maps we can capture the variation in individual hotspots to generate a hotspot map that can predict almost all present-day genetic crossovers.

    Meta-analysis of five genome-wide association studies identifies multiple new loci associated with testicular germ cell tumor

    The international Testicular Cancer Consortium (TECAC) combined five published genome-wide association studies of testicular germ cell tumor (TGCT; 3,558 cases and 13,970 controls) to identify new susceptibility loci. We conducted a fixed-effects meta-analysis, including, to our knowledge, the first analysis of the X chromosome. Eight new loci mapping to 2q14.2, 3q26.2, 4q35.2, 7q36.3, 10q26.13, 15q21.3, 15q22.31, and Xq28 achieved genome-wide significance (P < 5 × 10^-8). Most loci harbor biologically plausible candidate genes. We refined previously reported associations at 9p24.3 and 19p12 by identifying one and three additional independent SNPs, respectively. In aggregate, the 39 independent markers identified to date explain 37% of father-to-son familial risk, 8% of which can be attributed to the 12 new signals reported here. Our findings substantially increase the number of known TGCT susceptibility alleles, move the field closer to a comprehensive understanding of the underlying genetic architecture of TGCT, and provide further clues to the etiology of TGCT.

    Mutational signatures in esophageal adenocarcinoma define etiologically distinct subgroups with therapeutic relevance

    Esophageal adenocarcinoma (EAC) has a poor outcome, and targeted therapy trials have thus far been disappointing owing to a lack of robust stratification methods. Whole-genome sequencing (WGS) analysis of 129 cases demonstrated that this is a heterogeneous cancer dominated by copy number alterations with frequent large-scale rearrangements. Co-amplification of receptor tyrosine kinases (RTKs) and/or downstream mitogenic activation is almost ubiquitous; thus tailored combination RTK inhibitor (RTKi) therapy might be required, as we demonstrate in vitro. However, mutational signatures showed three distinct molecular subtypes with potential therapeutic relevance, which we verified in an independent cohort (n = 87): (i) enrichment for BRCA signature with prevalent defects in the homologous recombination pathway; (ii) dominant T>G mutational pattern associated with a high mutational load and neoantigen burden; and (iii) C>A/T mutational pattern with evidence of an aging imprint. These subtypes could be ascertained using a clinically applicable sequencing strategy (low coverage) as a basis for therapy selection.
Whole-genome sequencing of esophageal adenocarcinoma samples was performed as part of the International Cancer Genome Consortium (ICGC) through the oEsophageal Cancer Clinical and Molecular Stratification (OCCAMS) Consortium and was funded by Cancer Research UK. We thank the ICGC members for their input on verification standards as part of the benchmarking exercise. We thank the Human Research Tissue Bank, which is supported by the National Institute for Health Research (NIHR) Cambridge Biomedical Research Centre, from Addenbrooke’s Hospital and UCL. Also the University Hospital of Southampton Trust and the Southampton, Birmingham, Edinburgh and UCL Experimental Cancer Medicine Centres and the QEHB charities. This study was partly funded by a project grant from Cancer Research UK. R.C.F. is funded by an NIHR Professorship and receives core funding from the Medical Research Council and infrastructure support from the Biomedical Research Centre and the Experimental Cancer Medicine Centre. We acknowledge the support of The University of Cambridge, Cancer Research UK (C14303/A17197) and Hutchison Whampoa Limited. We would like to thank Dr. Peter Van Loo for providing the NGS version of ASCAT for copy number calling. We are grateful to all the patients who provided written consent for participation in this study and the staff at all participating centres. Some of the work was undertaken at UCLH/UCL who received a proportion of funding from the Department of Health’s NIHR Biomedical Research Centres funding scheme. The work at UCLH/UCL was also supported by the CRUK UCL Early Cancer Medicine Centre.
This is the author accepted manuscript. The final version is available from Nature Publishing Group via http://dx.doi.org/10.1038/ng.365