101 research outputs found

    High-depth African genomes inform human migration and health

    The African continent is regarded as the cradle of modern humans and African genomes contain more genetic variation than those from any other continent, yet only a fraction of the genetic diversity among African individuals has been surveyed. Here we performed whole-genome sequencing analyses of 426 individuals, comprising 50 ethnolinguistic groups and including previously unsampled populations, to explore the breadth of genomic diversity across Africa. We uncovered more than 3 million previously undescribed variants, most of which were found among individuals from newly sampled ethnolinguistic groups, as well as 62 previously unreported loci that are under strong selection, which were predominantly found in genes that are involved in viral immunity, DNA repair and metabolism. We observed complex patterns of ancestral admixture and putative-damaging and novel variation, both within and between populations, alongside evidence that Zambia was a likely intermediate site along the routes of expansion of Bantu-speaking populations. Pathogenic variants in genes that are currently characterized as medically relevant were uncommon, but in other genes, variants denoted as 'likely pathogenic' in the ClinVar database were commonly observed. Collectively, these findings refine our current understanding of continental migration, identify gene flow and the response to human disease as strong drivers of genome-level population variation, and underscore the scientific imperative for a broader characterization of the genomic diversity of African individuals to understand human ancestry and improve health.

    Population-specific common SNPs reflect demographic histories and highlight regions of genomic plasticity with functional relevance

    Background: Population differentiation is the result of demographic and evolutionary forces. Whole genome datasets from the 1000 Genomes Project (October 2012) provide an unbiased view of genetic variation across populations from Europe, Asia, Africa and the Americas. Common population-specific SNPs (MAF > 0.05) reflect a deep history and may have important consequences for health and wellbeing. Their interpretation is contextualised by currently available genome data. Results: The identification of common population-specific (CPS) variants (SNPs and SSV) is influenced by admixture and the sample size under investigation. Nine of the populations in the 1000 Genomes Project (2 African, 2 Asian (including a merged Chinese group) and 5 European) revealed that the African populations (LWK and YRI), followed by the Japanese (JPT), have the highest number of CPS SNPs, in concordance with their histories and given the populations studied. Using two methods, sliding 50-SNP and 5-kb windows, the CPS SNPs showed distinct clustering across large genome segments and little overlap of clusters between populations. iHS enrichment score and population branch statistic (PBS) analyses suggest that selective sweeps are unlikely to account for the clustering and population specificity. Of interest is the association of clusters close to recombination hotspots. Functional analysis of genes associated with the CPS SNPs revealed over-representation of genes in pathways associated with neuronal development, including axonal guidance signalling and CREB signalling in neurones. Conclusions: Common population-specific SNPs are non-randomly distributed throughout the genome and are significantly associated with recombination hotspots. Since the variant alleles of most CPS SNPs are the derived allele, they likely arose in the specific population after a split from a common ancestor. Their proximity to genes involved in specific pathways, including neuronal development, suggests evolutionary plasticity of selected genomic regions. Contrary to expectation, selective sweeps did not play a large role in the persistence of population-specific variation. This suggests a stochastic process towards population-specific variation which reflects demographic histories and may have some interesting implications for health and susceptibility to disease.
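The CPS criterion mentioned in the abstract above (MAF > 0.05, specific to one population) can be illustrated with a minimal sketch. It assumes that "population-specific" means the variant allele is observed in exactly one population and absent from all others; the abstract does not spell out the exact rule, so that reading, the function name, and the toy frequencies are all hypothetical.

```python
from typing import Dict, List


def common_population_specific(
    freqs: Dict[str, Dict[str, float]],
    maf_threshold: float = 0.05,
) -> Dict[str, List[str]]:
    """Map each population to the SNP IDs that are common there and absent elsewhere.

    freqs maps snp_id -> {population -> variant allele frequency}.
    Assumption: "population-specific" = variant allele seen in exactly one population.
    """
    result: Dict[str, List[str]] = {}
    for snp_id, by_pop in freqs.items():
        # Populations in which the variant allele is observed at all.
        carriers = [pop for pop, f in by_pop.items() if f > 0.0]
        if len(carriers) != 1:
            continue
        f = by_pop[carriers[0]]
        # Minor allele frequency is the smaller of the two allele frequencies.
        maf = min(f, 1.0 - f)
        if maf > maf_threshold:
            result.setdefault(carriers[0], []).append(snp_id)
    return result


# Toy frequencies (hypothetical), using population codes named in the abstract.
example = {
    "snp_a": {"YRI": 0.12, "JPT": 0.0, "LWK": 0.0},   # common and YRI-only
    "snp_b": {"YRI": 0.30, "JPT": 0.25, "LWK": 0.0},  # shared by two populations
    "snp_c": {"YRI": 0.0, "JPT": 0.02, "LWK": 0.0},   # JPT-only but rare
}
cps = common_population_specific(example)
```

In this toy input only `snp_a` qualifies: `snp_b` is shared between two populations and `snp_c` falls below the frequency threshold.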

    Genetic substructure and complex demographic history of South African Bantu speakers

    South Eastern Bantu-speaking (SEB) groups constitute more than 80% of the population in South Africa. Despite clear linguistic and geographic diversity, the genetic differences between these groups have not been systematically investigated. Based on genome-wide data of over 5000 individuals, representing eight major SEB groups, we provide strong evidence for fine-scale population structure that broadly aligns with geographic distribution and is also congruent with linguistic phylogeny (separation of Nguni, Sotho-Tswana and Tsonga speakers). Although differential Khoe-San admixture plays a key role, the structure persists after Khoe-San ancestry-masking. The timing of admixture, levels of sex-biased gene flow and population size dynamics also highlight differences in the demographic histories of individual groups. The comparisons with five Iron Age farmer genomes further support genetic continuity over ~400 years in certain regions of the country. Simulated trait genome-wide association studies further show that the observed population structure could have major implications for biomedical genomics research in South Africa.

    Genome-wide SNP discovery in indigenous cattle breeds of South Africa

    Single nucleotide polymorphism arrays have created new possibilities for performing genome-wide studies to detect genomic regions harboring sequence variants that affect complex traits. However, the majority of validated SNPs for which allele frequencies have been estimated are limited primarily to European breeds. The objective of this study was to perform SNP discovery in three South African indigenous breeds (Afrikaner, Drakensberger, and Nguni) using whole genome sequencing. DNA was extracted from blood and hair samples, quantified and prepared at 50 ng/ml concentration for sequencing at the Agricultural Research Council Biotechnology Platform using an Illumina HiSeq 2500. The fastq files were used to call the variants using the Genome Analysis Tool Kit. A total of 1,678,360 variants were identified as novel using Run 6 of the 1000 Bull Genomes Project. Annotation of the identified variants classified them into functional categories. Within the coding regions, about 30% of the SNPs were non-synonymous substitutions that encode for alternate amino acids. The study of the distribution of SNPs across the genome identified regions showing notable differences in the densities of SNPs among the breeds and highlighted many regions of functional significance. Gene ontology terms identified genes such as MLANA, SYT10, and CDC42EP5 that have been associated with coat color in mouse, and ADAMS3, DNAJC3, and PAG5 genes that have been associated with fertility in cattle. Further analysis of the variants detected 688 candidate selective sweeps (ZHp Z-scores ≤ −4) across all three breeds, of which 223 regions were assigned as being putative selective sweeps (ZHp scores ≤ −5). We also identified 96 regions with extremely low ZHp Z-scores (≤ −6) in Afrikaner and Nguni. Genes such as KIT and MITF that have been associated with skin pigmentation in cattle, and CACNA1C, which has been associated with bipolar disorder in humans, were identified in these regions. This study provides the first analysis of sequence data to discover SNPs in indigenous South African cattle breeds. The information will play an important role in our efforts to understand the genetic history of our cattle and in designing appropriate breed improvement programmes. The Red Meat Research and Development of South Africa (RMRDSA), South Africa Beef Genomic Project (BGP), and the National Research Foundation (NRF). http://www.frontiersin.org/Genetics. Animal and Wildlife Science.
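The ZHp statistic cited in the cattle abstract above is a Z-transformed pooled heterozygosity, where strongly negative values mark windows of unusually low diversity. The sketch below assumes the commonly used definition, Hp = 2·Σn_major·Σn_minor / (Σn_major + Σn_minor)² per window, standardised across all windows. The abstract does not state window sizes or implementation details, so the function names and toy counts here are hypothetical.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def pooled_heterozygosity(major_counts: Sequence[int], minor_counts: Sequence[int]) -> float:
    """Pooled heterozygosity (Hp) for one window.

    major_counts / minor_counts hold, per SNP in the window, the number of
    reads (or alleles) supporting the major and the minor allele.
    """
    n_major = sum(major_counts)
    n_minor = sum(minor_counts)
    return 2 * n_major * n_minor / (n_major + n_minor) ** 2


def zhp_scores(hp_values: Sequence[float]) -> List[float]:
    """Z-transform the window Hp values: ZHp = (Hp - mean) / standard deviation."""
    mu = mean(hp_values)
    sigma = pstdev(hp_values)
    if sigma == 0:
        raise ValueError("all windows have identical Hp; ZHp is undefined")
    return [(hp - mu) / sigma for hp in hp_values]


# Toy windows (hypothetical counts): two diverse windows and one depleted one.
windows = [
    ([5, 5], [5, 5]),  # Hp = 0.5
    ([6, 4], [4, 6]),  # Hp = 0.5
    ([9, 9], [1, 1]),  # Hp = 0.18
]
hp = [pooled_heterozygosity(major, minor) for major, minor in windows]
z = zhp_scores(hp)
```

The third window has the lowest Hp and therefore the most negative ZHp. In a real scan the standardisation would run over thousands of windows, which is what makes thresholds of several standard deviations meaningful.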

    Genetic insights into smoking behaviours in 10,558 men of African ancestry from continental Africa and the UK

    Smoking is a leading risk factor for many of the top ten causes of death worldwide. Of the 1.3 billion smokers globally, 80% live in low- and middle-income countries, where the number of deaths due to tobacco use is expected to double in the next decade according to the World Health Organization. Genetic studies have helped to identify biological pathways for smoking behaviours, but have mostly focussed on individuals of European ancestry or living in either North America or Europe. We performed a genome-wide association study of two smoking behaviour traits in 10,558 men of African ancestry living in five African countries and the UK. Eight independent variants were associated with either smoking initiation or cessation at P-value < 5 × 10⁻⁶, four being monomorphic or rare in European populations. Gene prioritisation strategy highlighted five genes, including SEMA6D, previously described as associated with several smoking behaviour traits. These results confirm the importance of analysing underrepresented populations in genetic epidemiology, and the urgent need for larger genomic studies to boost discovery power to better understand smoking behaviours, as well as many other traits.

    The African Genome Variation Project shapes medical genetics in Africa.

    Given the importance of Africa to studies of human origins and disease susceptibility, detailed characterization of African genetic diversity is needed. The African Genome Variation Project provides a resource with which to design, implement and interpret genomic studies in sub-Saharan Africa and worldwide. The African Genome Variation Project represents dense genotypes from 1,481 individuals and whole-genome sequences from 320 individuals across sub-Saharan Africa. Using this resource, we find novel evidence of complex, regionally distinct hunter-gatherer and Eurasian admixture across sub-Saharan Africa. We identify new loci under selection, including loci related to malaria susceptibility and hypertension. We show that modern imputation panels (sets of reference genotypes from which unobserved or missing genotypes in study sets can be inferred) can identify association signals at highly differentiated loci across populations in sub-Saharan Africa. Using whole-genome sequencing, we demonstrate further improvements in imputation accuracy, strengthening the case for large-scale sequencing efforts of diverse African haplotypes. Finally, we present an efficient genotype array design capturing common genetic variation in Africa

    Implicating genes, pleiotropy, and sexual dimorphism at blood lipid loci through multi-ancestry meta-analysis

    Background: Genetic variants within nearly 1000 loci are known to contribute to modulation of blood lipid levels. However, the biological pathways underlying these associations are frequently unknown, limiting understanding of these findings and hindering downstream translational efforts such as drug target discovery. Results: To expand our understanding of the underlying biological pathways and mechanisms controlling blood lipid levels, we leverage a large multi-ancestry meta-analysis (N = 1,654,960) of blood lipids to prioritize putative causal genes for 2286 lipid associations using six gene prediction approaches. Using phenome-wide association (PheWAS) scans, we identify relationships of genetically predicted lipid levels to other diseases and conditions. We confirm known pleiotropic associations with cardiovascular phenotypes and determine novel associations, notably with cholelithiasis risk. We perform sex-stratified GWAS meta-analysis of lipid levels and show that 3–5% of autosomal lipid-associated loci demonstrate sex-biased effects. Finally, we report 21 novel lipid loci identified on the X chromosome. Many of the sex-biased autosomal and X chromosome lipid loci show pleiotropic associations with sex hormones, emphasizing the role of hormone regulation in lipid metabolism. Conclusions: Taken together, our findings provide insights into the biological mechanisms through which associated variants lead to altered lipid levels and potentially cardiovascular disease risk.