Complex Genetic History of East African Human Populations
Results from disparate fields indicate that anatomically modern Homo sapiens originated in Africa ~200 thousand years ago (kya), and that East Africa is the likely source of the migration of modern humans out of Africa within the past 100 thousand years. However, the genetic diversity currently found in Africa, and especially East Africa, has not been well studied compared to non-African populations, due in large part to the fact that DNA samples from many remote regions of Africa are currently not available. The goal of this study was, therefore, to characterize genetic variation in previously unstudied East African human populations using mitochondrial and Y chromosome DNA from 1500 individuals. These data were then compared to independently collected data of the same populations from ~1327 nuclear markers (848 microsatellites and 479 insertion/deletion polymorphisms). The data were used to gain insight into patterns of genetic diversity, to construct past relationships of East African populations to each other and to other African populations, and to clarify historical demographic events such as population expansion, contraction, and migration that these populations might have experienced. Several independent analyses showed significant relationships between genetic and geographic/linguistic distances among East African populations. Genetic variation is more strongly correlated with geography than with linguistics. Overall, the correlations between genetic and geographic/linguistic variation are stronger for autosomal and Y chromosome lineages than for mtDNA lineages. Y chromosome and mtDNA lineage distributions seem to cluster geographically and, for some lineages, linguistically. Two major migration events, namely the migration of Bantu-speaking populations from Central/West Africa across sub-Saharan Africa and the migration of pastoralist populations from Sudan and Ethiopia within the past 5000 years, have had a major influence on extant genetic patterns in East Africa.
GEneSTATION 1.0: A Synthetic Resource of Diverse Evolutionary and Functional Genomic Data for Studying The Evolution of Pregnancy-Associated Tissues and Phenotypes
Mammalian gestation and pregnancy are fast-evolving processes that involve the interaction of the fetal, maternal and paternal genomes. Version 1.0 of the GEneSTATION database (http://genestation.org) integrates diverse types of omics data across mammals to advance understanding of the genetic basis of gestation and pregnancy-associated phenotypes and to accelerate the translation of discoveries from model organisms to humans. GEneSTATION is built using tools from the Generic Model Organism Database project, including the biology-aware database CHADO, new tools for rapid data integration, and algorithms that streamline synthesis and user access. GEneSTATION contains curated life history information on pregnancy and reproduction from 23 high-quality mammalian genomes. For every human gene, GEneSTATION contains diverse evolutionary (e.g. gene age, population genetic and molecular evolutionary statistics), organismal (e.g. tissue-specific gene and protein expression, differential gene expression, disease phenotype), and molecular data types (e.g. Gene Ontology Annotation, protein interactions), as well as links to many general (e.g. Entrez, PubMed) and pregnancy disease-specific (e.g. PTBgene, dbPTB) databases. By facilitating the synthesis of diverse functional and evolutionary data in pregnancy-associated tissues and phenotypes and enabling their quick, intuitive, accurate and customized meta-analysis, GEneSTATION provides a novel platform for comprehensive investigation of the function and evolution of mammalian pregnancy.
Global Biobank analyses provide lessons for developing polygenic risk scores across diverse cohorts
Polygenic risk scores (PRSs) have been widely explored in precision medicine. However, few studies have thoroughly investigated their best practices in global populations across different diseases. We here utilized data from the Global Biobank Meta-analysis Initiative (GBMI) to explore methodological considerations and PRS performance in 9 different biobanks for 14 disease endpoints. Specifically, we constructed PRSs using pruning and thresholding (P + T) and PRS-continuous shrinkage (CS). For both methods, using a European-based linkage disequilibrium (LD) reference panel resulted in comparable or higher prediction accuracy compared with several other non-European-based panels. PRS-CS overall outperformed the classic P + T method, especially for endpoints with higher SNP-based heritability. Notably, prediction accuracy is heterogeneous across endpoints, biobanks, and ancestries, especially for asthma, which has known variation in disease prevalence across populations. Overall, we provide lessons for PRS construction, evaluation, and interpretation using GBMI resources and highlight the importance of best practices for PRS in the biobank-scale genomics era.
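As an illustration of the thresholding half of P + T, a polygenic score can be sketched as a weighted sum of allele dosages over the variants whose GWAS p-value passes a cutoff. This is a minimal sketch under stated assumptions, not the pipeline used in the study: the function name, the default threshold, and the toy numbers are hypothetical, and the LD pruning step (which needs a reference panel) is omitted entirely.

```python
def polygenic_score(dosages, betas, p_values, p_threshold=5e-8):
    """Thresholded polygenic score for one individual.

    dosages     -- effect-allele counts per variant (0, 1 or 2)
    betas       -- per-variant GWAS effect sizes
    p_values    -- per-variant GWAS p-values
    p_threshold -- hypothetical cutoff; variants with p > cutoff are ignored

    LD pruning is NOT performed here; the inputs are assumed to be
    already pruned to approximately independent variants.
    """
    if not (len(dosages) == len(betas) == len(p_values)):
        raise ValueError("dosages, betas and p_values must have equal length")
    return sum(
        dosage * beta
        for dosage, beta, p in zip(dosages, betas, p_values)
        if p <= p_threshold
    )


# Toy, made-up inputs for three variants.
example_score = polygenic_score(
    dosages=[0, 1, 2],
    betas=[0.5, -0.2, 0.1],
    p_values=[1e-9, 0.5, 1e-10],
)
```

Loosening `p_threshold` admits more variants into the sum; in practice the cutoff is usually tuned on held-out data rather than fixed in advance.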
Novel Ancestry-Specific Primary Open-Angle Glaucoma Loci and Shared Biology With Vascular Mechanisms and Cell Proliferation
Primary open-angle glaucoma (POAG), a leading cause of irreversible blindness globally, shows disparity in prevalence and manifestations across ancestries. We perform meta-analysis across 15 biobanks (of the Global Biobank Meta-analysis Initiative) (n = 1,487,441: cases = 26,848) and merge with previous multi-ancestry studies, with the combined dataset representing the largest and most diverse POAG study to date (n = 1,478,037: cases = 46,325) and identify 17 novel significant loci, 5 of which were ancestry specific. Gene-enrichment and transcriptome-wide association analyses implicate vascular and cancer genes, a fifth of which are primary ciliary related. We perform an extensive statistical analysis of SIX6 and CDKN2B-AS1 loci in human GTEx data and across large electronic health records showing interaction between SIX6 gene and causal variants in the chr9p21.3 locus, with expression effect on CDKN2A/B. Our results suggest that some POAG risk variants may be ancestry specific, sex specific, or both, and support the contribution of genes involved in programmed cell death in POAG pathogenesis.
Global Biobank Meta-analysis Initiative: Powering genetic discovery across human disease
Biobanks facilitate genome-wide association studies (GWASs), which have mapped genomic loci across a range of human diseases and traits. However, most biobanks are primarily composed of individuals of European ancestry. We introduce the Global Biobank Meta-analysis Initiative (GBMI), a collaborative network of 23 biobanks from 4 continents representing more than 2.2 million consented individuals with genetic data linked to electronic health records. GBMI meta-analyzes summary statistics from GWASs generated using harmonized genotypes and phenotypes from member biobanks for 14 exemplar diseases and endpoints. This strategy validates that GWASs conducted in diverse biobanks can be integrated despite heterogeneity in case definitions, recruitment strategies, and baseline characteristics. This collaborative effort improves GWAS power for diseases, benefits understudied diseases, and improves risk prediction while also enabling the nomination of disease genes and drug candidates by incorporating gene and protein expression data and providing insight into the underlying biology of human diseases and traits.
Leveraging global multi-ancestry meta-analysis in the study of idiopathic pulmonary fibrosis genetics
Research on rare and devastating orphan diseases, such as idiopathic pulmonary fibrosis (IPF), has been limited by the rarity of the diseases themselves. The prognosis is poor: the prevalence of IPF is only approximately four times the incidence, limiting the recruitment of patients to trials and studies of the underlying biology. Global biobanking efforts can dramatically alter the future of IPF research. We describe a large-scale meta-analysis of IPF, with 8,492 patients and 1,355,819 population controls from 13 biobanks around the globe. Finally, we combine this meta-analysis with the largest available meta-analysis of IPF, reaching 11,160 patients and 1,364,410 population controls. We identify seven novel genome-wide significant loci, only one of which would have been identified if the analysis had been limited to European ancestry individuals. We observe notable pleiotropy across IPF susceptibility and severe COVID-19 infection and note an unexplained sex-heterogeneity effect at the strongest IPF locus MUC5B.
The Peopling of the African Continent and the Diaspora into the New World
Africa is the birthplace of anatomically modern humans, and is the geographic origin of human migration across the globe within the last 100,000 years. The history of African populations has consisted of a number of demographic events that have influenced patterns of genetic and phenotypic variation across the continent. With the increasing amount of genomic data and corresponding developments in computational methods, researchers are able to explore long-standing evolutionary questions, expanding our understanding of human history within and outside of Africa. This review will summarize some of the recent findings regarding African demographic history, including the African Diaspora, and will briefly explore their implications for disease susceptibility in populations of African descent.
Overlap between fast evolving genes and those that exhibit tissue enrichment in their expression.
Only tissues with significant representation factors greater than one are shown, out of 24 tissues evaluated in the Protein Atlas database with more than five enriched genes. Values are genes, with associated representation factors in parentheses and an asterisk for values significantly > 1.
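For readers unfamiliar with the term, a representation factor is conventionally the observed overlap between two gene sets divided by the overlap expected if the sets were independent. The sketch below shows that calculation, plus a hypergeometric upper-tail probability as one common way to judge whether the factor is significantly above one. The source does not state which test was used, so the test, the function names, and the toy counts are all assumptions.

```python
from math import comb


def representation_factor(overlap, size_a, size_b, total):
    """Observed overlap divided by the overlap expected by chance.

    overlap -- genes present in both sets
    size_a  -- e.g. number of fast-evolving genes (hypothetical label)
    size_b  -- e.g. number of genes enriched in one tissue (hypothetical label)
    total   -- number of genes in the background set
    """
    expected = size_a * size_b / total
    return overlap / expected


def hypergeom_upper_tail(overlap, size_a, size_b, total):
    """P(X >= overlap) when size_b genes are drawn without replacement
    from a background of `total` genes, size_a of which are in set A."""
    largest = min(size_a, size_b)
    favourable = sum(
        comb(size_a, k) * comb(total - size_a, size_b - k)
        for k in range(overlap, largest + 1)
    )
    return favourable / comb(total, size_b)


# Toy, made-up counts: 10 shared genes where 1 would be expected by chance.
rf = representation_factor(overlap=10, size_a=100, size_b=200, total=20000)
p_value = hypergeom_upper_tail(overlap=10, size_a=100, size_b=200, total=20000)
```

A factor above one indicates more overlap than expected by chance, and the tail probability indicates how surprising that excess is under the independence assumption.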