81 research outputs found

    A novel locus of resistance to severe malaria in a region of ancient balancing selection.

    Get PDF
    The high prevalence of sickle haemoglobin in Africa shows that malaria has been a major force for human evolutionary selection, but surprisingly few other polymorphisms have been proven to confer resistance to malaria in large epidemiological studies. To address this problem, we conducted a multi-centre genome-wide association study (GWAS) of life-threatening Plasmodium falciparum infection (severe malaria) in over 11,000 African children, with replication data in a further 14,000 individuals. Here we report a novel malaria resistance locus close to a cluster of genes encoding glycophorins that are receptors for erythrocyte invasion by P. falciparum. We identify a haplotype at this locus that provides 33% protection against severe malaria (odds ratio = 0.67, 95% confidence interval = 0.60-0.76, P value = 9.5 × 10(-11)) and is linked to polymorphisms that have previously been shown to have features of ancient balancing selection, on the basis of haplotype sharing between humans and chimpanzees. Taken together with previous observations on the malaria-protective role of blood group O, these data reveal that two of the strongest GWAS signals for severe malaria lie in or close to genes encoding the glycosylated surface coat of the erythrocyte cell membrane, both within regions of the genome where it appears that evolution has maintained diversity for millions of years. These findings provide new insights into the host-parasite interactions that are critical in determining the outcome of malaria infection

    Integrated Polygenic Tool Substantially Enhances Coronary Artery Disease Prediction

    Get PDF
    BACKGROUND: There is considerable interest in whether genetic data can be used to improve standard cardiovascular disease risk calculators, as the latter are routinely used in clinical practice to manage preventative treatment. METHODS: Using the UK Biobank resource, we developed our own polygenic risk score for coronary artery disease (CAD). We used an additional 60 000 UK Biobank individuals to develop an integrated risk tool (IRT) that combined our polygenic risk score with established risk tools (either the American Heart Association/American College of Cardiology pooled cohort equations [PCE] or UK QRISK3), and we tested our IRT in an additional, independent set of 186 451 UK Biobank individuals. RESULTS: The novel CAD polygenic risk score shows superior predictive power for CAD events, compared with other published polygenic risk scores, and is largely uncorrelated with PCE and QRISK3. When combined with PCE into an IRT, it has superior predictive accuracy. Overall, 10.4% of incident CAD cases were misclassified as low risk by PCE and correctly classified as high risk by the IRT, compared with 4.4% misclassified by the IRT and correctly classified by PCE. The overall net reclassification improvement for the IRT was 5.9% (95% CI, 4.7–7.0). When individuals were stratified into age-by-sex subgroups, the improvement was larger for all subgroups (range, 8.3%–15.4%), with the best performance in 40- to 54-year-old men (15.4% [95% CI, 11.6–19.3]). Comparable results were found using a different risk tool (QRISK3) and also a broader definition of cardiovascular disease. Use of the IRT is estimated to avoid up to 12 000 deaths in the United States over a 5-year period. CONCLUSIONS: An IRT that includes polygenic risk outperforms current risk stratification tools and offers greater opportunity for early interventions. Given the plummeting costs of genetic tests, future iterations of CAD risk tools would be enhanced with the addition of a person’s polygenic risk

    Identifying Selected Regions from Heterozygosity and Divergence Using a Light-Coverage Genomic Dataset from Two Human Populations

    Get PDF
    When a selective sweep occurs in the chromosomal region around a target gene in two populations that have recently separated, it produces three dramatic genomic consequences: 1) decreased multi-locus heterozygosity in the region; 2) elevated or diminished genetic divergence (FST) of multiple polymorphic variants adjacent to the selected locus between the divergent populations, due to the alternative fixation of alleles; and 3) a consequent regional increase in the variance of FST (S2FST) for the same clustered variants, due to the increased alternative fixation of alleles in the loci surrounding the selection target. In the first part of our study, to search for potential targets of directional selection, we developed and validated a resampling-based computational approach; we then scanned an array of 31 different-sized moving windows of SNP variants (5–65 SNPs) across the human genome in a set of European and African American population samples with 183,997 SNP loci after correcting for the recombination rate variation. The analysis revealed 180 regions of recent selection with very strong evidence in either population or both. In the second part of our study, we compared the newly discovered putative regions to those sites previously postulated in the literature, using methods based on inspecting patterns of linkage disequilibrium, population divergence and other methodologies. The newly found regions were cross-validated with those found in nine other studies that have searched for selection signals. Our study was replicated especially well in those regions confirmed by three or more studies. These validated regions were independently verified, using a combination of different methods and different databases in other studies, and should include fewer false positives. The main strength of our analysis method compared to others is that it does not require dense genotyping and therefore can be used with data from population-based genome SNP scans from smaller studies of humans or other species

    Interferon lambda 4 impacts the genetic diversity of hepatitis C virus

    Get PDF
    Hepatitis C virus (HCV) is a highly variable pathogen that frequently establishes chronic infection. This genetic variability is affected by the adaptive immune response but the contribution of other host factors is unclear. Here, we examined the role played by interferon lambda-4 (IFN-λ4) on HCV diversity; IFN-λ4 plays a crucial role in spontaneous clearance or establishment of chronicity following acute infection. We performed viral genome-wide association studies using human and viral data from 485 patients of white ancestry infected with HCV genotype 3a. We demonstrate that combinations of host genetic variants, which determine IFN-λ4 protein production and activity, influence amino acid variation across the viral polyprotein - not restricted to specific viral proteins or HLA restricted epitopes - and modulate viral load. We also observed an association with viral di-nucleotide proportions. These results support a direct role for IFN-λ4 in exerting selective pressure across the viral genome, possibly by a novel mechanism

    A Histone Map of Human Chromosome 20q13.12

    Get PDF
    We present a systematic search for regulatory elements in a 3.5 Mb region on human chromosome 20q13.12, a region associated with a number of medical conditions such as type II diabetes and obesity.We profiled six histone modifications alongside RNA polymerase II (PolII) and CTCF in two cell lines, HeLa S3 and NTERA-2 clone D1 (NT2/D1), by chromatin immunoprecipitation using an in-house spotted DNA array, constructed with 1.8 kb overlapping plasmid clones. In both cells, more than 90% of transcription start sites (TSSs) of expressed genes showed enrichments with PolII, di-methylated lysine 4 of histone H3 (H3K4me2), tri-methylated lysine 4 of histone H3 (H3K4me3) or acetylated H3 (H3Ac), whereas mono-methylated lysine 4 of histone H3 (H3K4me1) signals did not correlate with expression. No TSSs were enriched with tri-methylated lysine 27 of histone H3 (H3K27me3) in HeLa S3, while eight TSSs (4 expressed) showed enrichments in NT2/D1. We have also located several CTCF binding sites that are potential insulator elements.In summary, we annotated a number of putative regulatory elements in 20q13.12 and went on to verify experimentally a subset of them using dual luciferase reporter assays. Correlating this data to sequence variation can aid identification of disease causing variants

    Genome-wide fine-scale recombination rate variation in Drosophila melanogaster

    Get PDF
    Estimating fine-scale recombination maps of Drosophila from population genomic data is a challenging problem, in particular because of the high background recombination rate. In this paper, a new computational method is developed to address this challenge. Through an extensive simulation study, it is demonstrated that the method allows more accurate inference, and exhibits greater robustness to the effects of natural selection and noise, compared to a well-used previous method developed for studying fine-scale recombination rate variation in the human genome. As an application, a genome-wide analysis of genetic variation data is performed for two Drosophila melanogaster populations, one from North America (Raleigh, USA) and the other from Africa (Gikongoro, Rwanda). It is shown that fine-scale recombination rate variation is widespread throughout the D. melanogaster genome, across all chromosomes and in both populations. At the fine-scale, a conservative, systematic search for evidence of recombination hotspots suggests the existence of a handful of putative hotspots each with at least a tenfold increase in intensity over the background rate. A wavelet analysis is carried out to compare the estimated recombination maps in the two populations and to quantify the extent to which recombination rates are conserved. In general, similarity is observed at very broad scales, but substantial differences are seen at fine scales. The average recombination rate of the X chromosome appears to be higher than that of the autosomes in both populations, and this pattern is much more pronounced in the African population than the North American population. The correlation between various genomic features—including recombination rates, diversity, divergence, GC content, gene content, and sequence quality—is examined using the wavelet analysis, and it is shown that the most notable difference between D. melanogaster and humans is in the correlation between recombination and diversity

    The Complex Genetic Architecture of the Metabolome

    Get PDF
    Discovering links between the genotype of an organism and its metabolite levels can increase our understanding of metabolism, its controls, and the indirect effects of metabolism on other quantitative traits. Recent technological advances in both DNA sequencing and metabolite profiling allow the use of broad-spectrum, untargeted metabolite profiling to generate phenotypic data for genome-wide association studies that investigate quantitative genetic control of metabolism within species. We conducted a genome-wide association study of natural variation in plant metabolism using the results of untargeted metabolite analyses performed on a collection of wild Arabidopsis thaliana accessions. Testing 327 metabolites against >200,000 single nucleotide polymorphisms identified numerous genotype–metabolite associations distributed non-randomly within the genome. These clusters of genotype–metabolite associations (hotspots) included regions of the A. thaliana genome previously identified as subject to recent strong positive selection (selective sweeps) and regions showing trans-linkage to these putative sweeps, suggesting that these selective forces have impacted genome-wide control of A. thaliana metabolism. Comparing the metabolic variation detected within this collection of wild accessions to a laboratory-derived population of recombinant inbred lines (derived from two of the accessions used in this study) showed that the higher level of genetic variation present within the wild accessions did not correspond to higher variance in metabolic phenotypes, suggesting that evolutionary constraints limit metabolic variation. While a major goal of genome-wide association studies is to develop catalogues of intraspecific variation, the results of multiple independent experiments performed for this study showed that the genotype–metabolite associations identified are sensitive to environmental fluctuations. Thus, studies of intraspecific variation conducted via genome-wide association will require analyses of genotype by environment interaction. Interestingly, the network structure of metabolite linkages was also sensitive to environmental differences, suggesting that key aspects of network architecture are malleable

    Comprehensive Research Synopsis and Systematic Meta-Analyses in Parkinson's Disease Genetics: The PDGene Database

    Get PDF
    More than 800 published genetic association studies have implicated dozens of potential risk loci in Parkinson's disease (PD). To facilitate the interpretation of these findings, we have created a dedicated online resource, PDGene, that comprehensively collects and meta-analyzes all published studies in the field. A systematic literature screen of ∼27,000 articles yielded 828 eligible articles from which relevant data were extracted. In addition, individual-level data from three publicly available genome-wide association studies (GWAS) were obtained and subjected to genotype imputation and analysis. Overall, we performed meta-analyses on more than seven million polymorphisms originating either from GWAS datasets and/or from smaller scale PD association studies. Meta-analyses on 147 SNPs were supplemented by unpublished GWAS data from up to 16,452 PD cases and 48,810 controls. Eleven loci showed genome-wide significant (P<5×10−8) association with disease risk: BST1, CCDC62/HIP1R, DGKQ/GAK, GBA, LRRK2, MAPT, MCCC1/LAMP3, PARK16, SNCA, STK39, and SYT11/RAB25. In addition, we identified novel evidence for genome-wide significant association with a polymorphism in ITGA8 (rs7077361, OR 0.88, P = 1.3×10−8). All meta-analysis results are freely available on a dedicated online database (www.pdgene.org), which is cross-linked with a customized track on the UCSC Genome Browser. Our study provides an exhaustive and up-to-date summary of the status of PD genetics research that can be readily scaled to include the results of future large-scale genetics projects, including next-generation sequencing studies

    A Two-Stage Meta-Analysis Identifies Several New Loci for Parkinson's Disease

    Get PDF
    A previous genome-wide association (GWA) meta-analysis of 12,386 PD cases and 21,026 controls conducted by the International Parkinson's Disease Genomics Consortium (IPDGC) discovered or confirmed 11 Parkinson's disease (PD) loci. This first analysis of the two-stage IPDGC study focused on the set of loci that passed genome-wide significance in the first stage GWA scan. However, the second stage genotyping array, the ImmunoChip, included a larger set of 1,920 SNPs selected on the basis of the GWA analysis. Here, we analyzed this set of 1,920 SNPs, and we identified five additional PD risk loci (combined p<5x10(-10), PARK16/1q32, STX1B/16p11, FGF20/8p22, STBD1/4q21, and GPNMB/7p15). Two of these five loci have been suggested by previous association studies (PARK16/1q32, FGF20/8p22), and this study provides further support for these findings. Using a dataset of post-mortem brain samples assayed for gene expression (n = 399) and methylation (n = 292), we identified methylation and expression changes associated with PD risk variants in PARK16/1q32, GPNMB/7p15, and STX1B/16p11 loci, hence suggesting potential molecular mechanisms and candidate genes at these risk loci
    • …
    corecore