79 research outputs found

    Improved imputation quality of low-frequency and rare variants in European samples using the ‘Genome of The Netherlands’

    Although genome-wide association studies (GWAS) have identified many common variants associated with complex traits, low-frequency and rare variants have not been interrogated in a comprehensive manner. Imputation from dense reference panels, such as the 1000 Genomes Project (1000G), enables testing of ungenotyped variants for association. Here we present the results of imputation using a large, new population-specific panel: the Genome of The Netherlands (GoNL). We benchmarked the performance of the 1000G and GoNL reference sets by comparing imputation genotypes with ‘true’ genotypes typed on ImmunoChip in three European populations (Dutch, British, and Italian). GoNL showed s

    Worm variation made accessible: Take your shopping cart to store, link, and investigate!

    In Caenorhabditis elegans, recent advances in high-throughput quantitative analyses of natural genetic and phenotypic variation have led to a wealth of data on genotype–phenotype relations. These data have resulted in the discovery of genes with major allelic effects and insights into the effect of natural genetic variation on a whole range of complex traits, as well as how this variation is distributed across the genome. Regardless of the advances presented in specific studies, the majority of the data generated in these studies had yet to be made easily accessible in a form that allows for meta-analysis. Not only the data in figures or tables but also the meta-data should be accessible for further investigation and comparison between studies. A platform was created where all the data, phenotypic measurements, genotypes, and mappings can be stored and compared, and where new linkages within and between published studies can be discovered. WormQTL focuses on quantitative genetics in Caenorhabditis and other nematode species, whereas WormQTLHD quantitatively links gene expression quantitative trait loci (eQTL) in C. elegans to gene–disease associations in human

    Towards FAIRification of sensitive and fragmented rare disease patient data: challenges and solutions in European reference network registries

    INTRODUCTION: Rare disease patient data are typically sensitive, present in multiple registries controlled by different custodians, and non-interoperable. Making these data Findable, Accessible, Interoperable, and Reusable (FAIR) for humans and machines at source enables federated discovery and analysis across data custodians. This facilitates accurate diagnosis, optimal clinical management, and personalised treatments. In Europe, twenty-four European Reference Networks (ERNs) work on rare disease registries in different clinical domains. The process and the implementation choices for making data FAIR (‘FAIRification’) differ among ERN registries. For example, registries use different software systems and are subject to different legal regulations. To support the ERNs in making informed decisions and to harmonise FAIRification, the FAIRification steward team was established to work as liaisons between ERNs and researchers from the European Joint Programme on Rare Diseases. RESULTS: The FAIRification steward team inventoried the FAIRification challenges of the ERN registries and proposed solutions collectively with involved stakeholders to address them. Ninety-eight FAIRification challenges from 24 ERNs’ registries were collected and categorised into “training” (31), “community” (9), “modelling” (12), “implementation” (26), and “legal” (20). After curating and aggregating highly similar challenges, 41 unique FAIRification challenges remained. The two categories with the most challenges were “training” (15) and “implementation” (9), followed by “community” (7), and then “modelling” (5) and “legal” (5). To address all challenges, eleven types of solutions were proposed. Among them, the provision of guidelines and the organisation of training activities resolved the “training” challenges, which ranged from less-technical “coffee-rounds” to technical workshops, from informal FAIR Games to formal hackathons. 
Obtaining implementation support from technical experts was the solution type for tackling the “implementation” challenges. CONCLUSION: This work shows that a dedicated team of FAIR data stewards is an asset for harmonising the various processes of making data FAIR in a large organisation with multiple stakeholders. Additionally, multi-levelled training activities are required to accommodate the diverse needs of the ERNs. Finally, the lessons learned from the experience of the FAIRification steward team described in this paper may help to increase FAIR awareness and provide insights into FAIRification challenges and solutions of rare disease registries. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13023-022-02558-5

    WormQTL HD—a web database for linking human disease to natural variation data in C. elegans

    Interactions between proteins are highly conserved across species. As a result, the molecular basis of multiple diseases affecting humans can be studied in model organisms that offer many alternative experimental opportunities. One such organism, Caenorhabditis elegans, has been used to produce much molecular quantitative genetics and systems biology data over the past decade. We present WormQTLHD (Human Disease), a database that quantitatively and systematically links expression Quantitative Trait Loci (eQTL) findings in C. elegans to gene–disease associations in man. WormQTLHD, available online at http://www.wormqtl-hd.org, is a user-friendly set of tools to reveal functionally coherent, evolutionarily conserved gene networks. These can be used to predict novel gene-to-gene associations and the functions of genes underlying the disease of interest. We created a new database that links C. elegans eQTL data sets to human diseases (34 337 gene–disease associations from OMIM, DGA, GWAS Central and NHGRI GWAS Catalogue) based on overlapping sets of orthologous genes associated with phenotypes in these two species. We utilized QTL results, high-throughput molecular phenotypes, classical phenotypes and genotype data covering different developmental stages and environments from the WormQTL database. All software is available as open source, built on MOLGENIS and xQTL workbench

    Controlling bias and inflation in epigenome- and transcriptome-wide association studies using the empirical null distribution

    We show that epigenome- and transcriptome-wide association studies (EWAS and TWAS) are prone to significant inflation and bias of test statistics, an unrecognized phenomenon introducing spurious findings if left unaddressed. Neither GWAS-based methodology nor state-of-the-art confounder adjustment methods completely remove bias and inflation. We propose a Bayesian method to control bias and inflation in EWAS and TWAS based on estimation of the empirical null distribution. Using simulations and real data, we demonstrate that our method maximizes power while properly controlling the false positive rate. We illustrate the utility of our method in large-scale EWAS and TWAS meta-analyses of age and smoking
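
    The idea of an empirical null can be illustrated with a much simpler stand-in than the Bayesian method the abstract proposes. The sketch below assumes that most test statistics come from null features and estimates the null's location (bias) and scale (inflation) with the median and the scaled median absolute deviation. The function names `empirical_null` and `recalibrate` are hypothetical, and this robust-moment estimate is an illustrative assumption, not the authors' estimator.

```python
import statistics


def empirical_null(z_scores):
    """Estimate the null's location (bias) and scale (inflation).

    Assumption: most statistics are null, so the median and the scaled
    median absolute deviation (MAD) describe the null distribution.
    """
    z = list(z_scores)
    bias = statistics.median(z)
    mad = statistics.median([abs(value - bias) for value in z])
    inflation = 1.4826 * mad  # MAD-to-SD factor for a normal distribution
    return bias, inflation


def recalibrate(z_scores):
    """Shift and rescale statistics so the estimated null is N(0, 1)."""
    z = list(z_scores)
    bias, inflation = empirical_null(z)
    return [(value - bias) / inflation for value in z]
```

    Under a theoretical null the estimates would be close to 0 and 1; values away from those indicate bias and inflation, respectively.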

    1000 Genomes-based meta-analysis identifies 10 novel loci for kidney function.

    HapMap imputed genome-wide association studies (GWAS) have revealed >50 loci at which common variants with minor allele frequency >5% are associated with kidney function. GWAS using more complete reference sets for imputation, such as those from The 1000 Genomes project, promise to identify novel loci that have been missed by previous efforts. To investigate the value of such a more complete variant catalog, we conducted a GWAS meta-analysis of kidney function based on the estimated glomerular filtration rate (eGFR) in 110,517 European ancestry participants using 1000 Genomes imputed data. We identified 10 novel loci with p-value < 5 × 10⁻⁸ previously missed by HapMap-based GWAS. Six of these loci (HOXD8, ARL15, PIK3R1, EYA4, ASTN2, and EPB41L3) are tagged by common SNPs unique to the 1000 Genomes reference panel. Using pathway analysis, we identified 39 significant (FDR < 0.05) genes and 127 significantly (FDR < 0.05) enriched gene sets, which were missed by our previous analyses. Among those, the 10 identified novel genes are part of pathways of kidney development, carbohydrate metabolism, cardiac septum development and glucose metabolism. These results highlight the utility of re-imputing from denser reference panels, until whole-genome sequencing becomes feasible in large samples

    A high-quality human reference panel reveals the complexity and distribution of genomic structural variants

    Structural variation (SV) represents a major source of differences between individual human genomes and has been linked to disease phenotypes. However, the majority of studies neither provide a global view of the full spectrum of these variants nor integrate them into reference panels of genetic variation. Here, we analyse whole genome sequencing data of 769 individuals from 250 Dutch families, and provide a haplotype-resolved map of 1.9 million genome variants across 9 different variant classes, including novel forms of complex indels, and retrotransposition-mediated insertions of mobile elements and processed RNAs. A large proportion are previously underreported variants sized between 21 and 100 bp. We detect 4 megabases of novel sequence, encoding 11 new transcripts. Finally, we show 191 known, trait-associated SNPs to be in strong linkage disequilibrium with SVs and demonstrate that our panel facilitates accurate imputation of SVs in unrelated individuals

    Heritability estimates for 361 blood metabolites across 40 genome-wide association studies

    Metabolomics examines the small molecules involved in cellular metabolism. Approximately 50% of total phenotypic differences in metabolite levels is due to genetic variance, but heritability estimates differ across metabolite classes. We perform a review of all genome-wide association and (exome-) sequencing studies published between November 2008 and October 2018, and identify >800 class-specific metabolite loci associated with metabolite levels. In a twin-family cohort (N = 5117), these metabolite loci are leveraged to simultaneously estimate total heritability (h²_total), and the proportion of heritability captured by known metabolite loci (h²_Metabolite-hits) for 309 lipids and

    WGS-based telomere length analysis in Dutch family trios implicates stronger maternal inheritance and a role for RRM1 gene

    Telomere length (TL) regulation is an important factor in ageing, reproduction and cancer development. Genetic, hereditary and environmental factors regulating TL are currently widely investigated; however, their relative contribution to TL variability is still understudied. We have used whole genome sequencing data of 250 family trios from the Genome of the Netherlands project to perform computational measurement of TL and a series of regression and genome-wide association analyses to reveal TL inheritance patterns and associated genetic factors. Our results confirm that TL is a largely heritable trait, with the mother’s TL and, to a lesser extent, the father’s TL having the strongest influence on the offspring. In this cohort, the mother’s, but not the father’s, age at conception was positively linked to offspring TL. Age-related TL attrition of 40 bp/year had a relatively small influence on TL variability. Finally, we have identified TL-associated variations in ribonucleotide reductase catalytic subunit M1 (RRM1 gene), which is known to regulate telomere maintenance in yeast. We also highlight the importance of a multivariate approach and the limitations of existing tools for the analysis of TL as a polygenic heritable quantitative trait

    Genome-wide analyses identify a role for SLC17A4 and AADAT in thyroid hormone regulation.

    Thyroid dysfunction is an important public health problem, which affects 10% of the general population and increases the risk of cardiovascular morbidity and mortality. Many aspects of thyroid hormone regulation have only partly been elucidated, including its transport, metabolism, and genetic determinants. Here we report a large meta-analysis of genome-wide association studies for thyroid function and dysfunction, testing 8 million genetic variants in up to 72,167 individuals. One-hundred-and-nine independent genetic variants are associated with these traits. A genetic risk score, calculated to assess their combined effects on clinical end points, shows significant associations with increased risk of both overt (Graves' disease) and subclinical thyroid disease, as well as clinical complications. By functional follow-up on selected signals, we identify a novel thyroid hormone transporter (SLC17A4) and a metabolizing enzyme (AADAT). Together, these results provide new knowledge about thyroid hormone physiology and disease, opening new possibilities for therapeutic targets
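
    A genetic risk score of the kind mentioned above is commonly computed as a weighted sum of risk-allele dosages. The abstract does not state the weighting scheme, so the sketch below rests on assumptions: the function name `genetic_risk_score`, the variant identifiers, and the per-variant weights are all hypothetical placeholders rather than values from the study.

```python
def genetic_risk_score(dosages, weights):
    """Return the weighted sum of risk-allele dosages for one individual.

    dosages: variant id -> risk-allele dosage (0 to 2, possibly fractional
             when the genotype is imputed).
    weights: variant id -> per-allele effect size (assumed weighting).
    """
    missing = sorted(set(weights) - set(dosages))
    if missing:
        raise KeyError(f"no dosage for variants: {missing}")
    return sum(weights[variant] * dosages[variant] for variant in weights)


# Hypothetical example values, for illustration only.
example_weights = {"variant_a": 0.10, "variant_b": 0.25}
example_dosages = {"variant_a": 2, "variant_b": 1}
example_score = genetic_risk_score(example_dosages, example_weights)
```

    Raising an error on missing dosages, rather than silently treating them as zero, avoids understating the score for individuals with incomplete genotype data.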