9 research outputs found

    Leveraging DNA-Methylation Quantitative-Trait Loci to Characterize the Relationship between Methylomic Variation, Gene Expression, and Complex Traits

    Characterizing the complex relationship between genetic, epigenetic, and transcriptomic variation has the potential to increase understanding about the mechanisms underpinning health and disease phenotypes. We undertook a comprehensive analysis of common genetic variation on DNA methylation (DNAm) by using the Illumina EPIC array to profile samples from the UK Household Longitudinal Study. We identified 12,689,548 significant DNA methylation quantitative trait loci (mQTL) associations (p […] 60 human traits by using summary-data-based Mendelian randomization (SMR) to identify 1,662 pleiotropic associations between 36 complex traits and 1,246 DNAm sites. We also use SMR to characterize the relationship between DNAm and gene expression and thereby identify 6,798 pleiotropic associations between 5,420 DNAm sites and the transcription of 1,702 genes. Our mQTL database and SMR results are available via a searchable online database as a resource to the research community.

    Bigmelon: Tools for analysing large DNA methylation datasets

    Motivation: The datasets generated by DNA methylation analyses are getting bigger. With the release of the HumanMethylationEPIC micro-array and datasets containing thousands of samples, analyses of these large datasets using R are becoming impractical due to large memory requirements. As a result, there is an increasing need for computationally efficient methodologies to perform meaningful analysis on high dimensional data. Results: Here we introduce the bigmelon R package, which provides a memory efficient workflow that enables users to perform the complex, large scale analyses required in epigenome wide association studies (EWAS) without the need for large RAM. Building on top of the CoreArray Genomic Data Structure file format and libraries packaged in the gdsfmt package, we provide a practical workflow that facilitates the reading-in, preprocessing, quality control and statistical analysis of DNA methylation data. We demonstrate the capabilities of the bigmelon package using a large dataset consisting of 1193 human blood samples from the Understanding Society: UK Household Longitudinal Study, assayed on the EPIC micro-array platform. © 2018 The Author(s). Published by Oxford University Press.

    Guidance for DNA methylation studies: statistical insights from the Illumina EPIC array

    Background: There has been a steady increase in the number of studies aiming to identify DNA methylation differences associated with complex phenotypes. Many of the challenges of epigenetic epidemiology regarding study design and interpretation have been discussed in detail; however, there are analytical concerns that are outstanding and require further exploration. In this study we seek to address three analytical issues. First, we quantify the multiple testing burden and propose a standard statistical significance threshold for identifying DNA methylation sites that are associated with an outcome. Second, we establish whether linear regression, the chosen statistical tool for the majority of studies, is appropriate and whether it is biased by the underlying distribution of DNA methylation data. Finally, we assess the sample size required for adequately powered DNA methylation association studies. Results: We quantified DNA methylation in the Understanding Society cohort (n = 1175), a large population based study, using the Illumina EPIC array to assess the statistical properties of DNA methylation association analyses. By simulating null DNA methylation studies, we generated the distribution of p-values expected by chance and calculated the 5% family-wise error for EPIC array studies to be 9 × 10⁻⁸. Next, we tested whether the assumptions of linear regression are violated by DNA methylation data and found that the majority of sites do not satisfy the assumption of normal residuals. Nevertheless, we found no evidence that this bias influences analyses by increasing the likelihood of affected sites to be false positives. Finally, we performed power calculations for EPIC based DNA methylation studies, demonstrating that existing studies with data on ~1000 samples are adequately powered to detect small differences at the majority of sites.
Conclusion: We propose that a significance threshold of P < 9 × 10⁻⁸ adequately controls the false positive rate for EPIC array DNA methylation studies. Moreover, our results indicate that linear regression is a valid statistical methodology for DNA methylation studies, despite the fact that the data do not always satisfy the assumptions of this test. These findings have implications for epidemiological-based studies of DNA methylation and provide a framework for the interpretation of findings from current and future studies.
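
The idea of deriving a family-wise error threshold from simulated null studies can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `fwer_threshold`, the toy sizes, and the use of independent uniform p-values (real methylation sites are correlated) are all assumptions made for illustration.

```python
import random


def fwer_threshold(null_p_values, alpha=0.05):
    """Empirical p-value threshold controlling the family-wise error rate.

    null_p_values: one list of per-site p-values for each simulated null study.
    """
    # A null study yields at least one false positive at threshold t
    # exactly when its smallest p-value falls at or below t.
    minima = sorted(min(study) for study in null_p_values)
    # Pick the alpha quantile of those minima.
    index = max(int(alpha * len(minima)) - 1, 0)
    return minima[index]


# Toy null data (assumption): independent uniform p-values, 100 sites, 2000 studies.
rng = random.Random(1)
n_sites, n_sims = 100, 2000
sims = [[rng.random() for _ in range(n_sites)] for _ in range(n_sims)]
threshold = fwer_threshold(sims)
print(f"empirical 5% family-wise threshold: {threshold:.2e}")
```

With independent tests the result should sit close to the Šidák value 1 − 0.95^(1/m); correlation between sites, as on a real array, would be expected to make the empirical threshold less strict than that.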

    Systematic under-estimation of the epigenetic clock and age acceleration in older subjects

    Background: The Horvath epigenetic clock is widely used. It predicts age quite well from 353 CpG sites in the DNA methylation profile in unknown samples and has been used to calculate ‘age acceleration’ in various tissues and environments. Results: The model systematically underestimates age in tissues from older people. This is seen in all examined tissues but most strongly in the cerebellum and is consistently observed in multiple datasets. Age acceleration is thus age-dependent, and this can lead to spurious associations. The current literature includes examples of association tests with age acceleration calculated in a wide variety of ways. Conclusions: The concept of an epigenetic clock is compelling, but caution should be taken in interpreting associations with age acceleration. Association tests of age acceleration should include age as a covariate.
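
A small simulation can show why an age-dependent age acceleration produces spurious associations and how including age as a covariate removes them. Everything here is an illustrative assumption: the simulated clock, the effect sizes, and the helper names (`slope`, `residuals`, `age_adjusted_slope`) are hypothetical and are not taken from the paper.

```python
import random


def mean(values):
    return sum(values) / len(values)


def slope(x, y):
    """Least-squares slope of y regressed on x (with an intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def residuals(x, y):
    """Residuals of y after a simple linear regression on x."""
    b, mx, my = slope(x, y), mean(x), mean(y)
    return [yi - my - b * (xi - mx) for xi, yi in zip(x, y)]


def age_adjusted_slope(trait, acceleration, age):
    """Trait coefficient in the model acceleration ~ trait + age.

    Uses the Frisch-Waugh-Lovell identity: regress both variables on age,
    then regress the residuals on each other.
    """
    return slope(residuals(age, trait), residuals(age, acceleration))


# Simulated data; every number below is an illustrative assumption.
rng = random.Random(0)
age = [rng.uniform(20, 90) for _ in range(2000)]
# Hypothetical clock that under-predicts age above 60 years.
predicted = [a - 0.2 * max(a - 60, 0) + rng.gauss(0, 3) for a in age]
acceleration = [p - a for p, a in zip(predicted, age)]
# A trait that depends on age only, not on acceleration.
trait = [0.05 * a + rng.gauss(0, 1) for a in age]

naive = slope(trait, acceleration)                       # age ignored
adjusted = age_adjusted_slope(trait, acceleration, age)  # age as covariate
print(f"naive slope: {naive:.2f}, age-adjusted slope: {adjusted:.2f}")
```

In this setup the trait has no direct link to acceleration, so any clearly non-zero naive slope arises only because both variables track age.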

    InterpolatedXY: a two-step strategy to normalise DNA methylation microarray data avoiding sex bias

    Motivation: Data normalization is an essential step to reduce technical variation within and between arrays. Due to the different karyotypes and the effects of X chromosome inactivation, females and males exhibit distinct methylation patterns on sex chromosomes; thus, it poses a significant challenge to normalize sex chromosome data without introducing bias. Existing methods do not provide unbiased solutions for normalizing sex chromosome data; usually, they process autosomal and sex chromosomes indiscriminately. Results: Here, we demonstrate that ignoring this sex difference introduces artificial sex bias, especially for thousands of autosomal CpGs. We present a novel two-step strategy (interpolatedXY) to address this issue, which is applicable to all quantile-based normalization methods. By this new strategy, the autosomal CpGs are first normalized independently by conventional methods, such as funnorm or dasen; then the corrected methylation values of sex chromosome-linked CpGs are estimated as the weighted average of their nearest neighbors on autosomes. The proposed two-step strategy can also be applied to other non-quantile-based normalization methods, as well as other array-based data types. Moreover, we propose a useful concept: the sex explained fraction of variance, to quantitatively measure the normalization effect. Availability and implementation: The proposed methods are available by calling the function ‘adjustedDasen’ or ‘adjustedFunnorm’ in the latest wateRmelon package (https://github.com/schalkwyk/wateRmelon), with methods compatible with all the major workflows, including minfi.
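
The second step of the two-step strategy can be sketched for a single sample as follows. This is one possible reading of "weighted average of their nearest neighbors on autosomes" (linear interpolation between the two autosomal probes whose raw values bracket the sex-chromosome probe), not the wateRmelon implementation; the function name `interpolate_from_autosomes` and the toy values are hypothetical.

```python
from bisect import bisect_left


def interpolate_from_autosomes(raw_auto, norm_auto, raw_sex):
    """Map raw sex-chromosome values onto the normalized autosomal scale.

    raw_auto / norm_auto: raw and already-normalized values of the autosomal
    probes of one sample (same order). raw_sex: raw values of that sample's
    sex-chromosome probes.
    """
    pairs = sorted(zip(raw_auto, norm_auto))
    xs = [raw for raw, _ in pairs]
    ys = [norm for _, norm in pairs]
    corrected = []
    for value in raw_sex:
        i = bisect_left(xs, value)
        if i == 0:                 # below every autosomal value
            corrected.append(ys[0])
        elif i == len(xs):         # above every autosomal value
            corrected.append(ys[-1])
        else:
            # Here xs[i - 1] < value <= xs[i], so the two neighbours differ.
            x0, x1 = xs[i - 1], xs[i]
            w = (value - x0) / (x1 - x0)
            # Distance-weighted average of the two neighbours' normalized values.
            corrected.append((1 - w) * ys[i - 1] + w * ys[i])
    return corrected


# Toy example (assumption): normalization acted on autosomes as y = 2x + 1.
raw_auto = [0.1, 0.5, 0.9, 0.3]
norm_auto = [2 * x + 1 for x in raw_auto]
corrected = interpolate_from_autosomes(raw_auto, norm_auto, [0.2, 0.7])
print(corrected)
```

Because the sex-chromosome probes never enter the autosomal normalization itself, they cannot shift the reference distribution, which is the property the abstract attributes to the two-step design.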

    The DNA methylome of human sperm is distinct from blood with little evidence for tissue-consistent obesity associations

    Epidemiological research suggests that paternal obesity may increase the risk of fathering small for gestational age offspring. Studies in non-human mammals indicate that such associations could be mediated by DNA methylation changes in spermatozoa that influence offspring development in utero. Human obesity is associated with differential DNA methylation in peripheral blood. It is unclear, however, whether this differential DNA methylation is reflected in spermatozoa. We profiled genome-wide DNA methylation using the Illumina MethylationEPIC array in a cross-sectional study of matched human blood and sperm from lean (discovery n = 47; replication n = 21) and obese (n = 22) males to analyse tissue covariation of DNA methylation, and identify obesity-associated methylomic signatures. We found that DNA methylation signatures of human blood and spermatozoa are highly discordant, and methylation levels are correlated at only a minority of CpG sites (~1%). At the majority of these sites, DNA methylation appears to be influenced by genetic variation. Obesity-associated DNA methylation in blood was not generally reflected in spermatozoa, and obesity was not associated with altered covariation patterns or accelerated epigenetic ageing in the two tissues. However, one cross-tissue obesity-specific hypermethylated site (cg19357369; chr4:2429884; P = 8.95 × 10⁻⁸; 2% DNA methylation difference) was identified, warranting replication and further investigation. When compared to a wide range of human somatic tissue samples (n = 5,917), spermatozoa displayed differential DNA methylation across pathways enriched in transcriptional regulation. Overall, human sperm displays a unique DNA methylation profile that is highly discordant to, and practically uncorrelated with, that of matched peripheral blood.
We observed that obesity was only nominally associated with differential DNA methylation in sperm, and therefore suggest that spermatozoal DNA methylation is an unlikely mediator of intergenerational effects of metabolic traits.

    Genomic and phenotypic insights from an atlas of genetic effects on DNA methylation

    Characterizing genetic influences on DNA methylation (DNAm) provides an opportunity to understand mechanisms underpinning gene regulation and disease. In the present study, we describe results of DNAm quantitative trait locus (mQTL) analyses on 32,851 participants, identifying genetic variants associated with DNAm at 420,509 DNAm sites in blood. We present a database of >270,000 independent mQTLs, of which 8.5% comprise long-range (trans) associations. Identified mQTL associations explain 15–17% of the additive genetic variance of DNAm. We show that the genetic architecture of DNAm levels is highly polygenic. Using shared genetic control between distal DNAm sites, we constructed networks, identifying 405 discrete genomic communities enriched for genomic annotations and complex traits. Shared genetic variants are associated with both DNAm levels and complex diseases, but only in a minority of cases do these associations reflect causal relationships from DNAm to trait or vice versa, indicating a more complex genotype–phenotype map than previously anticipated.

    SARS-CoV-2 infects the human kidney and drives fibrosis in kidney organoids

    This work was supported by grants of the German Research Foundation (DFG: KR 4073/11-1; SFBTRR219, 322900939; and CRU344, 428857858, and CRU5011 InteraKD 445703531), a grant of the European Research Council (ERC-StG 677448), the Federal Ministry of Research and Education (BMBF NUM-COVID19, Organo-Strat 01KX2021), the Dutch Kidney Foundation (DKF) TASK FORCE consortium (CP1805), the Else Kroener Fresenius Foundation (2017_A144), and the ERA-CVD MENDAGE consortium (BMBF 01KL1907) all to R.K.; DFG (CRU 344, Z to I.G.C. and CRU344 P2 to R.K.S.); and the BMBF eMed Consortium Fibromap (to V.G.P., R.K., R.K.S., and I.G.C.). R.K.S. received support from the KWF Kankerbestrijding (11031/2017–1, Bas Mulder Award) and a grant by the ERC (deFiber; ERC-StG 757339). J.J. is supported by the Netherlands Organisation for Scientific Research (NWO Veni grant no: 091 501 61 81 01 36) and the DKF (grant no. 19OK005). B.S. is supported by the DKF (grant: 14A3D104) and the NWO (VIDI grant: 016.156.363). R.P.V.R. and G.J.O. are supported by the NWO VICI (grant: 16.VICI.170.090). P.B. is supported by the BMBF (DEFEAT PANDEMIcs, 01KX2021), the Federal Ministry of Health (German Registry for COVID-19 Autopsies-DeRegCOVID, www.DeRegCOVID.ukaachen.de; ZMVI1-2520COR201), and the German Research Foundation (DFG; SFB/TRR219 Project-IDs 322900939 and 454024652). S.D. received DFG support (DJ100/1-1) as well as support from VGP and TBH (SFB1192). M.d.B., R.R., N.S., and A.A. are supported by an ERC Advanced Investigator grant (H2020-ERC-2017-ADV-788982-COLMIN) to N.S. A.A. is supported by the NWO (VI.Veni.192.094). We thank Saskia de Wildt, Jeanne Pertijs (Radboudumc, Department of Pharmacology), and Robert M. Verdijk (Erasmus Medical Center, Department of Pathology) for providing tissue controls (Erasmus MC Tissue Bank) and Christian Drosten (Charité Universitätsmedizin Berlin, Institute of Virology) and Bart Haagmans (Erasmus Medical Center, Rotterdam) for providing the SARS-CoV-2 isolate.
We thank Kioa L. Wijnsma (Department of Pediatric Nephrology, Radboud Institute for Molecular Life Sciences, Amalia Children’s Hospital, Radboud University Medical Center) for support with statistical analysis regarding the COVID-19 patient cohort. Peer reviewed.