49 research outputs found

    Enhanced cell deconvolution of peripheral blood using DNA methylation for high-resolution immune profiling

    DNA methylation microarrays can be employed to interrogate cell-type composition in complex tissues. Here, we expand reference-based deconvolution of blood DNA methylation to include 12 leukocyte subtypes (neutrophils, eosinophils, basophils, monocytes, naïve and memory B cells, naïve and memory CD4+ and CD8+ T cells, natural killer, and T regulatory cells). Including derived variables, our method provides 56 immune profile variables. The IDOL (IDentifying Optimal Libraries) algorithm was used to identify libraries for deconvolution of DNA methylation data for current and previous platforms. The accuracy of deconvolution estimates obtained using our enhanced libraries was validated using artificial mixtures and whole-blood DNA methylation with known cellular composition from flow cytometry. We applied our libraries to deconvolve cancer, aging, and autoimmune disease datasets. In conclusion, these libraries enable a detailed representation of immune-cell profiles in blood using only DNA and facilitate a standardized, thorough investigation of immune profiles in human health and disease.

    Systematic evaluation and validation of reference and library selection methods for deconvolution of cord blood DNA methylation data

    Background: Umbilical cord blood (UCB) is commonly used in epigenome-wide association studies of prenatal exposures. Accounting for cell type composition is critical in such studies as it reduces confounding due to the cell specificity of DNA methylation (DNAm). In the absence of cell sorting information, statistical methods can be applied to deconvolve heterogeneous cell mixtures. Among these methods, reference-based approaches leverage age-appropriate cell-specific DNAm profiles to estimate cellular composition. In UCB, four reference datasets comprising DNAm signatures profiled in purified cell populations have been published using the Illumina 450 K and EPIC arrays. These datasets are biologically and technically different, and currently, there is no consensus on how to best apply them. Here, we systematically evaluate and compare these datasets and provide recommendations for reference-based UCB deconvolution. Results: We first evaluated the four reference datasets to ascertain both the purity of the samples and the potential cell cross-contamination. We filtered samples and combined datasets to obtain a joint UCB reference. We selected deconvolution libraries using two different approaches: automatic selection using the top differentially methylated probes from the function pickCompProbes in minfi and a standardized library selected using the IDOL (Identifying Optimal Libraries) iterative algorithm. We compared the performance of each reference separately and in combination, using the two approaches for reference library selection, and validated the results in an independent cohort (Generation R Study, n = 191) with matched Fluorescence-Activated Cell Sorting measured cell counts. Strict filtering and combination of the references significantly improved the accuracy and efficiency of cell type estimates. Ultimately, the IDOL library outperformed the library from the automatic selection method implemented in pickCompProbes. 
    Conclusion: These results have important implications for epigenetic studies in UCB as implementing this method will optimally reduce confounding due to cellular heterogeneity. This work provides guidelines for future reference-based UCB deconvolution and establishes a framework for combining reference datasets in other tissues.

    Cell-type deconvolution in epigenome-wide association studies: a review and recommendations

    A major challenge faced by epigenome-wide association studies (EWAS) is cell-type heterogeneity. As many EWAS have already demonstrated, adjusting for changes in cell-type composition can be critical when analyzing and interpreting findings from such studies. Because of their importance, a great number of different statistical algorithms, which adjust for cell-type composition, have been proposed. Some of the methods are ‘reference based’ in that they require a priori defined reference DNA methylation profiles of cell types that are present in the tissue of interest, while other algorithms are ‘reference free.’ At present, however, it is unclear how best to adjust for cell-type heterogeneity, as this may also largely depend on the type of tissue and phenotype being considered. Here, we provide a critical review of the major existing algorithms for correcting cell-type composition in the context of Illumina Infinium Methylation Beadarrays, with the aim of providing useful recommendations to the EWAS community.

    Quantifying the proportion of different cell types in the human cortex using DNA methylation profiles

    This is the final version. Available from BMC via the DOI in this record. Availability of data and materials: All data generated or analysed during this study are included in this published article, its supplementary information files and publicly available repositories. Data generated for this project are available at NCBI Gene Expression Omnibus (GEO) under accession number GSE234520 [64]. We also reanalysed data previously made available via GEO (via accession numbers GSE74193 [65], GSE59685 [66], GSE80970 [67], GSE88890 [68], GSE43414 [69]) and the synapse platform (syn7072866 [70], syn8263588 [71]). Code for the analyses presented here can be found on GitHub and Zenodo https://github.com/ejh243/BrainFANS/tree/master/array/DNAm/preprocessing (https://doi.org/10.5281/zenodo.10402167). Specifically, code for the quality control of the DNAm data can be found at https://github.com/ejh243/BrainFANS/tree/master/array/DNAm/preprocessing and the code for the statistical analyses can be found at https://github.com/ejh243/BrainFANS/tree/master/array/DNAm/analysis/neuralCellComposition. Our new trained deconvolution models for brain are made available to the wider research community via our R package CETYGO available on GitHub (https://github.com/ds420/CETYGO; https://doi.org/10.5281/zenodo.10418430).
    Background: Due to interindividual variation in the cellular composition of the human cortex, it is essential that covariates that capture these differences are included in epigenome-wide association studies using bulk tissue. As experimentally derived cell counts are often unavailable, computational solutions have been adopted to estimate the proportion of different cell types using DNA methylation data. Here, we validate and profile the use of an expanded reference DNA methylation dataset incorporating two neuronal and three glial cell subtypes for quantifying the cellular composition of the human cortex.
    Results: We tested eight reference panels containing different combinations of neuronal and glial cell types and characterised their performance in deconvoluting cell proportions from computationally reconstructed or empirically derived human cortex DNA methylation data. Our analyses demonstrate that while these novel brain deconvolution models produce accurate estimates of cellular proportions from profiles generated on postnatal human cortex samples, they are not appropriate for use in prenatal cortex or cerebellum tissue samples. Applying our models to an extensive collection of empirical datasets, we show that glial cells are twice as abundant as neuronal cells in the human cortex and identify significant associations between increased Alzheimer’s disease neuropathology and the proportion of specific cell types including a decrease in NeuNNeg/SOX10Neg nuclei and an increase of NeuNNeg/SOX10Pos nuclei.
    Conclusions: Our novel deconvolution models produce accurate estimates for cell proportions in the human cortex. These models are available as a resource to the community enabling the control of cellular heterogeneity in epigenetic studies of brain disorders performed on bulk cortex tissue.
    Funders: Engineering and Physical Sciences Research Council; Medical Research Council; Alzheimer's Research UK

    A New Timepiece: An Epigenetic Mitotic Clock

    A new mitotic clock and mathematical approach that incorporates DNA methylation biology common among human cell types provides a new tool for cancer epigenetics research.

    A comparison of reference-based algorithms for correcting cell-type heterogeneity in Epigenome-Wide Association Studies.

    BACKGROUND: Intra-sample cellular heterogeneity presents numerous challenges to the identification of biomarkers in large Epigenome-Wide Association Studies (EWAS). While a number of reference-based deconvolution algorithms have emerged, their potential remains underexplored and a comparative evaluation of these algorithms beyond tissues such as blood is still lacking. RESULTS: Here we present a novel framework for reference-based inference, which leverages cell-type specific DNase Hypersensitive Site (DHS) information from the NIH Epigenomics Roadmap to construct an improved reference DNA methylation database. We show that this leads to a marginal but statistically significant improvement of cell-count estimates in whole blood as well as in mixtures involving epithelial cell-types. Using this framework we compare a widely used state-of-the-art reference-based algorithm (called constrained projection) to two non-constrained approaches including CIBERSORT and a method based on robust partial correlations. We conclude that the widely-used constrained projection technique may not always be optimal. Instead, we find that the method based on robust partial correlations is generally more robust across a range of different tissue types and for realistic noise levels. We call the combined algorithm which uses DHS data and robust partial correlations for inference, EpiDISH (Epigenetic Dissection of Intra-Sample Heterogeneity). Finally, we demonstrate the added value of EpiDISH in an EWAS of smoking. CONCLUSIONS: Estimating cell-type fractions and subsequent inference in EWAS may benefit from the use of non-constrained reference-based cell-type deconvolution methods.
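
As an illustration of the "constrained projection" idea named in the abstract above, the following is a minimal sketch, not the EpiDISH implementation. It assumes the usual formulation in which a bulk methylation profile is modelled as a weighted sum of reference cell-type profiles, with weights constrained to be non-negative and to sum to at most one. The function name `constrained_projection` and all data are hypothetical and synthetic.

```python
import numpy as np
from scipy.optimize import minimize


def constrained_projection(reference, mixture):
    """Estimate cell-type fractions w by least squares.

    Minimizes ||mixture - reference @ w||^2 subject to
    w >= 0 and sum(w) <= 1 (assumed constraint set).
    reference: (n_cpgs, n_cell_types) array of reference beta values.
    mixture:   (n_cpgs,) array of bulk beta values.
    """
    n_types = reference.shape[1]

    def loss(w):
        residual = mixture - reference @ w
        return float(residual @ residual)

    def grad(w):
        return -2.0 * reference.T @ (mixture - reference @ w)

    result = minimize(
        loss,
        x0=np.full(n_types, 1.0 / n_types),  # start from equal fractions
        jac=grad,
        method="SLSQP",
        bounds=[(0.0, 1.0)] * n_types,  # non-negativity
        # SciPy convention: an "ineq" constraint means fun(w) >= 0
        constraints=[{"type": "ineq", "fun": lambda w: 1.0 - w.sum()}],
    )
    return result.x


# Synthetic toy example: 200 CpGs, 3 hypothetical cell types.
rng = np.random.default_rng(0)
reference = rng.uniform(0.0, 1.0, size=(200, 3))
true_fractions = np.array([0.6, 0.3, 0.1])
mixture = reference @ true_fractions + rng.normal(0.0, 0.01, size=200)

estimated = constrained_projection(reference, mixture)
print(np.round(estimated, 3))
```

The non-constrained alternatives mentioned in the abstract (CIBERSORT, robust partial correlations) would instead fit the weights without these constraints and rescale afterwards; that variant is not shown here.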

    Assessing the Role of DNA Methylation-Derived Neutrophil-to-Lymphocyte Ratio in Rheumatoid Arthritis

    Rheumatoid arthritis (RA) is a disease of chronic systemic inflammation (SI). In the present study, we used four datasets to explore whether methylation-derived neutrophil-to-lymphocyte ratio (mdNLR) might be a marker of SI in new onset, untreated, and treated prevalent RA cases and/or a marker of treatment response to the tumour necrosis factor inhibitor (TNFi) etanercept. mdNLR was associated with increased odds of being a new onset RA case (OR = 2.32, 95% CI = 1.95-2.80, P < 2 × 10^-16) and performed better in distinguishing new onset RA cases from controls compared to covariates: age, gender, and smoking status. In untreated preclinical RA cases and controls, mdNLR at baseline was associated with diagnosis of RA in later life after adjusting for batch (OR = 4.30, 95% CI = 1.52-21.71, P = 0.029) although no association was observed before batch correction. When prevalent RA cases were treated, there was no association with mdNLR in samples before and after batch correction (OR = 0.34, 95% CI = 0.05-1.82, P = 0.23), and mdNLR was not associated with treatment response to etanercept (OR = 1.10, 95% CI = 0.75-1.68, P = 0.64). Our results indicate that SI measured by DNA methylation data is indicative of the recent onset of RA. Although preclinical RA was associated with mdNLR, there was no difference in the mean mdNLR between preclinical RA cases and controls. mdNLR was not associated with RA case status if treatment for RA has commenced, and it is not associated with treatment response. In the future, mdNLR estimates may be used as a valuable research tool to reliably estimate SI in the absence of freshly collected blood samples.
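
For readers unfamiliar with the mdNLR, the sketch below shows how such a ratio could be computed from deconvolution output. It assumes the ratio is the estimated neutrophil fraction divided by the summed lymphocyte fractions (CD4 T, CD8 T, B, and NK cells); the abstract does not spell out the formula. The function name `md_nlr`, the dictionary keys, and the fraction values are hypothetical.

```python
def md_nlr(fractions):
    """Neutrophil-to-lymphocyte ratio from estimated cell fractions.

    fractions: dict mapping cell-type labels to estimated proportions.
    Assumes lymphocytes = CD4T + CD8T + Bcell + NK (illustrative labels).
    """
    lymphocyte_keys = ("CD4T", "CD8T", "Bcell", "NK")
    lymphocytes = sum(fractions[key] for key in lymphocyte_keys)
    if lymphocytes <= 0:
        raise ValueError("lymphocyte fraction must be positive")
    return fractions["Neu"] / lymphocytes


# Hypothetical deconvolution estimates for one blood sample.
sample = {
    "Neu": 0.60,
    "CD4T": 0.15,
    "CD8T": 0.08,
    "Bcell": 0.05,
    "NK": 0.04,
    "Mono": 0.08,
}
ratio = md_nlr(sample)
print(round(ratio, 3))
```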