25 research outputs found
BayesCCE: a Bayesian framework for estimating cell-type composition from DNA methylation without the need for methylation reference.
We introduce a Bayesian semi-supervised method for estimating cell counts from DNA methylation by leveraging easily obtainable prior knowledge on the cell-type composition distribution of the studied tissue. We show mathematically and empirically that alternative methods which attempt to infer cell counts without a methylation reference only capture linear combinations of cell counts rather than provide one component per cell type. Our approach allows the construction of components such that each component corresponds to a single cell type, and provides a new opportunity to investigate cell compositions in genomic studies of tissues for which it was not possible before.
Cell-type-specific resolution epigenetics without the need for cell sorting or single-cell biology.
High costs and technical limitations of cell sorting and single-cell techniques currently restrict the collection of large-scale, cell-type-specific DNA methylation data. This, in turn, impedes our ability to tackle key biological questions that pertain to variation within a population, such as identification of disease-associated genes at a cell-type-specific resolution. Here, we show mathematically and empirically that cell-type-specific methylation levels of an individual can be learned from its tissue-level bulk data, conceptually emulating the case where the individual has been profiled with a single-cell resolution and then signals were aggregated in each cell population separately. Provided with this unprecedented way to perform powerful large-scale epigenetic studies with cell-type-specific resolution, we revisit previous studies with tissue-level bulk methylation and reveal novel associations with leukocyte composition in blood and with rheumatoid arthritis. For the latter, we further show consistency with validation data collected from sorted leukocyte sub-types.
Rheumatoid Arthritis Naive T Cells Share Hypermethylation Sites With Synoviocytes.
Objective: To determine whether differentially methylated CpGs in synovium-derived fibroblast-like synoviocytes (FLS) of patients with rheumatoid arthritis (RA) were also differentially methylated in RA peripheral blood (PB) samples. Methods: For this study, 371 genome-wide DNA methylation profiles were measured using Illumina HumanMethylation450 BeadChips in PB samples from 63 patients with RA and 31 unaffected control subjects, specifically in the cell subsets of CD14+ monocytes, CD19+ B cells, CD4+ memory T cells, and CD4+ naive T cells. Results: Of 5,532 hypermethylated FLS candidate CpGs, 1,056 were hypermethylated in CD4+ naive T cells from RA PB compared to control PB. In analyses of a second set of CpG candidates based on single-nucleotide polymorphisms from a genome-wide association study of RA, 1 significantly hypermethylated CpG in CD4+ memory T cells and 18 significant CpGs (6 hypomethylated, 12 hypermethylated) in CD4+ naive T cells were found. A prediction score based on the hypermethylated FLS candidates had an area under the curve of 0.73 for association with RA case status, which compared favorably to the association of RA with the HLA-DRB1 shared epitope risk allele and with a validated RA genetic risk score. Conclusion: FLS-representative DNA methylation signatures derived from the PB may prove to be valuable biomarkers for the risk of RA or for disease status.
Accurate estimation of cell composition in bulk expression through robust integration of single-cell information
We present Bisque, a tool for estimating cell type proportions in bulk expression. Bisque implements a regression-based approach that utilizes single-cell RNA-seq (scRNA-seq) or single-nucleus RNA-seq (snRNA-seq) data to generate a reference expression profile and learn gene-specific bulk expression transformations to robustly decompose RNA-seq data. These transformations significantly improve decomposition performance compared to existing methods when there is significant technical variation in the generation of the reference profile and observed bulk expression. Importantly, compared to existing methods, our approach is extremely efficient, making it suitable for the analysis of large genomic datasets that are becoming ubiquitous. When applied to subcutaneous adipose and dorsolateral prefrontal cortex expression datasets with both bulk RNA-seq and snRNA-seq data, Bisque replicates previously reported associations between cell type proportions and measured phenotypes across abundant and rare cell types. We further propose an additional mode of operation that merely requires a set of known marker genes.
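As a rough illustration of what a regression-based decomposition against a reference profile involves, the sketch below handles the simplest case of two cell types. It is not Bisque's algorithm (which also learns gene-specific transformations of the bulk data); the function name, the two-type restriction, and the toy profiles are all assumptions made for illustration. It models bulk expression as p * ref_a + (1 - p) * ref_b and solves for p by least squares.

```python
def estimate_two_type_proportion(bulk, ref_a, ref_b):
    """Least-squares estimate of p in bulk ~ p * ref_a + (1 - p) * ref_b.

    bulk, ref_a, ref_b are equal-length lists of per-gene expression values.
    The result is clipped to [0, 1], which for a single unknown is the exact
    solution of the constrained least-squares problem.
    """
    # Per-gene difference between the two reference profiles.
    diff = [a - b for a, b in zip(ref_a, ref_b)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        raise ValueError("reference profiles are identical")
    # Project (bulk - ref_b) onto the direction (ref_a - ref_b).
    numer = sum((x - b) * d for x, b, d in zip(bulk, ref_b, diff))
    return min(1.0, max(0.0, numer / denom))


# Hypothetical reference profiles for two cell types over three genes.
ref_a = [10.0, 0.0, 5.0]
ref_b = [0.0, 8.0, 5.0]
# A bulk sample that is a 30% / 70% mixture of the two profiles.
bulk = [0.3 * a + 0.7 * b for a, b in zip(ref_a, ref_b)]
proportion_a = estimate_two_type_proportion(bulk, ref_a, ref_b)
```

With more than two cell types the same idea becomes a constrained multivariate regression, typically solved with a non-negative least-squares routine.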
Enhancing droplet-based single-nucleus RNA-seq resolution using the semi-supervised machine learning classifier DIEM
Single-nucleus RNA sequencing (snRNA-seq) measures gene expression in individual nuclei instead of cells, allowing for unbiased cell type characterization in solid tissues. We observe that snRNA-seq is commonly subject to contamination by high amounts of ambient RNA, which can lead to biased downstream analyses, such as identification of spurious cell types if overlooked. We present a novel approach to quantify contamination and filter droplets in snRNA-seq experiments, called Debris Identification using Expectation Maximization (DIEM). Our likelihood-based approach models the gene expression distribution of debris and cell types, which are estimated using EM. We evaluated DIEM using three snRNA-seq data sets: (1) human differentiating preadipocytes in vitro, (2) fresh mouse brain tissue, and (3) human frozen adipose tissue (AT) from six individuals. All three data sets showed evidence of extranuclear RNA contamination, and we observed that existing methods failed to account for contaminated droplets and led to spurious cell types. When compared to filtering using these state-of-the-art methods, DIEM better removed droplets containing high levels of extranuclear RNA and led to higher quality clusters. Although DIEM was designed for snRNA-seq, our clustering strategy also successfully filtered single-cell RNA-seq data. To conclude, our novel method DIEM removes debris-contaminated droplets from single-cell-based data fast and effectively, leading to cleaner downstream analysis. Our code is freely available for use at https://github.com/marcalva/diem.
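To make the idea of estimating debris and cell-type expression distributions with EM concrete, here is a minimal sketch of EM for a mixture of multinomials over droplet count vectors. It is a generic textbook procedure under assumed inputs, not the DIEM implementation: the function name, the fixed iteration count, the pseudocount, and the toy data are hypothetical.

```python
import math


def em_multinomial_mixture(counts, init_profiles, n_iter=50, pseudocount=1e-6):
    """EM for a K-component multinomial mixture.

    counts: list of droplets, each a list of per-gene counts.
    init_profiles: K initial gene-probability vectors (strictly positive).
    Returns (mixing weights, gene profiles, per-droplet responsibilities).
    """
    k = len(init_profiles)
    n_genes = len(counts[0])
    profiles = [list(p) for p in init_profiles]
    weights = [1.0 / k] * k
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each droplet.
        # The multinomial coefficient is the same for all components, so it
        # cancels and is omitted.
        resp = []
        for x in counts:
            log_post = [
                math.log(max(weights[j], 1e-12))
                + sum(c * math.log(profiles[j][g]) for g, c in enumerate(x))
                for j in range(k)
            ]
            top = max(log_post)  # subtract the max for numerical stability
            unnorm = [math.exp(v - top) for v in log_post]
            total = sum(unnorm)
            resp.append([u / total for u in unnorm])
        # M-step: re-estimate mixing weights and gene profiles.
        for j in range(k):
            weights[j] = sum(r[j] for r in resp) / len(counts)
            gene_totals = [
                sum(r[j] * x[g] for r, x in zip(resp, counts)) + pseudocount
                for g in range(n_genes)
            ]
            norm = sum(gene_totals)
            profiles[j] = [t / norm for t in gene_totals]
    return weights, profiles, resp


# Toy data: the first three droplets are dominated by gene 0 (debris-like),
# the last three by gene 2 (nucleus-like).
toy_counts = [[9, 1, 0], [8, 2, 0], [10, 0, 1], [0, 1, 9], [1, 2, 8], [0, 0, 10]]
toy_init = [[0.6, 0.2, 0.2], [0.2, 0.2, 0.6]]
weights, profiles, resp = em_multinomial_mixture(toy_counts, toy_init)
```

In a filtering workflow, droplets whose responsibility for the debris component exceeds some chosen threshold would be removed before clustering; the threshold itself is a design choice not specified here.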
The causal effect of obesity on prediabetes and insulin resistance reveals the important role of adipose tissue in insulin resistance
Reverse causality has made it difficult to establish the causal directions between obesity and prediabetes, and between obesity and insulin resistance. To disentangle whether obesity causally drives prediabetes and insulin resistance already in non-diabetic individuals, we utilized the UK Biobank and METSIM cohort to perform Mendelian randomization (MR) analyses in non-diabetic individuals. Our results suggest that both prediabetes and systemic insulin resistance are caused by obesity (p = 1.2 × 10^-3 and p = 3.1 × 10^-24). As obesity reflects the amount of body fat, we next studied how adipose tissue affects insulin resistance. We performed both bulk RNA-sequencing and single-nucleus RNA sequencing on frozen human subcutaneous adipose biopsies to assess adipose cell-type heterogeneity and mitochondrial (MT) gene expression in insulin resistance. We discovered that the adipose MT gene expression and body fat percent are both independently associated with insulin resistance (p …).
Author summary: Obesity is a global health epidemic predisposing to type 2 diabetes (T2D) and other cardiometabolic disorders. Previous studies have shown that obesity has a causal effect on T2D; however, it remains unknown whether obesity causes prediabetes and insulin resistance already in non-diabetic individuals. By utilizing almost half a million individuals from the UK Biobank and the Finnish METSIM cohort, we identified a significant causal effect of obesity on prediabetes and insulin resistance among the non-diabetic individuals. Next, we investigated the role of subcutaneous adipose tissue in these obesogenic effects. We discovered that the adipose mitochondrial gene expression and body fat percent are independently associated with insulin resistance after adjusting for the tissue heterogeneity. For the latter, we estimated the adipose cell type proportions by utilizing single-nucleus RNA sequencing of frozen adipose tissue biopsies. Moreover, we established a prediction model to estimate insulin resistance using body fat percent and adipose RNA-sequencing data, which highlights the importance of adipose tissue in insulin resistance and provides a helpful tool to impute insulin resistance for existing adipose RNA-sequencing cohorts. Overall, we discover the potential causal effect of obesity on prediabetes and insulin resistance and the key role of adipose tissue in insulin resistance.
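For readers unfamiliar with Mendelian randomization, the sketch below shows one common summary-statistic estimator, the fixed-effect inverse-variance weighted (IVW) combination of per-variant Wald ratios. The abstract does not state which MR estimator the authors used, so this is only an assumed, generic example; the function name and the toy effect sizes are hypothetical.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW Mendelian randomization estimate.

    beta_exposure: per-variant effects on the exposure (e.g. obesity).
    beta_outcome:  per-variant effects on the outcome (e.g. insulin resistance).
    se_outcome:    standard errors of the outcome effects.
    Returns (causal effect estimate, standard error).
    """
    # Per-variant Wald ratio: outcome effect divided by exposure effect.
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    # First-order inverse-variance weight of each ratio.
    weights = [bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)]
    total = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total
    std_error = math.sqrt(1.0 / total)
    return estimate, std_error


# Toy summary statistics for three genetic variants (made-up numbers).
effect, se = ivw_estimate(
    beta_exposure=[0.1, 0.2, 0.4],
    beta_outcome=[0.05, 0.10, 0.20],
    se_outcome=[0.01, 0.01, 0.01],
)
```

The estimator assumes each variant affects the outcome only through the exposure; real analyses usually add sensitivity checks for violations of that assumption.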
Genome-wide methylation data mirror ancestry information
Background: Genetic data are known to harbor information about human demographics, and genotyping data are commonly used for capturing ancestry information by leveraging genome-wide differences between populations. In contrast, it is not clear to what extent population structure is captured by whole-genome DNA methylation data. Results: We demonstrate, using three large-cohort 450K methylation array data sets, that ancestry information signal is mirrored in genome-wide DNA methylation data and that it can be further isolated more effectively by leveraging the correlation structure of CpGs with cis-located SNPs. Based on these insights, we propose a method, EPISTRUCTURE, for the inference of ancestry from methylation data, without the need for genotype data. Conclusions: EPISTRUCTURE can be used to infer ancestry information of individuals based on their methylation data in the absence of corresponding genetic data. Although genetic data are often collected in epigenetic studies of large cohorts, these are typically not made publicly available, making the application of EPISTRUCTURE especially useful for anyone working on public data. Implementation of EPISTRUCTURE is available in GLINT, our recently released toolset for DNA methylation analysis at: http://glint-epigenetics.readthedocs.io